Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]

2025-01-06 Thread Jeff Law




On 1/6/25 5:59 PM, Linus Torvalds wrote:

On Mon, 6 Jan 2025 at 16:13, Jeff Law  wrote:


But in the case of concurrent accesses, shouldn't these objects be
declared as atomic?


No.

They aren't concurrent accesses to the same variable.

They are concurrent accesses to *different* memory locations, and the
compiler is not allowed to mess them up.

You're absolutely right.  Definitely not kosher.  Thanks.

Jeff



Re: [PATCH 1/2] testsuite: add testcase for fixed PR117546

2025-01-06 Thread Sam James
Jeff Law  writes:

> On 1/3/25 11:11 AM, Sam James wrote:
>> Sam James  writes:
>> 
>>> PR117546 was fixed by Eric's r14-10693-gadab597af288d6 change, but
>>> the testcase here is sufficiently different to be worth including
>>> in torture/.
>>>
>>> gcc/testsuite/ChangeLog:
>>> PR ipa/117546
>>> * gcc.dg/torture/pr117546.c: New test.
>>> ---
>> Sorry, forgot to ask: OK? (for both)
> OK for both.

Thanks! Pushed.

>
> jeff


[PATCH] LoongArch: Adjust the cost of ADDRESS_REG_REG [PR114978].

2025-01-06 Thread Lulu Cheng
After changing this cost from 1 to 3, the performance of spec2006
401 473 416 465 482 can be improved by about 2% on LA664.

Add option '-maddr-reg-reg-cost='.

gcc/ChangeLog:

* config/loongarch/genopts/loongarch.opt.in: Add
option '-maddr-reg-reg-cost='.
* config/loongarch/loongarch-def.cc
(loongarch_rtx_cost_data::loongarch_rtx_cost_data): Initialize
addr_reg_reg_cost to 3.
* config/loongarch/loongarch-opts.cc
(loongarch_target_option_override): If '-maddr-reg-reg-cost='
is not used, set it to the initial value.
* config/loongarch/loongarch-tune.h
(struct loongarch_rtx_cost_data): Add the member
addr_reg_reg_cost and its assignment function to the structure
loongarch_rtx_cost_data.
* config/loongarch/loongarch.cc (loongarch_address_insns):
Use la_addr_reg_reg_cost to set the cost of ADDRESS_REG_REG.
* config/loongarch/loongarch.opt: Regenerate.
* config/loongarch/loongarch.opt.urls: Regenerate.
* doc/invoke.texi: Add description of '-maddr-reg-reg-cost='.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/const-double-zero-stx.c: Add
'-maddr-reg-reg-cost=1'.
* gcc.target/loongarch/stack-check-alloca-1.c: Likewise.

---
 gcc/config/loongarch/genopts/loongarch.opt.in  | 4 
 gcc/config/loongarch/loongarch-def.cc  | 1 +
 gcc/config/loongarch/loongarch-opts.cc | 3 +++
 gcc/config/loongarch/loongarch-tune.h  | 7 +++
 gcc/config/loongarch/loongarch.cc  | 2 +-
 gcc/config/loongarch/loongarch.opt | 4 
 gcc/config/loongarch/loongarch.opt.urls| 3 +++
 gcc/doc/invoke.texi| 7 ++-
 gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c | 2 +-
 gcc/testsuite/gcc.target/loongarch/stack-check-alloca-1.c  | 2 +-
 10 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 8c292c8600d..39c1545e540 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -177,6 +177,10 @@ mbranch-cost=
 Target RejectNegative Joined UInteger Var(la_branch_cost) Save
 -mbranch-cost=COST Set the cost of branches to roughly COST instructions.
 
+maddr-reg-reg-cost=
+Target RejectNegative Joined UInteger Var(la_addr_reg_reg_cost) Save
+-maddr-reg-reg-cost=COST  Set the cost of ADDRESS_REG_REG to the value 
calculated by COST.
+
 mcheck-zero-division
 Target Mask(CHECK_ZERO_DIV) Save
 Trap on integer divide by zero.
diff --git a/gcc/config/loongarch/loongarch-def.cc 
b/gcc/config/loongarch/loongarch-def.cc
index b0271eb3b9a..5f235a04ef2 100644
--- a/gcc/config/loongarch/loongarch-def.cc
+++ b/gcc/config/loongarch/loongarch-def.cc
@@ -136,6 +136,7 @@ loongarch_rtx_cost_data::loongarch_rtx_cost_data ()
 movcf2gr (COSTS_N_INSNS (7)),
 movgr2cf (COSTS_N_INSNS (15)),
 branch_cost (6),
+addr_reg_reg_cost (3),
 memory_latency (4) {}
 
 /* The following properties cannot be looked up directly using "cpucfg".
diff --git a/gcc/config/loongarch/loongarch-opts.cc 
b/gcc/config/loongarch/loongarch-opts.cc
index 36342cc9373..c2a63f75fc2 100644
--- a/gcc/config/loongarch/loongarch-opts.cc
+++ b/gcc/config/loongarch/loongarch-opts.cc
@@ -1010,6 +1010,9 @@ loongarch_target_option_override (struct loongarch_target 
*target,
   if (!opts_set->x_la_branch_cost)
 opts->x_la_branch_cost = loongarch_cost->branch_cost;
 
+  if (!opts_set->x_la_addr_reg_reg_cost)
+opts->x_la_addr_reg_reg_cost = loongarch_cost->addr_reg_reg_cost;
+
   /* other stuff */
   if (ABI_LP64_P (target->abi.base))
 opts->x_flag_pcc_struct_return = 0;
diff --git a/gcc/config/loongarch/loongarch-tune.h 
b/gcc/config/loongarch/loongarch-tune.h
index e69173ebf79..f7819fe7678 100644
--- a/gcc/config/loongarch/loongarch-tune.h
+++ b/gcc/config/loongarch/loongarch-tune.h
@@ -38,6 +38,7 @@ struct loongarch_rtx_cost_data
   unsigned short movcf2gr;
   unsigned short movgr2cf;
   unsigned short branch_cost;
+  unsigned short addr_reg_reg_cost;
   unsigned short memory_latency;
 
   /* Default RTX cost initializer, implemented in loongarch-def.cc.  */
@@ -115,6 +116,12 @@ struct loongarch_rtx_cost_data
 return *this;
   }
 
+  loongarch_rtx_cost_data addr_reg_reg_cost_ (unsigned short 
_addr_reg_reg_cost)
+  {
+addr_reg_reg_cost = _addr_reg_reg_cost;
+return *this;
+  }
+
   loongarch_rtx_cost_data memory_latency_ (unsigned short _memory_latency)
   {
 memory_latency = _memory_latency;
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 89237c377e7..d5e90bfd1e1 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -2383,7 +2383,7 @@ loongarch_address_insns (rtx x, machin

RE: [PATCH v1 1/4] Match: Refactor the signed SAT_SUB match patterns [NFC]

2025-01-06 Thread Li, Pan2
Kindly ping for the series.

Pan

-Original Message-
From: Li, Pan2 
Sent: Monday, December 23, 2024 3:09 PM
To: gcc-patches@gcc.gnu.org
Cc: richard.guent...@gmail.com; tamar.christ...@arm.com; juzhe.zh...@rivai.ai; 
kito.ch...@gmail.com; jeffreya...@gmail.com; rdapp@gmail.com
Subject: RE: [PATCH v1 1/4] Match: Refactor the signed SAT_SUB match patterns 
[NFC]

Kindly ping for this series, and Merry Christmas!

Pan

-Original Message-
From: Li, Pan2  
Sent: Thursday, December 12, 2024 4:42 PM
To: gcc-patches@gcc.gnu.org
Cc: richard.guent...@gmail.com; tamar.christ...@arm.com; juzhe.zh...@rivai.ai; 
kito.ch...@gmail.com; jeffreya...@gmail.com; rdapp@gmail.com; Li, Pan2 

Subject: [PATCH v1 1/4] Match: Refactor the signed SAT_SUB match patterns [NFC]

From: Pan Li 

This patch would like to refactor the all signed SAT_ADD patterns,
aka:
* Extract type check outside.
* Re-arrange the related match pattern forms together.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.
* The x86 bootstrap test.
* The x86 fully regression test.

gcc/ChangeLog:

* match.pd: Refactor sorts of signed SAT_SUB match patterns.

Signed-off-by: Pan Li 
---
 gcc/match.pd | 98 +---
 1 file changed, 40 insertions(+), 58 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index dd5302015c7..1ef504f141f 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3375,6 +3375,46 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (wi::bit_and (wi::to_wide (@1), wi::to_wide (@3)) == 0)))
 )
 
+(if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type))
+ (match (signed_integer_sat_sub @0 @1)
+  /* T Z = (T)((UT)X - (UT)Y);
+ SAT_S_SUB = (X ^ Y) & (X ^ Z) < 0 ? (-(T)(X < 0) ^ MAX) : Z  */
+  (cond^ (lt (bit_and:c (bit_xor:c @0 @1)
+   (bit_xor @0 (nop_convert@2 (minus (nop_convert @0)
+ (nop_convert @1)
+integer_zerop)
+(bit_xor:c (negate (convert (lt @0 integer_zerop))) max_value)
+@2))
+ (match (signed_integer_sat_sub @0 @1)
+  /* T Z = (T)((UT)X - (UT)Y);
+ SAT_S_SUB = (X ^ Y) & (X ^ Z) >= 0 ? Z : (-(T)(X < 0) ^ MAX)  */
+  (cond^ (ge (bit_and:c (bit_xor:c @0 @1)
+   (bit_xor @0 (nop_convert@2 (minus (nop_convert @0)
+ (nop_convert @1)
+integer_zerop)
+@2
+(bit_xor:c (negate (convert (lt @0 integer_zerop))) max_value)))
+ (match (signed_integer_sat_sub @0 @1)
+  /* T Z = (T)((UT)X - (UT)Y);
+ SAT_S_SUB = (X ^ Y) < 0 & (X ^ Z) < 0 ? (-(T)(X < 0) ^ MAX) : Z  */
+  (cond^ (bit_and:c (lt (bit_xor @0 (nop_convert@2 (minus (nop_convert @0)
+ (nop_convert @1
+   integer_zerop)
+   (lt (bit_xor:c @0 @1) integer_zerop))
+(bit_xor:c (nop_convert (negate (nop_convert (convert
+  (lt @0 integer_zerop)
+   max_value)
+@2))
+ (match (signed_integer_sat_sub @0 @1)
+  /* Z = .SUB_OVERFLOW (X, Y)
+ SAT_S_SUB = IMAGPART (Z) != 0 ? (-(T)(X < 0) ^ MAX) : REALPART (Z)  */
+  (cond^ (ne (imagpart (IFN_SUB_OVERFLOW@2 @0 @1)) integer_zerop)
+(bit_xor:c (nop_convert?
+   (negate (nop_convert? (convert (lt @0 integer_zerop)
+   max_value)
+(realpart @2))
+  (if (types_match (type, @0, @1)
+
 /* The boundary condition for case 10: IMM = 1:
SAT_U_SUB = X >= IMM ? (X - IMM) : 0.
simplify (X != 0 ? X + ~0 : 0) to X - (X != 0).  */
@@ -3386,64 +3426,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(with { tree itype = TREE_TYPE (@2); }
 (convert (minus @2 (convert:itype @1))
 
-/* Signed saturation sub, case 1:
-   T minus = (T)((UT)X - (UT)Y);
-   SAT_S_SUB = (X ^ Y) & (X ^ minus) < 0 ? (-(T)(X < 0) ^ MAX) : minus;
-
-   The T and UT are type pair like T=int8_t, UT=uint8_t.  */
-(match (signed_integer_sat_sub @0 @1)
- (cond^ (lt (bit_and:c (bit_xor:c @0 @1)
-  (bit_xor @0 (nop_convert@2 (minus (nop_convert @0)
-(nop_convert @1)
-   integer_zerop)
-   (bit_xor:c (negate (convert (lt @0 integer_zerop))) max_value)
-   @2)
- (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type
-
-/* Signed saturation sub, case 2:
-   T minus = (T)((UT)X - (UT)Y);
-   SAT_S_SUB = (X ^ Y) & (X ^ minus) < 0 ? (-(T)(X < 0) ^ MAX) : minus;
-
-   The T and UT are type pair like T=int8_t, UT=uint8_t.  */
-(match (signed_integer_sat_sub @0 @1)
- (cond^ (ge (bit_and:c (bit_xor:c @0 @1)
-  (bit_xor @0 (nop_convert@2 (minus (nop_convert @0)
-(nop_convert @1)
-   integer_zerop)
-   @2
-   (bit_xor:c (negate (convert (lt @0 integer_zerop

[PATCH] c-pretty-print.cc (pp_c_tree_decl_identifier): Strip private name encoding, PR118303

2025-01-06 Thread Hans-Peter Nilsson
Regtested native x86_64-linux.  Also tested mmix-knuth-mmixware, 
where it fixes ONE testcase, but one which is a regression on 
master.  The PR component is currently ipa, changed from the 
original middle-end.  IIUC this bug-fix doesn't fit the ipa 
category IMHO, but rather more general tree-optimization or 
rather middle-end, to which I'll change the component unless I 
see a reason for this fitting ipa stated.

Ok to commit?

-- >8 --
This is a part of PR118303.  It fixes
FAIL: gcc.dg/analyzer/CVE-2005-1689-minimal.c (test for excess errors)
FAIL: gcc.dg/analyzer/CVE-2005-1689-minimal.c inbuf.data (test for warnings, 
line 62)
for targets where the parameter on that line is subject to
TARGET_CALLEE_COPIES being true.

c-family:
PR middle-end/118303
* c-pretty-print.cc (c_pretty_printer::primary_expression) :
Call primary_expression for all SSA_NAME_VAR nodes and instead move the
DECL_ARTIFICIAL private name stripping to...
(pp_c_tree_decl_identifier): ...here.
---
 gcc/c-family/c-pretty-print.cc | 39 +-
 1 file changed, 15 insertions(+), 24 deletions(-)

diff --git a/gcc/c-family/c-pretty-print.cc b/gcc/c-family/c-pretty-print.cc
index 22a71d1e3558..0b6810e12242 100644
--- a/gcc/c-family/c-pretty-print.cc
+++ b/gcc/c-family/c-pretty-print.cc
@@ -1398,29 +1398,7 @@ c_pretty_printer::primary_expression (tree e)
 
 case SSA_NAME:
   if (SSA_NAME_VAR (e))
-   {
- tree var = SSA_NAME_VAR (e);
- if (tree id = SSA_NAME_IDENTIFIER (e))
-   {
- const char *name = IDENTIFIER_POINTER (id);
- const char *dot;
- if (DECL_ARTIFICIAL (var) && (dot = strchr (name, '.')))
-   {
- /* Print the name without the . suffix (such as in VLAs).
-Use pp_c_identifier so that it can be converted into
-the appropriate encoding.  */
- size_t size = dot - name;
- char *ident = XALLOCAVEC (char, size + 1);
- memcpy (ident, name, size);
- ident[size] = '\0';
- pp_c_identifier (this, ident);
-   }
- else
-   primary_expression (var);
-   }
- else
-   primary_expression (var);
-   }
+   primary_expression (SSA_NAME_VAR (e));
   else if (gimple_assign_single_p (SSA_NAME_DEF_STMT (e)))
{
  /* Print only the right side of the GIMPLE assignment.  */
@@ -3033,7 +3011,20 @@ pp_c_tree_decl_identifier (c_pretty_printer *pp, tree t)
   gcc_assert (DECL_P (t));
 
   if (DECL_NAME (t))
-name = IDENTIFIER_POINTER (DECL_NAME (t));
+{
+  const char *dot;
+  name = IDENTIFIER_POINTER (DECL_NAME (t));
+  if (DECL_ARTIFICIAL (t) && (dot = strchr (name, '.')))
+   {
+ /* Print the name without the . suffix (such as in VLAs and
+callee-copied parameters).  */
+ size_t size = dot - name;
+ char *ident = XALLOCAVEC (char, size + 1);
+ memcpy (ident, name, size);
+ ident[size] = '\0';
+ name = ident;
+   }
+}
   else
 {
   static char xname[8];
-- 
2.39.2



Re: [PATCH] testsuite: generalized field-merge tests for <32-bit int [PR118025]

2025-01-06 Thread Mike Stump
On Jan 6, 2025, at 3:05 PM, Alexandre Oliva  wrote:
> 
> On Dec 22, 2024, Alexandre Oliva  wrote:
> 
>> for  gcc/testsuite/ChangeLog
> 
>>  PR testsuite/118025
>>  * gcc.dg/field-merge-1.c: Convert constants to desired types.
>>  * gcc.dg/field-merge-3.c: Likewise.
>>  * gcc.dg/field-merge-4.c: Likewise.
>>  * gcc.dg/field-merge-5.c: Likewise.
>>  * gcc.dg/field-merge-11.c: Likewise.
>>  * gcc.dg/field-merge-17.c: Don't mess with padding bits.
> 
> Ping?
> https://gcc.gnu.org/pipermail/gcc-patches/2024-December/672194.html

Ok.



Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]

2025-01-06 Thread Jeff Law




On 1/6/25 6:03 AM, Maciej W. Rozycki wrote:

With non-BWX Alpha implementations we have a problem of data races where
a 8-bit byte or 16-bit word quantity is to be written to memory in that
in those cases we use an unprotected RMW access of a 32-bit longword or
64-bit quadword width.  If contents of the longword or quadword accessed
outside the byte or word to be written are changed midway through by a
concurrent write executing on the same CPU such as by a signal handler
or a parallel write executing on another CPU such as by another thread
or via a shared memory segment, then the concluding write of the RMW
access will clobber them.  This is especially important for the safety
of RCU algorithms, but is otherwise an issue anyway.
But in the case of concurrent accesses, shouldn't these objects be 
declared as atomic?  Similarly for objects potentially accessed in a 
signal hnadler shouldn't they be accessed via sig_atomic_t?


Point being I'm not 100% sure we really need to tackle this problem in a 
fully generic manner for all 8/16 bit accesses


What am I missing here?

jeff


Re: [PATCH 1/2] testsuite: add testcase for fixed PR117546

2025-01-06 Thread Jeff Law




On 1/3/25 11:11 AM, Sam James wrote:

Sam James  writes:


PR117546 was fixed by Eric's r14-10693-gadab597af288d6 change, but
the testcase here is sufficiently different to be worth including
in torture/.

gcc/testsuite/ChangeLog:
PR ipa/117546
* gcc.dg/torture/pr117546.c: New test.
---


Sorry, forgot to ask: OK? (for both)

OK for both.

jeff



Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]

2025-01-06 Thread Paul E. McKenney
On Mon, Jan 06, 2025 at 05:12:57PM -0700, Jeff Law wrote:
> 
> 
> On 1/6/25 6:03 AM, Maciej W. Rozycki wrote:
> > With non-BWX Alpha implementations we have a problem of data races where
> > a 8-bit byte or 16-bit word quantity is to be written to memory in that
> > in those cases we use an unprotected RMW access of a 32-bit longword or
> > 64-bit quadword width.  If contents of the longword or quadword accessed
> > outside the byte or word to be written are changed midway through by a
> > concurrent write executing on the same CPU such as by a signal handler
> > or a parallel write executing on another CPU such as by another thread
> > or via a shared memory segment, then the concluding write of the RMW
> > access will clobber them.  This is especially important for the safety
> > of RCU algorithms, but is otherwise an issue anyway.
> But in the case of concurrent accesses, shouldn't these objects be declared
> as atomic?  Similarly for objects potentially accessed in a signal hnadler
> shouldn't they be accessed via sig_atomic_t?
> 
> Point being I'm not 100% sure we really need to tackle this problem in a
> fully generic manner for all 8/16 bit accesses
> 
> What am I missing here?

Doesn't the behavior Maciej is describing constitute a data race injected
by the compiler?  As of C11 and C++11, this is forbidden, correct?

Thanx, Paul


Re: [PATCH v3] aarch64: remove extra XTN in vector concatenation

2025-01-06 Thread Richard Sandiford
Akram Ahmad  writes:
> Hi Richard,
>
> Thanks for the feedback. I've copied in the resulting patch here- if
> this is okay, please could it be committed on my behalf? The patch
> continues below.
>
> Many thanks,
>
> Akram

Thanks. LGTM.  Pushed to trunk.

Richard

> ---
>
> GIMPLE code which performs a narrowing truncation on the result of a
> vector concatenation currently results in an unnecessary XTN being
> emitted following a UZP1 to concate the operands. In cases such as this,
> UZP1 should instead use a smaller arrangement specifier to replace the
> XTN instruction. This is seen in cases such as in this GIMPLE example:
>
>   int32x2_t foo (svint64_t a, svint64_t b)
>   {
> vector(2) int vect__2.8;
> long int _1;
> long int _3;
> vector(2) long int _12;
>
>  [local count: 1073741824]:
> _1 = svaddv_s64 ({ -1, 0, 0, 0, 0, 0, 0, 0, ... }, a_6(D));
> _3 = svaddv_s64 ({ -1, 0, 0, 0, 0, 0, 0, 0, ... }, b_7(D));
> _12 = {_1, _3};
> vect__2.8_13 = (vector(2) int) _12;
> return vect__2.8_13;
>
>   }
>
> Original assembly generated:
>
>   bar:
>   ptrue   p3.b, all
>   uaddv   d0, p3, z0.d
>   uaddv   d1, p3, z1.d
>   uzp1v0.2d, v0.2d, v1.2d
>   xtn v0.2s, v0.2d
>   ret
>
> This patch therefore defines the *aarch64_trunc_concat insn which
> truncates the concatenation result, rather than concatenating the
> truncated operands (such as in *aarch64_narrow_trunc), resulting
> in the following optimised assembly being emitted:
>
>   bar:
>   ptrue   p3.b, all
>   uaddv   d0, p3, z0.d
>   uaddv   d1, p3, z1.d
>   uzp1v0.2s, v0.2s, v1.2s
>   ret
>
> This patch passes all regression tests on aarch64 with no new failures.
> A supporting test for this optimisation is also written and passes.
>
> OK for master? I do not have commit rights so I cannot push the patch
> myself.
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-simd.md: (*aarch64_trunc_concat)
> new insn definition.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/sve/truncated_concatenation_1.c: new test
> for the above example and other modes covered by insn
> definitions.
> ---
>  gcc/config/aarch64/aarch64-simd.md| 16 ++
>  .../aarch64/sve/truncated_concatenation_1.c   | 32 +++
>  2 files changed, 48 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index cfe95bd4c31..6c129d6c4a8 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1872,6 +1872,22 @@
>[(set_attr "type" "neon_permute")]
>  )
>  
> +(define_insn "*aarch64_trunc_concat"
> +  [(set (match_operand: 0 "register_operand" "=w")
> + (truncate:
> +   (vec_concat:VQN
> + (match_operand: 1 "register_operand" "w")
> + (match_operand: 2 "register_operand" "w"]
> +  "TARGET_SIMD"
> +{
> +  if (!BYTES_BIG_ENDIAN)
> +return "uzp1\\t%0., %1., %2.";
> +  else
> +return "uzp1\\t%0., %2., %1.";
> +}
> +  [(set_attr "type" "neon_permute")]
> +)
> +
>  ;; Packing doubles.
>  
>  (define_expand "vec_pack_trunc_"
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c
> new file mode 100644
> index 000..95577a1a9ef
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c
> @@ -0,0 +1,32 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -Wall -march=armv8.2-a+sve" } */
> +
> +#include 
> +#include 
> +
> +int8x8_t f1 (int16x4_t a, int16x4_t b) {
> +int8x8_t ab = vdup_n_s8 (0);
> +int16x8_t ab_concat = vcombine_s16 (a, b);
> +ab = vmovn_s16 (ab_concat);
> +return ab;
> +}
> +
> +int16x4_t f2 (int32x2_t a, int32x2_t b) {
> +int16x4_t ab = vdup_n_s16 (0);
> +int32x4_t ab_concat = vcombine_s32 (a, b);
> +ab = vmovn_s32 (ab_concat);
> +return ab;
> +}
> +
> +int32x2_t f3 (svint64_t a, svint64_t b) {
> +int32x2_t ab = vdup_n_s32 (0);
> +ab = vset_lane_s32 ((int)svaddv_s64 (svptrue_b64 (), a), ab, 0);
> +ab = vset_lane_s32 ((int)svaddv_s64 (svptrue_b64 (), b), ab, 1);
> +return ab;
> +}
> +
> +/* { dg-final { scan-assembler-not {\txtn\t} } }*/
> +/* { dg-final { scan-assembler-not {\tfcvtn\t} } }*/
> +/* { dg-final { scan-assembler-times {\tuzp1\tv[0-9]+\.8b, v[0-9]+\.8b, 
> v[0-9]+\.8b} 1 } }*/
> +/* { dg-final { scan-assembler-times {\tuzp1\tv[0-9]+\.4h, v[0-9]+\.4h, 
> v[0-9]+\.4h} 1 } }*/
> +/* { dg-final { scan-assembler-times {\tuzp1\tv[0-9]+\.2s, v[0-9]+\.2s, 
> v[0-9]+\.2s} 1 } }*/
> \ No newline at end of file


[COMMITTED 06/30] ada: Silence unused parameter warning on linux

2025-01-06 Thread Marc Poulhiès
From: Tonu Naks 

In __gnat_locate_exec_on_path (char *exec_name, int current_dir_on_windows)
the recently added second parameter is for windows only. On non-windows
platforms its usage is removed by the preprocessor and the compiler
reports unused parameter.

gcc/ada/ChangeLog:

* adaint.c: void parameter on non-windows platforms

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/adaint.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
index 0b6d4bb6b4e..0459956ff5b 100644
--- a/gcc/ada/adaint.c
+++ b/gcc/ada/adaint.c
@@ -3110,6 +3110,10 @@ __gnat_locate_exec_on_path (char *exec_name, int 
current_dir_on_windows)
   }
 
 #else
+  /* Tell the compiler that we are not going to use this parameter
+ on non-windows platforms. */
+  (void)current_dir_on_windows;
+
   const char *path_val = getenv ("PATH");
 
   /* If PATH is not defined, proceed with __gnat_locate_exec anyway, so we can
-- 
2.43.0



[COMMITTED 07/30] ada: Follow-on to Use inheritance in Gen_IL

2025-01-06 Thread Marc Poulhiès
From: Bob Duff 

Fix too-long line.

gcc/ada/ChangeLog:

* gen_il-gen.adb: Fix too-long line.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gen_il-gen.adb | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/gen_il-gen.adb b/gcc/ada/gen_il-gen.adb
index 5064e8c6eb7..c200abc0223 100644
--- a/gcc/ada/gen_il-gen.adb
+++ b/gcc/ada/gen_il-gen.adb
@@ -983,7 +983,8 @@ package body Gen_IL.Gen is
  end loop;
 
  Iterate_Types (Node_Kind, Pre => Check_Potential_Inheritance'Access);
- Iterate_Types (Entity_Kind, Pre => 
Check_Potential_Inheritance'Access);
+ Iterate_Types
+   (Entity_Kind, Pre => Check_Potential_Inheritance'Access);
 
  if Could_Be_Inherited_Error then
 raise Illegal with "some fields could be inherited";
-- 
2.43.0



[COMMITTED 01/30] ada: Remove workaround for RM_Size being unable to represent unknown size

2025-01-06 Thread Marc Poulhiès
From: Piotr Trojanek 

When validating instances of Ada.Unchecked_Conversion, we explicitly detected
null arrays, because a size of 0 was used to indicate both an unknown size and
an actual size of 0. This limitation has been lifted, so we can remove
detection of null arrays.

Code cleanup; behavior is unaffected.

gcc/ada/ChangeLog:

* sem_ch13.adb (Validate_Unchecked_Conversions): Remove detection of
null arrays; remove tests for sizes being present, which are redundant
after calling Known_Static_RM_Size.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch13.adb | 42 +++---
 1 file changed, 3 insertions(+), 39 deletions(-)

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
index f425048222e..fe6429fea97 100644
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -18870,39 +18870,6 @@ package body Sem_Ch13 is

 
procedure Validate_Unchecked_Conversions is
-  function Is_Null_Array (T : Entity_Id) return Boolean;
-  --  We want to warn in the case of converting to a wrong-sized array of
-  --  bytes, including the zero-size case. This returns True in that case,
-  --  which is necessary because a size of 0 is used to indicate both an
-  --  unknown size and a size of 0. It's OK for this to return True in
-  --  other zero-size cases, but we don't go out of our way; for example,
-  --  we don't bother with multidimensional arrays.
-
-  function Is_Null_Array (T : Entity_Id) return Boolean is
-  begin
- if Is_Array_Type (T) and then Is_Constrained (T) then
-declare
-   Index : constant Node_Id := First_Index (T);
-   R : Node_Id; -- N_Range
-begin
-   case Nkind (Index) is
-  when N_Range =>
- R := Index;
-  when N_Subtype_Indication =>
- R := Range_Expression (Constraint (Index));
-  when N_Identifier | N_Expanded_Name =>
- R := Scalar_Range (Entity (Index));
-  when others =>
- raise Program_Error;
-   end case;
-
-   return Is_Null_Range (Low_Bound (R), High_Bound (R));
-end;
- end if;
-
- return False;
-  end Is_Null_Array;
-
begin
   for N in Unchecked_Conversions.First .. Unchecked_Conversions.Last loop
  declare
@@ -18933,9 +18900,8 @@ package body Sem_Ch13 is
goto Continue;
 end if;
 
-if (Known_Static_RM_Size (Source)
-  and then Known_Static_RM_Size (Target))
-  or else Is_Null_Array (Target)
+if Known_Static_RM_Size (Source)
+  and then Known_Static_RM_Size (Target)
 then
--  This validation check, which warns if we have unequal sizes
--  for unchecked conversion, and thus implementation dependent
@@ -18946,9 +18912,7 @@ package body Sem_Ch13 is
Source_Siz := RM_Size (Source);
Target_Siz := RM_Size (Target);
 
-   if Present (Source_Siz) and then Present (Target_Siz)
- and then Source_Siz /= Target_Siz
-   then
+   if Source_Siz /= Target_Siz then
   Error_Msg
 ("?z?types for unchecked conversion have different sizes!",
  Eloc, Act_Unit);
-- 
2.43.0



[COMMITTED 02/30] ada: Fix markup in user's guide

2025-01-06 Thread Marc Poulhiès
From: Ronan Desplanques 

gcc/ada/ChangeLog:

* doc/gnat_ugn/building_executable_programs_with_gnat.rst: Fix markup.
* gnat_ugn.texi: Regenerate.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 .../doc/gnat_ugn/building_executable_programs_with_gnat.rst   | 2 +-
 gcc/ada/gnat_ugn.texi | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst 
b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
index 48e47eb8336..4f46fba93ae 100644
--- a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
+++ b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
@@ -6747,7 +6747,7 @@ be presented in subsequent sections.
   .. index:: -o   (gnatbind)
 
 :switch:`-o {file}`
-  Name the output file ``file`` (default is :file:`b~`xxx`.adb`).
+  Name the output file ``file`` (default is :file:`b~{xxx}.adb`).
   Note that if this option is used, then linking must be done manually,
   gnatlink cannot be used.
 
diff --git a/gcc/ada/gnat_ugn.texi b/gcc/ada/gnat_ugn.texi
index 544d73e7f36..252b32f74ae 100644
--- a/gcc/ada/gnat_ugn.texi
+++ b/gcc/ada/gnat_ugn.texi
@@ -19,7 +19,7 @@
 
 @copying
 @quotation
-GNAT User's Guide for Native Platforms , Dec 12, 2024
+GNAT User's Guide for Native Platforms , Jan 03, 2025
 
 AdaCore
 
@@ -16077,7 +16077,7 @@ equivalent @code{gnatmake} flag (@ref{d1,,Switches for 
gnatmake}).
 
 @item @code{-o `file'}
 
-Name the output file @code{file} (default is @code{b~`xxx}.adb`).
+Name the output file @code{file} (default is @code{b~`xxx'.adb}).
 Note that if this option is used, then linking must be done manually,
 gnatlink cannot be used.
 
-- 
2.43.0



[PATCH v3 0/6] c++: Add some missing LAMBDA_EXPR_EXTRA_SCOPEs

2025-01-06 Thread Nathaniel Shead
This patch series fixes some ABI issues in lambdas, with a side effect
of fixing some issues with module streaming.  This doesn't completely
fix the ABI for lambdas (in particular, namespace scope aliases are
still broken) but it at least improves the situation.

Successfully bootstrapped and regtested on x86_64-pc-linux-gnu.

Nathaniel Shead (6):
  c++: Fix mangling of lambdas in static data member initializers
[PR107741]
  c++: Fix mangling of otherwise unattached class-scope lambdas
[PR118245]
  c++: Fix ABI for lambdas declared in alias templates [PR116568]
  c++: Update mangling of lambdas in expressions
  c++/modules: Add testcase for fixed ICE [PR116568]
  c++/modules: Diagnose TU-local lambdas, give mangling scope to lambdas
in concepts

 gcc/c-family/c-opts.cc|   2 +-
 gcc/common.opt|   5 +-
 gcc/cp/cp-tree.h  |   9 +-
 gcc/cp/decl2.cc   |  77 
 gcc/cp/lambda.cc  |  33 +-
 gcc/cp/mangle.cc  |  15 ++-
 gcc/cp/module.cc  |   7 +-
 gcc/cp/parser.cc  | 111 --
 gcc/cp/pt.cc  |  50 +---
 gcc/doc/invoke.texi   |   3 +
 gcc/testsuite/g++.dg/abi/lambda-ctx2-19.C |  10 ++
 gcc/testsuite/g++.dg/abi/lambda-ctx2-19vs20.C |   8 ++
 gcc/testsuite/g++.dg/abi/lambda-ctx2-20.C |  10 ++
 gcc/testsuite/g++.dg/abi/lambda-ctx2.h|  27 +
 gcc/testsuite/g++.dg/abi/lambda-ctx3.C|  21 
 gcc/testsuite/g++.dg/abi/lambda-ctx4.C|  29 +
 gcc/testsuite/g++.dg/abi/macro0.C |   2 +-
 gcc/testsuite/g++.dg/abi/mangle74.C   |   4 +-
 .../g++.dg/cpp0x/static-member-init-1.C   |   5 +
 .../g++.dg/cpp2a/lambda-generic-mangle1.C |   2 +-
 .../g++.dg/cpp2a/lambda-generic-mangle1a.C|   2 +-
 gcc/testsuite/g++.dg/cpp2a/lambda-uneval20.C  |   7 ++
 gcc/testsuite/g++.dg/modules/internal-4_b.C   |   6 +-
 gcc/testsuite/g++.dg/modules/lambda-8.h   |   7 ++
 gcc/testsuite/g++.dg/modules/lambda-8_a.H |   5 +
 gcc/testsuite/g++.dg/modules/lambda-8_b.C |   5 +
 gcc/testsuite/g++.dg/modules/lambda-9.h   |   2 +
 gcc/testsuite/g++.dg/modules/lambda-9_a.H |   4 +
 gcc/testsuite/g++.dg/modules/lambda-9_b.C |   6 +
 gcc/testsuite/g++.dg/modules/late-ret-3_a.H   |   2 +-
 gcc/testsuite/g++.dg/other/fold1.C|   2 +-
 31 files changed, 402 insertions(+), 76 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-ctx2-19.C
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-ctx2-19vs20.C
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-ctx2-20.C
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-ctx2.h
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-ctx3.C
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-ctx4.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/static-member-init-1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-uneval20.C
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-8.h
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-8_a.H
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-8_b.C
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-9.h
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-9_a.H
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-9_b.C

-- 
2.47.0



[PATCH v3 1/6] c++: Fix mangling of lambdas in static data member initializers [PR107741]

2025-01-06 Thread Nathaniel Shead
This fixes an issue where lambdas declared in the initializer of a
static data member within the class body do not get a mangling scope of
that variable; this results in mangled names that do not conform to the
ABI spec.

To do this, the patch splits up grokfield for this case specifically,
allowing a declaration to be build and used in start_lambda_scope before
parsing the initializer, so that record_lambda_scope works correctly.

As a drive-by, this also fixes the issue of a static member not being
visible within its own initializer.

PR c++/107741

gcc/c-family/ChangeLog:

* c-opts.cc (c_common_post_options): Bump ABI version.

gcc/ChangeLog:

* common.opt: Add -fabi-version=20.
* doc/invoke.texi: Likewise.

gcc/cp/ChangeLog:

* cp-tree.h (start_initialized_static_member): Declare.
(finish_initialized_static_member): Declare.
* decl2.cc (start_initialized_static_member): New function.
(finish_initialized_static_member): New function.
* lambda.cc (record_lambda_scope): Support falling back to old
ABI (maybe with warning).
* parser.cc (cp_parser_member_declaration): Build decl early
when parsing an initialized static data member.

gcc/testsuite/ChangeLog:

* g++.dg/abi/macro0.C: Bump ABI version.
* g++.dg/abi/mangle74.C: Remove XFAILs.
* g++.dg/other/fold1.C: Restore originally raised error.
* g++.dg/abi/lambda-ctx2-19.C: New test.
* g++.dg/abi/lambda-ctx2-19vs20.C: New test.
* g++.dg/abi/lambda-ctx2-20.C: New test.
* g++.dg/abi/lambda-ctx2.h: New test.
* g++.dg/cpp0x/static-member-init-1.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/c-family/c-opts.cc|  2 +-
 gcc/common.opt|  5 +-
 gcc/cp/cp-tree.h  |  3 +
 gcc/cp/decl2.cc   | 77 +++
 gcc/cp/lambda.cc  | 33 ++--
 gcc/cp/parser.cc  | 74 --
 gcc/doc/invoke.texi   |  3 +
 gcc/testsuite/g++.dg/abi/lambda-ctx2-19.C | 10 +++
 gcc/testsuite/g++.dg/abi/lambda-ctx2-19vs20.C |  8 ++
 gcc/testsuite/g++.dg/abi/lambda-ctx2-20.C | 10 +++
 gcc/testsuite/g++.dg/abi/lambda-ctx2.h| 27 +++
 gcc/testsuite/g++.dg/abi/macro0.C |  2 +-
 gcc/testsuite/g++.dg/abi/mangle74.C   |  4 +-
 .../g++.dg/cpp0x/static-member-init-1.C   |  5 ++
 gcc/testsuite/g++.dg/other/fold1.C|  2 +-
 15 files changed, 230 insertions(+), 35 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-ctx2-19.C
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-ctx2-19vs20.C
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-ctx2-20.C
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-ctx2.h
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/static-member-init-1.C

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b81d1350b1a..87b231861a6 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1084,7 +1084,7 @@ c_common_post_options (const char **pfilename)
 
   /* Change flag_abi_version to be the actual current ABI level, for the
  benefit of c_cpp_builtins, and to make comparison simpler.  */
-  const int latest_abi_version = 19;
+  const int latest_abi_version = 20;
   /* Generate compatibility aliases for ABI v13 (8.2) by default.  */
   const int abi_compat_default = 13;
 
diff --git a/gcc/common.opt b/gcc/common.opt
index e2ac99df1d0..4c2560a0632 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1034,10 +1034,13 @@ Driver Undocumented
 ; Default in G++ 13.
 ;
 ; 19: Emits ABI tags if needed in structured binding mangled names.
-; Ignores cv-quals on [[no_unique_object]] members.
+; Ignores cv-quals on [[no_unique_address]] members.
 ; Mangles constraints on function templates.
 ; Default in G++ 14.
 ;
+; 20: Fix mangling of lambdas in static data member initializers.
+; Default in G++ 15.
+;
 ; Additional positive integers will be assigned as new versions of
 ; the ABI become the default version of the ABI.
 fabi-version=
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 1d741ecedc3..f77a325bdd0 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7224,6 +7224,9 @@ extern tree grokfield (const cp_declarator *, 
cp_decl_specifier_seq *,
   tree, bool, tree, tree);
 extern tree grokbitfield (const cp_declarator *, cp_decl_specifier_seq *,
  tree, tree, tree);
+extern tree start_initialized_static_member(const cp_declarator *,
+cp_decl_specifier_seq *, tree);
+extern void finish_initialized_static_member   (tree, tree, tree);
 extern tree splice_template_attributes (tree *, tree);
 extern bool any_dependent_type_attributes_p(tree);
 extern tree cp_reconstruct_complex_type   

[PATCH v3 2/6] c++: Fix mangling of otherwise unattached class-scope lambdas [PR118245]

2025-01-06 Thread Nathaniel Shead
Something like this should probably be backported to GCC 14 too, since
my change in r14-9232-g3685fae23bb008 inadvertantly caused ICEs that
this fixes.  But without the previous patch this patch will cause ABI
changes, and I'm not sure how easily it would be to divorce those
changes from the fix here.

I suppose probably the true issue is that r14-9232 inadvertantly changed
ABI for lambdas in base classes, and we should just revert the parser.cc
changes for 14.3?  (And accept that it'll regress those modules tests.)

-- >8 --

This is a step closer to implementing the suggested changes for
https://github.com/itanium-cxx-abi/cxx-abi/pull/85.  Most lambdas
defined within a class should have an extra scope of that class so that
uses across different TUs are properly merged by the linker.  This also
needs to happen during template instantiation.

While I was working on this I found some other cases where the mangling
of lambdas was incorrect and causing issues, notably the testcase
lambda-ctx3.C which currently emits the same mangling for the base class
and member lambdas, causing mysterious assembler errors since r14-9232.
This is also the root cause of PR c++/118245.

One notable case not handled either here or in the ABI is what is
supposed to happen with lambdas declared in alias templates; see
lambda-ctx4.C.  I believe that by the C++ standard, such lambdas should
also dedup across TUs, but this isn't currently implemented (for
class-scope or not).  I wasn't able to work out how to fix the mangling
logic for this case easily so I've just excluded alias templates from
the class-scope mangling rules in template instantiation.

Since this should only affect usage of lambdas in unevaluated contexts
(a C++20 feature) this patch does not add an ABI flag to control this
behaviour.

PR c++/118245

gcc/cp/ChangeLog:

* cp-tree.h (LAMBDA_EXPR_EXTRA_SCOPE): Adjust comment.
* parser.cc (cp_parser_class_head): Start (and do not finish)
lambda scope for all valid types.
(cp_parser_class_specifier): Finish lambda scope after parsing
members instead.
(cp_parser_member_declaration): Adjust comment to mention
missing lambda scoping for static member initializers.
* pt.cc (instantiate_class_template): Add lambda scoping.
(instantiate_template): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/abi/lambda-ctx3.C: New test.
* g++.dg/abi/lambda-ctx4.C: New test.
* g++.dg/cpp2a/lambda-uneval20.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/cp-tree.h |  3 ++-
 gcc/cp/parser.cc | 23 
 gcc/cp/pt.cc | 14 +++-
 gcc/testsuite/g++.dg/abi/lambda-ctx3.C   | 21 ++
 gcc/testsuite/g++.dg/abi/lambda-ctx4.C   | 22 +++
 gcc/testsuite/g++.dg/cpp2a/lambda-uneval20.C |  7 ++
 6 files changed, 79 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-ctx3.C
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-ctx4.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-uneval20.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index f77a325bdd0..824b7bb61ae 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -1562,7 +1562,8 @@ enum cp_lambda_default_capture_mode_type {
   (((struct tree_lambda_expr *)LAMBDA_EXPR_CHECK (NODE))->locus)
 
 /* The mangling scope for the lambda: FUNCTION_DECL, PARM_DECL, VAR_DECL,
-   FIELD_DECL or NULL_TREE.  If this is NULL_TREE, we have no linkage.  */
+   FIELD_DECL, TYPE_DECL, or NULL_TREE.  If this is NULL_TREE, we have no
+   linkage.  */
 #define LAMBDA_EXPR_EXTRA_SCOPE(NODE) \
   (((struct tree_lambda_expr *)LAMBDA_EXPR_CHECK (NODE))->extra_scope)
 
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index dd98649..c851185a86d 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -27313,6 +27313,8 @@ cp_parser_class_specifier (cp_parser* parser)
   if (!braces.require_open (parser))
 {
   pop_deferring_access_checks ();
+  if (type != error_mark_node)
+   finish_lambda_scope ();
   return error_mark_node;
 }
 
@@ -27377,7 +27379,10 @@ cp_parser_class_specifier (cp_parser* parser)
   if (cp_parser_allow_gnu_extensions_p (parser))
 attributes = cp_parser_gnu_attributes_opt (parser);
   if (type != error_mark_node)
-type = finish_struct (type, attributes);
+{
+  type = finish_struct (type, attributes);
+  finish_lambda_scope ();
+}
   if (nested_name_specifier_p)
 pop_inner_scope (old_scope, scope);
 
@@ -28217,6 +28222,12 @@ cp_parser_class_head (cp_parser* parser,
   if (flag_concepts)
 type = associate_classtype_constraints (type);
 
+  /* Lambdas in bases and members must have the same mangling scope for ABI.
+ We open this scope now, and will close it in cp_parser_class_specifier
+ after parsing the member list.  */
+  if (type && type != erro

[PATCH v3 3/6] c++: Fix ABI for lambdas declared in alias templates [PR116568]

2025-01-06 Thread Nathaniel Shead
I'm not 100% sure I've handled this properly, any feedback welcome.
In particular, maybe should I use `DECL_IMPLICIT_TYPEDEF_P` in the
mangling logic instead of `!TYPE_DECL_ALIAS_P`?  They both seem to work
in this case but not sure which would be clearer.

I also looked into trying do a limited form of 'start_decl' before
parsing the type but there were too many circular dependencies for me to
work through, so I think any such changes would have to wait till GCC16
(if they're even possible at all).

-- >8 --

This adds mangling support for lambdas with a mangling context of an
alias template, and gives that context when instantiating such a lambda.

This only currently works for class-scope alias templates, however, due
to the

  if (LAMBDA_EXPR_EXTRA_SCOPE (t))
record_lambda_scope (r);

condition in 'tsubst_lambda_scope'.

For namespace-scope alias templates, we can't easily add the mangling
context: we can't build the TYPE_DECL to record against until after
we've parsed the type (and already recorded lambda scope), as
`start_decl` relies on the type being passed in correctly, and setting
the mangling scope after parsing is too late because e.g.
'template_class_depth' (called from grokfndecl when building the lambda
functions while parsing the type) relies on the LAMBDA_EXPR_EXTRA_SCOPE
already being properly set.  This will also likely matter for
'determine_visibility'.  I'm not sure what a good way to break this
recursive dependency is.

This change also requires a slight adjustment to the late-ret-3
testcase, as changing the order of creating the node for the alias
adjusted the tiebreak sorting of cluster members when logging.

PR c++/116568

gcc/cp/ChangeLog:

* mangle.cc (maybe_template_info): Support getting template info
of alias templates.
(canonicalize_for_substitution): Don't canonicalise aliases.
(decl_mangling_context): Don't treat aliases as lambda closure
types.
(write_unqualified_name): Likewise.
* pt.cc (tsubst_decl): Start lambda scope for alias templates.
(instantiate_template): No longer need to special case alias
templates here.

gcc/testsuite/ChangeLog:

* g++.dg/abi/lambda-ctx4.C: Adjust mangling, include namespace
scope alias templates (XFAILed for now).
* g++.dg/modules/late-ret-3_a.H: Adjust cluster order.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/mangle.cc| 13 +++-
 gcc/cp/pt.cc| 23 +
 gcc/testsuite/g++.dg/abi/lambda-ctx4.C  | 21 ---
 gcc/testsuite/g++.dg/modules/late-ret-3_a.H |  2 +-
 4 files changed, 33 insertions(+), 26 deletions(-)

diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc
index 170dafd52c1..9d457e2a2f3 100644
--- a/gcc/cp/mangle.cc
+++ b/gcc/cp/mangle.cc
@@ -292,7 +292,7 @@ abi_check (int ver)
 static tree
 maybe_template_info (const tree decl)
 {
-  if (TREE_CODE (decl) == TYPE_DECL)
+  if (TREE_CODE (decl) == TYPE_DECL && !TYPE_DECL_ALIAS_P (decl))
 {
   /* TYPE_DECLs are handled specially.  Look at its type to decide
 if this is a template instantiation.  */
@@ -305,7 +305,7 @@ maybe_template_info (const tree decl)
 {
   /* Check if the template is a primary template.  */
   if (DECL_LANG_SPECIFIC (decl) != NULL
- && VAR_OR_FUNCTION_DECL_P (decl)
+ && (VAR_OR_FUNCTION_DECL_P (decl) || TREE_CODE (decl) == TYPE_DECL)
  && DECL_TEMPLATE_INFO (decl)
  && PRIMARY_TEMPLATE_P (DECL_TI_TEMPLATE (decl)))
return DECL_TEMPLATE_INFO (decl);
@@ -402,8 +402,8 @@ write_exception_spec (tree spec)
 static inline tree
 canonicalize_for_substitution (tree node)
 {
-  /* For a TYPE_DECL, use the type instead.  */
-  if (TREE_CODE (node) == TYPE_DECL)
+  /* For a non-alias TYPE_DECL, use the type instead.  */
+  if (TREE_CODE (node) == TYPE_DECL && !TYPE_DECL_ALIAS_P (node))
 node = TREE_TYPE (node);
   if (TYPE_P (node)
   && TYPE_CANONICAL (node) != node
@@ -1044,6 +1044,7 @@ decl_mangling_context (tree decl)
 decl = DECL_TEMPLATE_RESULT (decl);
 
   if (TREE_CODE (decl) == TYPE_DECL
+  && !TYPE_DECL_ALIAS_P (decl)
   && LAMBDA_TYPE_P (TREE_TYPE (decl)))
 {
   tree extra = LAMBDA_TYPE_EXTRA_SCOPE (TREE_TYPE (decl));
@@ -1588,7 +1589,9 @@ write_unqualified_name (tree decl)
   if (TREE_CODE (decl) == TYPE_DECL
   && TYPE_UNNAMED_P (type))
 write_unnamed_type_name (type);
-  else if (TREE_CODE (decl) == TYPE_DECL && LAMBDA_TYPE_P (type))
+  else if (TREE_CODE (decl) == TYPE_DECL
+  && !TYPE_DECL_ALIAS_P (decl)
+  && LAMBDA_TYPE_P (type))
 write_closure_type_name (type);
   else
 write_source_name (DECL_NAME (decl));
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index cb234b53a55..cba7b97ef70 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -15753,6 +15753,7 @@ tsubst_decl (tree t, tree args, tsubst_flags_t com

[PATCH v3 4/6] c++: Update mangling of lambdas in expressions

2025-01-06 Thread Nathaniel Shead
https://github.com/itanium-cxx-abi/cxx-abi/pull/85 clarifies that
mangling a lambda expression should use 'L' rather than "tl".  This only
affects C++20 (and later) so no ABI flag is given.

gcc/cp/ChangeLog:

* mangle.cc (write_expression): Update mangling for lambdas.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-generic-mangle1.C: Update mangling.
* g++.dg/cpp2a/lambda-generic-mangle1a.C: Likewise.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/mangle.cc | 2 +-
 gcc/testsuite/g++.dg/cpp2a/lambda-generic-mangle1.C  | 2 +-
 gcc/testsuite/g++.dg/cpp2a/lambda-generic-mangle1a.C | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc
index 9d457e2a2f3..e2e8fb2c71b 100644
--- a/gcc/cp/mangle.cc
+++ b/gcc/cp/mangle.cc
@@ -3769,7 +3769,7 @@ write_expression (tree expr)
 equivalent.
 
 So just use the closure type mangling.  */
-  write_string ("tl");
+  write_char ('L');
   write_type (LAMBDA_EXPR_CLOSURE (expr));
   write_char ('E');
 }
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-generic-mangle1.C 
b/gcc/testsuite/g++.dg/cpp2a/lambda-generic-mangle1.C
index 0051307f53d..306959a4f9f 100644
--- a/gcc/testsuite/g++.dg/cpp2a/lambda-generic-mangle1.C
+++ b/gcc/testsuite/g++.dg/cpp2a/lambda-generic-mangle1.C
@@ -6,4 +6,4 @@ struct C {
   void f(decltype([](T, auto) { return 0; })) {}
 };
 void g() { C().f({}); }
-// { dg-final { scan-assembler "_ZN1C1fIiEEvDTtlNS_UlT_TL0__E_EEE" } }
+// { dg-final { scan-assembler "_ZN1C1fIiEEvDTLNS_UlT_TL0__E_EEE" } }
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-generic-mangle1a.C 
b/gcc/testsuite/g++.dg/cpp2a/lambda-generic-mangle1a.C
index dc7b0125631..b7bd4ecdd46 100644
--- a/gcc/testsuite/g++.dg/cpp2a/lambda-generic-mangle1a.C
+++ b/gcc/testsuite/g++.dg/cpp2a/lambda-generic-mangle1a.C
@@ -7,4 +7,4 @@ struct C {
   void f(decltype([](T, auto) { return 0; })) {}
 };
 void g() { C().f({}); }
-// { dg-final { scan-assembler "_ZN1C1fIiEEvDTtlNS_UlT_T_E_EEE" } }
+// { dg-final { scan-assembler "_ZN1C1fIiEEvDTLNS_UlT_T_E_EEE" } }
-- 
2.47.0



[PATCH v3 6/6] c++/modules: Diagnose TU-local lambdas, give mangling scope to lambdas in concepts

2025-01-06 Thread Nathaniel Shead
Happy to defer this till GCC16 if preferred.

-- >8 --

This fills in a hole left in r15-6378-g9016c5ac94c557 with regards to
detection of TU-local lambdas.  Now that LAMBDA_EXPR_EXTRA_SCOPE is
properly set for most lambdas we can use it to detect lambdas that are
TU-local.

Lambdas in concept definitions I believe should not be considered
TU-local, since they are always unevaluated and should never be emitted.
This patch gives these lambdas a mangling scope (though it will never be
actually used in name mangling).

Namespace-scope alias declarations currently do not have an extra scope
set either, so this patch will cause them to be falsely detected as
TU-local and error.  However, I believe this is better than the
alternative, as such lambdas cannot be properly referred to from other
TUs now anyway causing strange issues down the line if they were
allowed.  (Clang makes such lambdas internal linkage.)

If in the future we ever correctly give an extra scope to these lambdas
they will automatically start working in modules again, but in the
meantime they should be avoided (or wrapped in an unnamed namespace).

gcc/cp/ChangeLog:

* cp-tree.h (finish_concept_definition): Adjust parameters.
(start_concept_definition): Declare.
* module.cc (depset::hash::is_tu_local_entity): Use
LAMBDA_EXPR_EXTRA_SCOPE to detect TU-local lambdas.
* parser.cc (cp_parser_concept_definition): Start a lambda scope
for concept definitions.
* pt.cc (tsubst_lambda_expr): Namespace-scope lambdas may now
have extra scope.
(finish_concept_definition): Split into...
(start_concept_definition): ...this new function.

gcc/testsuite/ChangeLog:

* g++.dg/modules/internal-4_b.C: Remove XFAILs, add new XFAIL
for lambda alias.
* g++.dg/modules/lambda-9.h: New test.
* g++.dg/modules/lambda-9_a.H: New test.
* g++.dg/modules/lambda-9_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/cp-tree.h|  3 ++-
 gcc/cp/module.cc|  7 ---
 gcc/cp/parser.cc| 14 +-
 gcc/cp/pt.cc| 21 +++--
 gcc/testsuite/g++.dg/modules/internal-4_b.C |  6 +-
 gcc/testsuite/g++.dg/modules/lambda-9.h |  2 ++
 gcc/testsuite/g++.dg/modules/lambda-9_a.H   |  4 
 gcc/testsuite/g++.dg/modules/lambda-9_b.C   |  6 ++
 8 files changed, 51 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-9.h
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-9_a.H
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-9_b.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 824b7bb61ae..ec8935243de 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -8679,7 +8679,8 @@ struct diagnosing_failed_constraint
 extern cp_expr finish_constraint_or_expr   (location_t, cp_expr, cp_expr);
 extern cp_expr finish_constraint_and_expr  (location_t, cp_expr, cp_expr);
 extern cp_expr finish_constraint_primary_expr  (cp_expr);
-extern tree finish_concept_definition  (cp_expr, tree, tree);
+extern tree start_concept_definition   (cp_expr);
+extern tree finish_concept_definition  (tree, tree, tree);
 extern tree combine_constraint_expressions  (tree, tree);
 extern tree append_constraint  (tree, tree);
 extern tree get_constraints (const_tree);
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 5350e6c4bad..aa0fee1c6cf 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -13365,9 +13365,10 @@ depset::hash::is_tu_local_entity (tree decl, bool 
explain/*=false*/)
   tree main_decl = TYPE_MAIN_DECL (type);
   if (!DECL_CLASS_SCOPE_P (main_decl)
  && !decl_function_context (main_decl)
- /* FIXME: Lambdas defined outside initializers.  We'll need to more
-thoroughly set LAMBDA_TYPE_EXTRA_SCOPE to check this.  */
- && !LAMBDA_TYPE_P (type))
+ /* LAMBDA_EXPR_EXTRA_SCOPE will be set for lambdas defined in
+contexts where they would not be TU-local.  */
+ && !(LAMBDA_TYPE_P (type)
+  && LAMBDA_TYPE_EXTRA_SCOPE (type)))
{
  if (explain)
inform (loc, "%qT has no name and is not defined within a class, "
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index c851185a86d..f29d2fd0b00 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -31680,8 +31680,20 @@ cp_parser_concept_definition (cp_parser *parser)
   return error_mark_node;
 }
 
+  tree decl = start_concept_definition (id);
+  if (decl == error_mark_node)
+{
+  cp_parser_skip_to_end_of_statement (parser);
+  cp_parser_consume_semicolon_at_end_of_statement (parser);
+  return error_mark_node;
+}
+
   processing_constraint_expression_sentinel parsing_constraint;
+
+  start_lambda_scope (decl);
   tree init = cp_parser_constr

Re: [PATCH v2] Replace uptr by usize/SIZE_T in interfaces

2025-01-06 Thread Jakub Jelinek
On Mon, Jan 06, 2025 at 11:59:08AM +0100, Stefan Schulze Frielinghaus wrote:
> For some targets uptr is mapped to unsigned int and size_t to unsigned
> long and sizeof(int)==sizeof(long) holds.  Still, these are distinct
> types and type checking may fail.  Therefore, replace uptr by
> usize/SIZE_T wherever a size_t is expected.
> 
> Part of #116957
> 
> Cherry picked from LLVM commit 9a156f6b2b0c892d8713ba907f07f027b24953d8
> (removed memprof, msan, and nsan parts).
> 
> libsanitizer/ChangeLog:
> 
>   PR sanitizer/117725
>   * asan/asan_interceptors.cpp: Cherry picked LLVM commit
>   9a156f6b2b0c892d8713ba907f07f027b24953d8.
>   * asan/asan_interceptors.h: Ditto.
>   * asan/asan_interceptors_memintrinsics.h: Ditto.
>   * asan/sanitizer_common_interceptors.inc: Ditto.
>   * sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc:
>   Ditto.
>   * sanitizer_common/sanitizer_platform_limits_posix.h:
>   Ditto.
>   * tsan/tsan_interceptors_posix.cpp: Ditto.

LGTM, thanks.

Jakub



libgo patch committed: fix Config.Time in tests with expired certificates

2025-01-06 Thread Ian Lance Taylor
PR 118286 points out that some libgo tests are starting to fail
because they use test certificates that expired on January 1. This
libgo patch is a backport of https://go.dev/cl/640237 in the main
repo.  It uses the existing config.Time field to avoid these test
failures.  Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.
Committed to mainline.

Ian
ed1493e12ed75e837e9b9aa794ed24daf397df7c
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index d189a9c6df0..7c93a2e9123 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-f4956f807f1a33e406cf1b3bf3479a9ac1c1015a
+96014b17d9a846d1d878ac4732c2baaf5ee8b2d2
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/crypto/tls/handshake_client_test.go 
b/libgo/go/crypto/tls/handshake_client_test.go
index 0950bb0ac45..40409a42c5b 100644
--- a/libgo/go/crypto/tls/handshake_client_test.go
+++ b/libgo/go/crypto/tls/handshake_client_test.go
@@ -881,6 +881,7 @@ func testResumption(t *testing.T, version uint16) {
MaxVersion:   version,
CipherSuites: []uint16{TLS_RSA_WITH_RC4_128_SHA, 
TLS_ECDHE_RSA_WITH_RC4_128_SHA},
Certificates: testConfig.Certificates,
+   Time: testTime,
}
 
issuer, err := x509.ParseCertificate(testRSACertificateIssuer)
@@ -897,6 +898,7 @@ func testResumption(t *testing.T, version uint16) {
ClientSessionCache: NewLRUClientSessionCache(32),
RootCAs:rootCAs,
ServerName: "example.golang",
+   Time:   testTime,
}
 
testResumeState := func(test string, didResume bool) {
@@ -944,20 +946,20 @@ func testResumption(t *testing.T, version uint16) {
}
 
// An old session ticket can resume, but the server will provide a 
ticket encrypted with a fresh key.
-   serverConfig.Time = func() time.Time { return 
time.Now().Add(24*time.Hour + time.Minute) }
+   serverConfig.Time = func() time.Time { return 
testTime().Add(24*time.Hour + time.Minute) }
testResumeState("ResumeWithOldTicket", true)
if bytes.Equal(ticket[:ticketKeyNameLen], 
getTicket()[:ticketKeyNameLen]) {
t.Fatal("old first ticket matches the fresh one")
}
 
// Now the session tickey key is expired, so a full handshake should 
occur.
-   serverConfig.Time = func() time.Time { return 
time.Now().Add(24*8*time.Hour + time.Minute) }
+   serverConfig.Time = func() time.Time { return 
testTime().Add(24*8*time.Hour + time.Minute) }
testResumeState("ResumeWithExpiredTicket", false)
if bytes.Equal(ticket, getTicket()) {
t.Fatal("expired first ticket matches the fresh one")
}
 
-   serverConfig.Time = func() time.Time { return time.Now() } // reset the 
time back
+   serverConfig.Time = func() time.Time { return testTime() } // reset the 
time back
key1 := randomKey()
serverConfig.SetSessionTicketKeys([][32]byte{key1})
 
@@ -974,11 +976,11 @@ func testResumption(t *testing.T, version uint16) {
testResumeState("KeyChangeFinish", true)
 
// Age the session ticket a bit, but not yet expired.
-   serverConfig.Time = func() time.Time { return 
time.Now().Add(24*time.Hour + time.Minute) }
+   serverConfig.Time = func() time.Time { return 
testTime().Add(24*time.Hour + time.Minute) }
testResumeState("OldSessionTicket", true)
ticket = getTicket()
// Expire the session ticket, which would force a full handshake.
-   serverConfig.Time = func() time.Time { return 
time.Now().Add(24*8*time.Hour + time.Minute) }
+   serverConfig.Time = func() time.Time { return 
testTime().Add(24*8*time.Hour + 2*time.Minute) }
testResumeState("ExpiredSessionTicket", false)
if bytes.Equal(ticket, getTicket()) {
t.Fatal("new ticket wasn't provided after old ticket expired")
@@ -988,7 +990,7 @@ func testResumption(t *testing.T, version uint16) {
d := 0 * time.Hour
for i := 0; i < 13; i++ {
d += 12 * time.Hour
-   serverConfig.Time = func() time.Time { return time.Now().Add(d) 
}
+   serverConfig.Time = func() time.Time { return testTime().Add(d) 
}
testResumeState("OldSessionTicket", true)
}
// Expire it (now a little more than 7 days) and make sure a full
@@ -996,7 +998,7 @@ func testResumption(t *testing.T, version uint16) {
// TLS 1.3 since the client should be using a fresh ticket sent over
// by the server.
d += 12 * time.Hour
-   serverConfig.Time = func() time.Time { return time.Now().Add(d) }
+   serverConfig.Time = func() time.Time { return testTime().Add(d) }
if version == VersionTLS13 {
testResumeState("ExpiredSessionTicket", true)
} else 

Re: [PATCH v2 3/7] Alpha: Fix a block move pessimisation with zero-extension after LDWU

2025-01-06 Thread Jeff Law




On 1/6/25 6:03 AM, Maciej W. Rozycki wrote:

For the BWX case we have a pessimisation in `alpha_expand_block_move'
for HImode loads where we place the data loaded into a HImode register
as well, therefore losing information that indeed the data loaded has
already been zero-extended to the full DImode width of the register.
Later on when we store this data in QImode quantities into an unaligned
destination, we zero-extend it again for the purpose of right-shifting,
such as with the test case included producing code at `-O2' as follows:

ldah $2,unaligned_src_hi($29)   !gprelhigh
lda $1,unaligned_src_hi($2) !gprellow
ldwu $6,unaligned_src_hi($2)!gprellow
ldwu $5,2($1)
ldwu $4,4($1)
bis $31,$31,$31
zapnot $6,3,$3  # Redundant!
ldbu $7,6($1)
zapnot $5,3,$2  # Redundant!
stb $6,0($16)
zapnot $4,3,$1  # Redundant!
stb $5,2($16)
srl $3,8,$3
stb $4,4($16)
srl $2,8,$2
stb $3,1($16)
srl $1,8,$1
stb $2,3($16)
stb $1,5($16)
stb $7,6($16)

The non-BWX case is unaffected, because there we use byte insertion, so
we don't care that data is held in a HImode register.

Address this by making the holding RTX a HImode subreg of the original
DImode register, which the RTL passes can then see through and eliminate
the zero-extension where otherwise required, resulting in this shortened
code:

ldah $2,unaligned_src_hi($29)   !gprelhigh
lda $1,unaligned_src_hi($2) !gprellow
ldwu $4,unaligned_src_hi($2)!gprellow
ldwu $3,2($1)
ldwu $2,4($1)
bis $31,$31,$31
srl $4,8,$6
ldbu $1,6($1)
srl $3,8,$5
stb $4,0($16)
stb $6,1($16)
srl $2,8,$4
stb $3,2($16)
stb $5,3($16)
stb $2,4($16)
stb $4,5($16)
stb $1,6($16)

While at it reformat the enclosing do-while statement according to the
GNU Coding Standards, observing that in this case it does not obfuscate
the change owing to the odd original indentation.

gcc/
* config/alpha/alpha.cc (alpha_expand_block_move): Use a HImode
subreg of a DImode register to hold data from an aligned HImode
load.

Generally OK.



+   data_regs[nregs++] = gen_rtx_SUBREG (HImode, tmp, 0);
Would this be better as a gen_lowpart or something like that rather than 
generating the SUBREG directly?  Though I guess you don't have to worry 
about big endian at all, so i may not matter in practice.  Your call.


jeff



Re: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2025-01-06 Thread Richard Sandiford
Jennifer Schmitz  writes:
>> It would also be good to check for performance regressions, now that we have 
>> a patch to test:
>> I will run SPEC2017 with -mcpu=generic and -mcpu=native on Grace, but we 
>> would appreciate help with benchmarking on other platforms.
>> Tamar, would you still be willing to test the patch on other platforms?
>> 
>> If there are no other changes necessary and assuming there are no 
>> performance regressions, I was planning to commit the patch in January after 
>> returning from christmas break.
>> 
>> In the meantime I wish everyone happy holidays.
>> Jennifer
> On Grace, the patch has no non-noise impact on performance for SPEC2017 with 
> -mcpu=generic and -mcpu=native. I also re-validated on aarch64 today, no 
> regression.
> Do you advise to run additional performance tests or is the patch ready to be 
> pushed to trunk?

Go for it :)  The patch is clearly moving in the right direction and it's a
question of "when" not "if" we remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS.
We'll also get more performance coverage on non-aarch64 targets once the
patch is pushed.

Thanks again for doing this.

Richard



RE: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2025-01-06 Thread Tamar Christina
> -Original Message-
> From: Richard Sandiford 
> Sent: Monday, January 6, 2025 5:54 PM
> To: Jennifer Schmitz 
> Cc: Richard Biener ; Richard Biener
> ; Tamar Christina ;
> gcc-patches@gcc.gnu.org; Kyrylo Tkachov 
> Subject: Re: [RFC][PATCH] AArch64: Remove
> AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS
> 
> Jennifer Schmitz  writes:
> >> It would also be good to check for performance regressions, now that we 
> >> have a
> patch to test:
> >> I will run SPEC2017 with -mcpu=generic and -mcpu=native on Grace, but we
> would appreciate help with benchmarking on other platforms.
> >> Tamar, would you still be willing to test the patch on other platforms?
> >>
> >> If there are no other changes necessary and assuming there are no 
> >> performance
> regressions, I was planning to commit the patch in January after returning 
> from
> christmas break.
> >>
> >> In the meantime I wish everyone happy holidays.
> >> Jennifer
> > On Grace, the patch has no non-noise impact on performance for SPEC2017 with
> -mcpu=generic and -mcpu=native. I also re-validated on aarch64 today, no
> regression.
> > Do you advise to run additional performance tests or is the patch ready to 
> > be
> pushed to trunk?
> 
> Go for it :)  The patch is clearly moving in the right direction and it's a
> question of "when" not "if" we remove
> AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS.
> We'll also get more performance coverage on non-aarch64 targets once the
> patch is pushed.
> 
> Thanks again for doing this.

Same, Sorry I had missed the question from me above.

Just go for it and I'll triage any issues after it's been committed.

We can address any fallout in stage-4.

Thanks for sticking with it.

Tamar.

> 
> Richard



Re: [PATCH] c: Restore warning for incomplete structures declared in parameter list [PR117866]

2025-01-06 Thread Joseph Myers
On Mon, 6 Jan 2025, Martin Uecker wrote:

> 
> Happy new year! Please consider the following patch.
> 
> Bootstrapped and regression tested on x86_64.
> 
> 
> c: Restore warning for incomplete structures declared in parameter list 
> [PR117866]
> 
> In C23 mode the warning about declaring structures and union in
> parameter lists was removed, because it is possible to redeclare
> a compatible type elsewhere.  This is not the case for incomplete types,
> so restore the warning for those types.
> 
> PR c/117886
> 
> gcc/c/ChangeLog:
> * c-decl.cc (get_parm_info): Change condition for warning.
> 
> gcc/testsuite/ChangeLog:
> * gcc.dg/pr117866.c: New test.
> * gcc.dg/struct-pr118007.c: Adapt.

OK.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH v2 2/7] Alpha: Optimize block moves coming from longword-aligned source

2025-01-06 Thread Jeff Law




On 1/6/25 6:03 AM, Maciej W. Rozycki wrote:

Now that we have proper alignment determination for block moves in place
the case of copying a block of longword-aligned data has become real, so
implement the merging of loaded data from pairs of SImode registers into
single DImode registers for the purpose of using with unaligned stores
efficiently, as suggested by a comment in `alpha_expand_block_move' and
discard the comment.  Provide test cases accordingly.

gcc/
* config/alpha/alpha.cc (alpha_expand_block_move): Merge loaded
data from pairs of SImode registers into single DImode registers
if to be used with unaligned stores.

gcc/testsuite/
* gcc.target/alpha/memcpy-si-aligned.c: New file.
* gcc.target/alpha/memcpy-si-unaligned.c: New file.
* gcc.target/alpha/memcpy-si-unaligned-dst.c: New file.
* gcc.target/alpha/memcpy-si-unaligned-src.c: New file.
* gcc.target/alpha/memcpy-si-unaligned-src-bwx.c: New file.

OK
jeff



Re: [PATCH] testsuite: generalized field-merge tests for <32-bit int [PR118025]

2025-01-06 Thread Alexandre Oliva
On Dec 22, 2024, Alexandre Oliva  wrote:

> for  gcc/testsuite/ChangeLog

>   PR testsuite/118025
>   * gcc.dg/field-merge-1.c: Convert constants to desired types.
>   * gcc.dg/field-merge-3.c: Likewise.
>   * gcc.dg/field-merge-4.c: Likewise.
>   * gcc.dg/field-merge-5.c: Likewise.
>   * gcc.dg/field-merge-11.c: Likewise.
>   * gcc.dg/field-merge-17.c: Don't mess with padding bits.

Ping?
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/672194.html

-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive


[PATCH 0/4] vect, cfgloopmanip: Fix profile consistency of early break loops [PR117790]

2025-01-06 Thread Alex Coplan
This patch series aims to fix the consistency of the CFG profile for vectorized
early break loops.  I.e., if the CFG profile entering the vectorizer is
consistent for a given early break loop, this series aims to ensure that the
final vectorized loop also has a consistent profile.

For SPEC CPU 2017 on Neoverse V1, the series shows a 4.8% improvement
in imagick with no significant regressions elsewhere.

The series is structured as follows:
 - 1/4 adds missing code to set the counts of the exit blocks when
   creating the CFG for a vectorized early break loop.
 - 2/4 does most of the heavy lifting, adding new code to cfgloopmanip
   to correctly scale the profile of a loop with multiple exits,
   adjusting scale_loop_profile to use it and exposing new helpers in
   cfgloopmanip.h used by subsequent vect patches.
 - 3/4 makes various fixes to ensure adding an epilog skip edge
   preserves consistency of the profile.
 - 4/4 adjusts scale_profile_for_vect_loop to make it correctly handle
   loops with multiple exits.

Many thanks to Andrew Carlotti who gave me significant help in debugging
and discussing the ideas used in this patch series.

Bootstrapped/regtested on aarch64-linux-gnu, arm-linux-gnueabihf, and
x86_64-linux-gnu.

Alex Coplan (4):
  vect: Set counts of early break exit blocks correctly [PR117790]
  cfgloopmanip: Add infrastructure for scaling of multi-exit loops [PR117790]
  vect: Ensure profile consistency when adding epilog guard [PR117790]
  vect: Fix scale_profile_for_vect_loop for multiple exits [PR117790]

 gcc/cfgloopmanip.cc   | 309 --
 gcc/cfgloopmanip.h|   7 +
 .../gcc.dg/vect/vect-early-break-profile-1.c  |  10 +
 .../gcc.dg/vect/vect-early-break-profile-2.c  |  21 ++
 gcc/tree-vect-loop-manip.cc   |  58 +++-
 gcc/tree-vect-loop.cc |  21 +-
 6 files changed, 378 insertions(+), 48 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break-profile-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break-profile-2.c


[PATCH 2/4] cfgloopmanip: Add infrastructure for scaling of multi-exit loops [PR117790]

2025-01-06 Thread Alex Coplan
As it stands, scale_loop_profile doesn't correctly handle loops with
multiple exits.  In particular, in the case where the expected niters
exceeds iteration_bound, scale_loop_profile attempts to reduce the
number of iterations with a call to scale_loop_frequencies, which
multiplies the count of each BB by a given probability.  This
transformation preserves the relationships between the counts of the BBs
within the loop (and thus the edge probabilities stay the same) but this
cannot possibly work for loops with multiple exits, since in order for
the expected niters to reduce (and counts along exit edges to remain the
same), the exit edge probabilities must increase, thus decreasing the
probabilities of the internal edges, meaning that the ratios of the
counts of the BBs inside the loop must change.  So we need a different
approach (not a straightforward multiplicative scaling) to adjust the
expected niters of a loop with multiple exits.

This patch introduces a new helper, flow_scale_loop_freqs, which can be
used to correctly scale the profile of a loop with multiple exits.  It
is parameterized by a probability (with which to scale the header and
therefore the expected niters) and a lambda which gives the desired
counts for the exit edges.  In this patch, to make things simpler,
flow_scale_loop_freqs only handles loop shapes without internal control
flow, and we introduce a predicate can_flow_scale_loop_freqs_p to test
whether a given loop meets these criteria.  This restriction is
reasonable since this patch is motivated by fixing the profile
consistency for early break vectorization, and we don't currently
vectorize loops with internal control flow.  We also fall back to a
multiplicative scaling (the status quo) for loops that
flow_scale_loop_freqs can't handle, so the patch should be a net
improvement.

We wrap the call to flow_scale_loop_freqs in a helper
scale_loop_freqs_with_exit_counts which handles the above-mentioned
fallback.  This wrapper is still generic in that it accepts a lambda to
allow overriding the desired exit edge counts.  We specialize this with
another wrapper, scale_loop_freqs_hold_exit_counts (keeping the
counts along exit edges fixed), which is then used to implement the
niters-scaling case of scale_loop_profile, thus fixing this path through
the function for loops with multiple exits.

Finally, we expose two new wrapper functions in cfgloopmanip.h for use
in subsequent vectorizer patches.  scale_loop_profile_hold_exit_counts
is a variant of scale_loop_profile which assumes we want to keep the
counts along exit edges of the loop fixed through both parts of the
transformation (including the initial probability scale).
scale_loop_freqs_with_new_exit_count is intended to be used in a
subsequent patch when adding a skip edge around the epilog, where the
reduction of count entering the loop is mirrored by a reduced count
along a given exit edge.

Bootstrapped/regtested as a series on aarch64-linux-gnu,
x86_64-linux-gnu, and arm-linux-gnueabihf.  OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

PR tree-optimization/117790
* cfgloopmanip.cc (can_flow_scale_loop_freqs_p): New.
(flow_scale_loop_freqs): New.
(scale_loop_freqs_with_exit_counts): New.
(scale_loop_freqs_hold_exit_counts): New.
(scale_loop_profile): Refactor to use the newly-added
scale_loop_profile_1, and use scale_loop_freqs_hold_exit_counts to
correctly handle reducing the expected niters for loops with multiple
exits.
(scale_loop_freqs_with_new_exit_count): New.
(scale_loop_profile_1): New.
(scale_loop_profile_hold_exit_counts): New.
* cfgloopmanip.h (scale_loop_profile_hold_exit_counts): New.
(scale_loop_freqs_with_new_exit_count): New.
---
 gcc/cfgloopmanip.cc | 309 
 gcc/cfgloopmanip.h  |   7 +
 2 files changed, 294 insertions(+), 22 deletions(-)

diff --git a/gcc/cfgloopmanip.cc b/gcc/cfgloopmanip.cc
index 534e556e1e4..1714f312d42 100644
--- a/gcc/cfgloopmanip.cc
+++ b/gcc/cfgloopmanip.cc
@@ -677,17 +677,243 @@ update_loop_exit_probability_scale_dom_bbs (class loop *loop,
   return exit_edge;
 }
 
-/* Scale profile in LOOP by P.
-   If ITERATION_BOUND is not -1, scale even further if loop is predicted
-   to iterate too many times.
-   Before caling this function, preheader block profile should be already
-   scaled to final count.  This is necessary because loop iterations are
-   determined by comparing header edge count to latch ege count and thus
-   they need to be scaled synchronously.  */
+/* Check if LOOP is a simple loop for scaling purposes; that is, it
+   has no internal control flow and at most one exit from each BB.
+
+   Also check if the sum of the new exit counts given by
+   GET_EXIT_COUNT will differ significantly from the count coming in to
+   the loop.  Return false if so, since flow_scale_loop_freqs relies on
+   loops having consistent profiles in 

[PATCH 1/4] vect: Set counts of early break exit blocks correctly [PR117790]

2025-01-06 Thread Alex Coplan
This adds missing code to correctly set the counts of the exit blocks we
create when building the CFG for a vectorized early break loop.

Tested as a series on aarch64-linux-gnu, arm-linux-gnueabihf, and
x86_64-linux-gnu.  OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

PR tree-optimization/117790
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Set profile counts for {main,alt}_loop_exit_block.
---
 gcc/tree-vect-loop-manip.cc | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 5d1b70aea43..53d36eaa25f 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -1686,6 +1686,16 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
 
 	  set_immediate_dominator (CDI_DOMINATORS, new_preheader,
    loop->header);
+
+	  /* Fix up the profile counts of the new exit blocks.
+	 main_loop_exit_block was created by duplicating the
+	 preheader, so needs its count scaling according to the main
+	 exit edge's probability.  The remaining count from the
+	 preheader goes to the alt_loop_exit_block, since all
+	 alternative exits have been redirected there.  */
+	  main_loop_exit_block->count = loop_exit->count ();
+	  alt_loop_exit_block->count
+	= preheader->count - main_loop_exit_block->count;
 	}
 
   /* Adjust the epilog loop PHI entry values to continue iteration.


[PATCH 3/4] vect: Ensure profile consistency when adding epilog guard [PR117790]

2025-01-06 Thread Alex Coplan
This patch tries to make the CFG profile consistent when adding a guard
edge to skip the epilog during peeling.

The changes can be summarized as follows:
 - We avoid adding the guard edge entirely if the guard condition folds
   to false, otherwise the profile will become inconsistent since
   the cfgcleanup code doesn't attempt to update it on removing the dead
   edge.
 - If the guard condition instead folds to true, we account for this by
   giving the skip edge 100% probability (otherwise the profile will
   again become inconsistent when removing the other now-dead edge).
 - Finally, we use the new helper scale_loop_freqs_with_new_exit_count instead
   of scale_loop_profile to update the epilog frequencies / probabiltiies.
   We make the assumption here that if the IV exit is taken in the vector loop,
   then it will also be taken in the epilog (and not an early exit).  Since we
   add the guard to the vector iv exit, we know any reduction in count
   associated with the epilog skip should be accounted for by a reduction in the
   epilog's iv exit edge count.

Bootstrapped/regtested as a series on aarch64-linux-gnu, arm-linux-gnueabihf,
and x86_64-linux-gnu.  OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

PR tree-optimization/117790
* tree-vect-loop-manip.cc (vect_do_peeling): Attempt to maintain
consistency of the CFG profile when adding an epilog skip edge.

gcc/testsuite/ChangeLog:

PR tree-optimization/117790
* gcc.dg/vect/vect-early-break-profile-1.c: New test.
---
 .../gcc.dg/vect/vect-early-break-profile-1.c  | 10 
 gcc/tree-vect-loop-manip.cc   | 48 ++-
 2 files changed, 47 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break-profile-1.c

diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break-profile-1.c b/gcc/testsuite/gcc.dg/vect/vect-early-break-profile-1.c
new file mode 100644
index 000..5387e3a0465
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break-profile-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-additional-options "-fdump-tree-vect-blocks-details" } */
+int a[100];
+void f()
+{
+  for (int i = 0; i < 100 && a[i]; i++)
+a[i]++;
+}
+/* { dg-final { scan-tree-dump-not "Invalid sum" "vect" } } */
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 53d36eaa25f..4d472ab56ab 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -3546,18 +3546,23 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
 
   /* If we have a peeled vector iteration we will never skip the epilog loop
 	 and we can simplify the cfg a lot by not doing the edge split.  */
-  if (skip_epilog || LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
+  guard_cond = fold_build2 (EQ_EXPR, boolean_type_node,
+niters, niters_vector_mult_vf);
+  if ((skip_epilog || LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
+	  && !integer_zerop (guard_cond))
 	{
-	  guard_cond = fold_build2 (EQ_EXPR, boolean_type_node,
-niters, niters_vector_mult_vf);
+	  profile_probability prob_skip
+	= integer_onep (guard_cond)
+	? profile_probability::always ()
+	: prob_epilog.invert ();
 
 	  guard_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
+	  edge enter_e = single_succ_edge (guard_bb);
 	  edge epilog_e = LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo);
 	  guard_to = epilog_e->dest;
 	  guard_e = slpeel_add_loop_guard (guard_bb, guard_cond, guard_to,
 	   skip_vector ? anchor : guard_bb,
-	   prob_epilog.invert (),
-	   irred_flag);
+	   prob_skip, irred_flag);
 	  doms.safe_push (guard_to);
 	  if (vect_epilogues)
 	epilogue_vinfo->skip_this_loop_edge = guard_e;
@@ -3586,15 +3591,36 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
 		}
 	}
 
-	  /* Only need to handle basic block before epilog loop if it's not
-	 the guard_bb, which is the case when skip_vector is true.  */
-	  if (guard_bb != bb_before_epilog)
+	  basic_block epilog_ph = loop_preheader_edge (epilog)->src;
+
+	  profile_probability epilog_scale
+	= (epilog_ph->count - guard_e->count ()).probability_in (epilog_ph->count);
+
+	  enter_e->dest->count -= guard_e->count ();
+
+	  /* If we added a vector skip then ENTER_E may not target the epilog
+	 preheader but a block that has the epilog preheader as its single
+	 successor: handle that case.  */
+	  if (enter_e->dest != epilog_ph)
 	{
-	  prob_epilog = prob_vector * prob_epilog + prob_vector.invert ();
+	  gcc_assert (single_succ (enter_e->dest) == epilog_ph);
+	  epilog_ph->count -= guard_e->count ();
+	}
 
-	  scale_bbs_frequencies (&bb_before_epilog, 1, prob_epilog);
+	  if (epilog_e->count () < guard_e->count ())
+	{
+	  if (dump_enabled_p ())
+		dump_printf_loc (MSG_NOTE, vect_location,
+ "epilog iv exit count < guard count, "
+ "profile will become inco

[PATCH 4/4] vect: Fix scale_profile_for_vect_loop for multiple exits [PR117790]

2025-01-06 Thread Alex Coplan
This adjusts scale_profile_for_vect_loop to DTRT for loops with multiple exits,
namely using scale_loop_profile_hold_exit_counts instead and scaling the
expected niters by 1 / VF.

Tested as a series on aarch64-linux-gnu, arm-linux-gnueabihf, and
x86_64-linux-gnu.  OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

PR tree-optimization/117790
* tree-vect-loop.cc (scale_profile_for_vect_loop): Use
scale_loop_profile_hold_exit_counts instead of scale_loop_profile.  Drop
the exit edge parameter, since the code now handles multiple exits.
Adjust the caller ...
(vect_transform_loop): ... here.

gcc/testsuite/ChangeLog:

PR tree-optimization/117790
* gcc.dg/vect/vect-early-break-profile-2.c: New test.
---
 .../gcc.dg/vect/vect-early-break-profile-2.c  | 21 +++
 gcc/tree-vect-loop.cc | 21 ++-
 2 files changed, 27 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break-profile-2.c

diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break-profile-2.c b/gcc/testsuite/gcc.dg/vect/vect-early-break-profile-2.c
new file mode 100644
index 000..03c67802b74
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break-profile-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-additional-options "-fdump-tree-vect-blocks-details" } */
+int DECPOWERS[11];
+int multies[] = {5, 3, 1049, 0};
+short decNumberSquareRoot_accnext;
+int decNumberSquareRoot_accunits;
+void decGetDigits(short *, int);
+void decNumberSquareRoot() {
+  int exponent, drop = 0;
+  for (;; drop++) {
+if (exponent >= 0)
+  break;
+if (decNumberSquareRoot_accnext * multies[drop] >> 7 * DECPOWERS[drop])
+  break;
+exponent++;
+  }
+  if (drop)
+decGetDigits(&decNumberSquareRoot_accnext, decNumberSquareRoot_accunits);
+}
+/* { dg-final { scan-tree-dump-not "Invalid sum" "vect" } } */
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 2b9d5956635..ca65c0058db 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -12022,7 +12022,7 @@ vect_gen_loop_len_mask (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi,
profile.  */
 
 static void
-scale_profile_for_vect_loop (class loop *loop, edge exit_e, unsigned vf, bool flat)
+scale_profile_for_vect_loop (class loop *loop, unsigned vf, bool flat)
 {
   /* For flat profiles do not scale down proportionally by VF and only
  cap by known iteration count bounds.  */
@@ -12053,18 +12053,10 @@ scale_profile_for_vect_loop (class loop *loop, edge exit_e, unsigned vf, bool fl
   vf /= 2;
 }
 
-  if (entry_count.nonzero_p ())
-set_edge_probability_and_rescale_others
-	(exit_e,
-	 entry_count.probability_in (loop->header->count / vf));
-  /* Avoid producing very large exit probability when we do not have
- sensible profile.  */
-  else if (exit_e->probability < profile_probability::always () / (vf * 2))
-set_edge_probability_and_rescale_others (exit_e, exit_e->probability * vf);
-  loop->latch->count = single_pred_edge (loop->latch)->count ();
-
-  scale_loop_profile (loop, profile_probability::always () / vf,
-		  get_likely_max_loop_iterations_int (loop));
+  const auto likely_max_niters = get_likely_max_loop_iterations_int (loop);
+  scale_loop_profile_hold_exit_counts (loop,
+   profile_probability::always () / vf,
+   likely_max_niters);
 }
 
 /* For a vectorized stmt DEF_STMT_INFO adjust all vectorized PHI
@@ -12874,8 +12866,7 @@ vect_transform_loop (loop_vec_info loop_vinfo, gimple *loop_vectorized_call)
 			  assumed_vf) - 1
 	 : wi::udiv_floor (loop->nb_iterations_estimate + bias_for_assumed,
 			   assumed_vf) - 1);
-  scale_profile_for_vect_loop (loop, LOOP_VINFO_IV_EXIT (loop_vinfo),
-			   assumed_vf, flat);
+  scale_profile_for_vect_loop (loop, assumed_vf, flat);
 
   if (dump_enabled_p ())
 {


Re: [PATCH] c: do not warn about truncating NUL char when initializing nonstring arrays [PR117178]

2025-01-06 Thread Marek Polacek
On Sun, Dec 15, 2024 at 08:02:57PM -0800, Kees Cook wrote:
> When initializing a nonstring char array when compiled with
> -Wunterminated-string-initialization the warning trips even when
> truncating the trailing NUL character from the string constant. Only
> warn about this when running under -Wc++-compat since under C++ we should
> not initialize nonstrings from C strings.

Thanks for the patch.

> 
>   PR c/117178
> 
> gcc/c/ChangeLog:
> 
> * c-typeck.cc (digest_init): Check for nonstring attribute and
>   avoid warning when only the trailing NUL is truncated.
> (build_c_cast): Update function call prototype.
> (store_init_value): Ditto.
> (output_init_element): Ditto.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.dg/Wunterminated-string-initialization.c: Update for
>   new warning text and check for nonstring cases.
> * gcc.dg/Wunterminated-string-initialization-c++.c: Duplicate
>   C test for -Wc++-compat.
> ---
>  gcc/c/c-typeck.cc | 73 ---
>  .../Wunterminated-string-initialization-c++.c | 28 +++
>  .../Wunterminated-string-initialization.c | 26 ++-
>  3 files changed, 98 insertions(+), 29 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.dg/Wunterminated-string-initialization-c++.c
> 
> diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
> index 10b02da8752d..5cd0c07b87b1 100644
> --- a/gcc/c/c-typeck.cc
> +++ b/gcc/c/c-typeck.cc
> @@ -116,8 +116,8 @@ static void push_member_name (tree);
>  static int spelling_length (void);
>  static char *print_spelling (char *);
>  static void warning_init (location_t, int, const char *);
> -static tree digest_init (location_t, tree, tree, tree, bool, bool, bool, 
> bool,
> -  bool, bool);
> +static tree digest_init (location_t, tree, tree, tree, tree, bool, bool, 
> bool,
> +  bool, bool, bool);
>  static void output_init_element (location_t, tree, tree, bool, tree, tree, 
> bool,
>bool, struct obstack *);
>  static void output_pending_init_elements (int, struct obstack *);
> @@ -6731,7 +6731,7 @@ build_c_cast (location_t loc, tree type, tree expr)
> t = build_constructor_single (type, field, t);
> if (!maybe_const)
>   t = c_wrap_maybe_const (t, true);
> -   t = digest_init (loc, type, t,
> +   t = digest_init (loc, field, type, t,
>  NULL_TREE, false, false, false, true, false, false);
> TREE_CONSTANT (t) = TREE_CONSTANT (value);
> return t;
> @@ -8646,8 +8646,8 @@ store_init_value (location_t init_loc, tree decl, tree 
> init, tree origtype)
>  }
>bool constexpr_p = (VAR_P (decl)
> && C_DECL_DECLARED_CONSTEXPR (decl));
> -  value = digest_init (init_loc, type, init, origtype, npc, int_const_expr,
> -arith_const_expr, true,
> +  value = digest_init (init_loc, decl, type, init, origtype, npc,
> +int_const_expr, arith_const_expr, true,
>  TREE_STATIC (decl) || constexpr_p, constexpr_p);
>  
>/* Store the expression if valid; else report error.  */
> @@ -8996,8 +8996,8 @@ check_constexpr_init (location_t loc, tree type, tree 
> init,
> on initializers for 'constexpr' objects apply.  */
>  
>  static tree
> -digest_init (location_t init_loc, tree type, tree init, tree origtype,
> -  bool null_pointer_constant, bool int_const_expr,
> +digest_init (location_t init_loc, tree field, tree type, tree init,
> +  tree origtype, bool null_pointer_constant, bool int_const_expr,
>bool arith_const_expr, bool strict_string,
>bool require_constant, bool require_constexpr)

Please document the new FIELD parameter.

>  {
> @@ -9132,27 +9132,46 @@ digest_init (location_t init_loc, tree type, tree 
> init, tree origtype,
> && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST)
>   {
> unsigned HOST_WIDE_INT len = TREE_STRING_LENGTH (inside_init);
> -   unsigned unit = TYPE_PRECISION (typ1) / BITS_PER_UNIT;
> -
> -   /* Subtract the size of a single (possibly wide) character
> -  because it's ok to ignore the terminating null char
> -  that is counted in the length of the constant.  */
> -   if (compare_tree_int (TYPE_SIZE_UNIT (type), len - unit) < 0)
> - pedwarn_init (init_loc, 0,
> -   ("initializer-string for array of %qT "
> -"is too long"), typ1);
> -   else if (warn_unterminated_string_initialization
> -&& compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
> - warning_at (init_loc, OPT_Wunterminated_string_initialization,
> - ("initializer-string for array of %qT "
> -  "is too long"), typ1);
> +
> if (compare_tree_int (TYPE_SIZE_UNIT (type), len) <

Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]

2025-01-06 Thread Linus Torvalds
On Mon, 6 Jan 2025 at 16:13, Jeff Law  wrote:
>
> But in the case of concurrent accesses, shouldn't these objects be
> declared as atomic?

No.

They aren't concurrent accesses to the same variable.

They are concurrent accesses to *different* memory locations, and the
compiler is not allowed to mess them up.

IOW, if you have

struct myvar {
pthread_mutex_t buffer_lock;
pthread_mutex_t another_var_lock;
char buffer[7];
char another_var;
};

and "buffer_lock" serializes accesses to "buffer[]", and
"another_var_lock" serializes accesses to "another_var", then the
compiler IS NOT ALLOWED TO TOUCH "another_var" when the code touches
"buffer[]".

So if a compiler turns "memset(var->buffer, 0, 7)" into "load 8 bytes,
clear 7 of the bytes, store 8 bytes", then the compiler is buggy.

Because that messes up another thread that accesses "another_var", and
the 8-byte write may write back an old value that is no longer valid.

There is absolutely no gray area here. It was always buggy, and the
alpha architecture was always completely and fundamentally
mis-designed.

C11 made it explicitly clear:

  "Different threads of execution are always allowed to access (read
and modify) different memory locations concurrently, with no
interference and no synchronization requirements"

but honestly, that was just codifying something that should have been
blindingly obvious even before.

Linus


Re: [PATCH] COBOL 1/8 hdr: header files

2025-01-06 Thread Joseph Myers
On Sat, 4 Jan 2025, James K. Lowden wrote:

> On Fri, 3 Jan 2025 19:46:38 +0100
> Jakub Jelinek  wrote:
> 
> > Again, the question is if it needs to be supported everywhere, or
> > just error out on targets which don't have _Float128
> 
> Our preference is simply to error out on targets that don't support
> _Float128.  

"error out" when?

GCC should build --enable-languages=all for all targets.  But some 
languages or libraries might get disabled if not supported for the target.  
So it shouldn't literally be failing the build of GCC, though it might be 
disabling COBOL or the COBOL runtime libraries.

Furthermore, in general whether _Float128 is supported depends on the 
multilib, which strongly indicates that this should affect the configure 
and build logic for the runtime libraries rather than the build of the 
compiler.  Specifically, look at libquadmath/configure.ac: the configure 
script is still run when __float128 isn't supported, but BUILD_LIBQUADMATH 
is disabled so that no library actually gets built when the type isn't 
supported.

I don't know how well the powerpc64*-linux-gnu --enable-targets=all 
configurations actually work in practice, but in principle they have both 
BE and LE multilibs (LE supports _Float128, BE doesn't).

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH v2 1/7] Alpha: Always respect -mbwx, -mcix, -mfix, -mmax, and their inverse

2025-01-06 Thread Jeff Law




On 1/6/25 6:03 AM, Maciej W. Rozycki wrote:

Contrary to user documentation the `-mbwx', `-mcix', `-mfix', `-mmax'
feature options and their inverse forms are ignored whenever `-mcpu='
option is in effect, either by having been given explicitly or where
configured as the default such as with the `alphaev56-linux-gnu' target.
In the latter case there is no way to change the settings these options
are supposed to tweak other than with `-mcpu=' and the settings cannot
be individually controlled, making all the feature options permanently
inactive.

It seems a regression from commit 7816bea0e23b ("config.gcc: Reorganize
--with-cpu logic.") back in 2003, which replaced the setting of the
default feature mask with the setting of the default CPU across a few
targets, and the complementing logic in the Alpha backend wasn't updated
accordingly.

Fix this by making the individual feature options take precedence over
`-mcpu='.  Add test cases to verify this is the case, and to cover the
defaults as well for the boundary cases.

This has a drawback where the order of the options is ignored between
`-mcpu=' and these individual options, so e.g. `-mno-bwx -mcpu=ev6' will
keep the BWX feature disabled even though `-mcpu=ev6' comes later in the
command line.  This may affect some scenarios involving user overrides
such as with CFLAGS passed to `configure' and `make' invocations.  I do
believe it has been our practice anyway for more finegrained options to
override group options regardless of their relative order on the command
line and in any case using `-mcpu=ev6 -mbwx' as the override will do the
right thing if required, canceling any previous `-mno-bwx'.

This has been spotted with `alphaev56-linux-gnu' target verification and
a recently added test case:

FAIL: gcc.target/alpha/stwx0.c   -O1   scan-assembler-times \\sldq_u\\s 2
FAIL: gcc.target/alpha/stwx0.c   -O1   scan-assembler-times \\smskwh\\s 1
FAIL: gcc.target/alpha/stwx0.c   -O1   scan-assembler-times \\smskwl\\s 1
FAIL: gcc.target/alpha/stwx0.c   -O1   scan-assembler-times \\sstq_u\\s 2

(and similarly for the remaining optimization levels covered) which this
fix has addressed.

gcc/
* config/alpha/alpha.cc (alpha_option_override): Ignore CPU
flags corresponding to features the enabling or disabling of
which has been requested with an individual feature option.

gcc/testsuite/
* gcc.target/alpha/target-bwx-1.c: New file.
* gcc.target/alpha/target-bwx-2.c: New file.
* gcc.target/alpha/target-bwx-3.c: New file.
* gcc.target/alpha/target-bwx-4.c: New file.
* gcc.target/alpha/target-cix-1.c: New file.
* gcc.target/alpha/target-cix-2.c: New file.
* gcc.target/alpha/target-cix-3.c: New file.
* gcc.target/alpha/target-cix-4.c: New file.
* gcc.target/alpha/target-fix-1.c: New file.
* gcc.target/alpha/target-fix-2.c: New file.
* gcc.target/alpha/target-fix-3.c: New file.
* gcc.target/alpha/target-fix-4.c: New file.
* gcc.target/alpha/target-max-1.c: New file.
* gcc.target/alpha/target-max-2.c: New file.
* gcc.target/alpha/target-max-3.c: New file.
* gcc.target/alpha/target-max-4.c: New file.

OK
jeff



Re: [PATCH v2 4/7] Alpha: Export `emit_unlikely_jump' for a subsequent change to use

2025-01-06 Thread Jeff Law




On 1/6/25 6:03 AM, Maciej W. Rozycki wrote:

Rename `emit_unlikely_jump' function to `alpha_emit_unlikely_jump', so
as to avoid namespace pollution, updating callers accordingly and export
it for use in the machine description.  Make it return the insn emitted.

gcc/
* config/alpha/alpha-protos.h (alpha_emit_unlikely_jump): New
prototype.
* config/alpha/alpha.cc (emit_unlikely_jump): Rename to...
(alpha_emit_unlikely_jump): ... this.  Return the insn emitted.
(alpha_split_atomic_op, alpha_split_compare_and_swap)
(alpha_split_compare_and_swap_12, alpha_split_atomic_exchange)
(alpha_split_atomic_exchange_12): Update call sites accordingly.

OK
jeff



Re: [PATCH] ifcombine field-merge: improve handling of dwords

2025-01-06 Thread Alexandre Oliva
On Dec 21, 2024, Alexandre Oliva  wrote:

> On Dec 20, 2024, Jakub Jelinek  wrote:
>> On Wed, Dec 18, 2024 at 12:59:11AM -0300, Alexandre Oliva wrote:
>>> * gcc.dg/field-merge-16.c: New.

>> Note the test FAILs on i686-linux or on x86_64-linux with -m32.

> Indeed, thanks.  Here's a fix.

Ping?
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/672161.html

> for  gcc/ChangeLog

>   * gimple-fold.cc (fold_truth_andor_for_ifcombine): Limit
>   boundary choice by word size as well.  Try aligned double-word
>   loads as a last resort.

> for  gcc/testsuite/ChangeLog

>   * gcc.dg/field-merge-17.c: New.


-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive


Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]

2025-01-06 Thread Jeff Law




On 1/6/25 6:03 AM, Maciej W. Rozycki wrote:

With non-BWX Alpha implementations we have a problem of data races where
a 8-bit byte or 16-bit word quantity is to be written to memory in that
in those cases we use an unprotected RMW access of a 32-bit longword or
64-bit quadword width.  If contents of the longword or quadword accessed
outside the byte or word to be written are changed midway through by a
concurrent write executing on the same CPU such as by a signal handler
or a parallel write executing on another CPU such as by another thread
or via a shared memory segment, then the concluding write of the RMW
access will clobber them.  This is especially important for the safety
of RCU algorithms, but is otherwise an issue anyway.

To guard against these data races with byte and aligned word quantities
introduce the `-msafe-bwa' command-line option (standing for Safe Byte &
Word Access) that instructs the compiler to instead use an atomic RMW
access sequence where byte and word memory access machine instructions
are not available.  There is no change to code produced for BWX targets.

It would be sufficient for the secondary reload handle to use a pair of
scratch registers, as requested by `reload_out', but it would end
with poor code produced as one of the scratches would be occupied by
data retrieved and the other one would have to be reloaded with repeated
calculations, all within the LL/SC sequence.

Therefore I chose to add a dedicated `reload_out_safe_bwa' handler
and ask for more scratches there by defining a 256-bit OI integer mode.
While reload is documented in our manual to support an arbitrary number
of scratches in reality it hasn't been implemented for IRA:

/* ??? It would be useful to be able to handle only two, or more than
three, operands, but for now we can only handle the case of having
exactly three: output, input and one temp/scratch.  */

and it seems to be the case for LRA as well.  Do what everyone else does
then and just have one wide multi-register scratch.

I note that the atomic sequences emitted are suboptimal performance-wise
as the looping branch for the unsuccessful completion of the sequence
points backwards, which means it will be predicted as taken despite that
in most cases it will fall through.  I do not see it as a deficiency of
this change proposed as it takes care of recording that the branch is
unlikely to be taken, by calling `alpha_emit_unlikely_jump'.  Therefore
generic code elsewhere shou
Looks like this got truncated.  Anyway, easy to forget how limited 
branch prediction was in this era.  I haven't pondered static branch 
prediction in forever.  I wouldn't worry too much about this case -- we 
can always come back to it if generic doesn't do the right thing with 
code layout.




Add test cases accordingly.

There are notable regressions between a plain `-mno-bwx' configuration
and a `-mno-bwx -msafe-bwa' one:

FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c   -O0  execution test
FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c   -O1  execution test
FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c   -O2  execution test
FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c   -O3 -g  execution test
FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c   -Os  execution test
FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  execution test
FAIL: gcc.dg/torture/inline-mem-cpy-cmp-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  execution test
FAIL: g++.dg/init/array25.C  -std=c++17 execution test
FAIL: g++.dg/init/array25.C  -std=c++98 execution test
FAIL: g++.dg/init/array25.C  -std=c++26 execution test

They come from the fact that these test cases play tricks with alignment
and end up calling code that expects a reference to aligned data but is
handed one to unaligned data.

This doesn't cause a visible problem with plain `-mno-bwx' code, because
the resulting alignment exception is fixed up by Linux.  There's no such
handling currently implemented for LDL_L or LDQ_L instructions (which
are first in the sequence) and consequently the offender is issued with
SIGBUS instead.  Suitable handling will be added to Linux to complement
this change, so these regressions are seen as harmless and expected.

gcc/
PR target/117759
* config/alpha/alpha-modes.def (OI): New integer mode.
* config/alpha/alpha-protos.h (alpha_expand_mov_safe_bwa): New
prototype.
* config/alpha/alpha.cc (alpha_expand_mov_safe_bwa): New
function.
(alpha_secondary_reload): Handle TARGET_SAFE_BWA.
* config/alpha/alpha.md (aligned_store_safe_bwa)
(unaligned_store_safe_bwa, reload_out_safe_bwa)
(reload_out_unaligned_safe_bwa): New expanders.
(mov, movcqi, reload_out_aligned): Handle
TARGET_SAFE_BWA.
(reload_out): Guard against TARGET_SAFE_BWA.
* config/alpha/alpha.opt (msafe-bwa): New option.
* config/alpha/alpha.opt.urls: Regenerate.
   

[PATCH v1] LoongArch: Opitmize the cost of vec_construct.

2025-01-06 Thread chenxiaolong
  When analyzing 525 on LoongArch architecture, it was found that the
for loop of hotspot function x264_pixel_satd_8x4 could not be quantized
256-bit due to the cost of vec_construct setting. After re-adjusting
vec_construct, the performance of 525 program was improved by 16.57%.
It was found that this function can be vectorized on the aarch64 and
x86 architectures, see [PR98138].

Co-Authored-By: Deng Jianbo .

gcc/ChangeLog:

* config/loongarch/loongarch.cc
(loongarch_builtin_vectorization_cost): Modify the
construction cost of the vec_construct vector.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vect-slp-two-operator.c: New test.
---
 gcc/config/loongarch/loongarch.cc |  6 ++--
 .../loongarch/vect-slp-two-operator.c | 36 +++
 2 files changed, 39 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-slp-two-operator.c

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 89237c377e7..ff27b96c31e 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -4127,10 +4127,10 @@ loongarch_builtin_vectorization_cost (enum 
vect_cost_for_stmt type_of_cost,
 
   case vec_construct:
elements = TYPE_VECTOR_SUBPARTS (vectype);
-   if (ISA_HAS_LASX)
- return elements + 1;
+   if (LASX_SUPPORTED_MODE_P (mode) && !LSX_SUPPORTED_MODE_P (mode))
+ return elements / 2 + 3;
else
- return elements;
+ return elements / 2 + 1;
 
   default:
gcc_unreachable ();
diff --git a/gcc/testsuite/gcc.target/loongarch/vect-slp-two-operator.c 
b/gcc/testsuite/gcc.target/loongarch/vect-slp-two-operator.c
new file mode 100644
index 000..f27492e10f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vect-slp-two-operator.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlasx -ftree-vectorize -fdump-tree-vect 
-fdump-tree-vect-details" } */
+
+typedef unsigned char uint8_t;
+typedef unsigned int uint32_t;
+
+#define HADAMARD4(d0, d1, d2, d3, s0, s1, s2, s3) {\
+int t0 = s0 + s1;\
+int t1 = s0 - s1;\
+int t2 = s2 + s3;\
+int t3 = s2 - s3;\
+d0 = t0 + t2;\
+d1 = t1 + t3;\
+d2 = t0 - t2;\
+d3 = t1 - t3;\
+}
+
+void sink(uint32_t tmp[4][4]);
+
+int x264_pixel_satd_8x4( uint8_t *pix1, int i_pix1, uint8_t *pix2, int i_pix2 )
+{
+uint32_t tmp[4][4];
+int sum = 0;
+for( int i = 0; i < 4; i++, pix1 += i_pix1, pix2 += i_pix2 )
+{
+uint32_t a0 = (pix1[0] - pix2[0]) + ((pix1[4] - pix2[4]) << 16);
+uint32_t a1 = (pix1[1] - pix2[1]) + ((pix1[5] - pix2[5]) << 16);
+uint32_t a2 = (pix1[2] - pix2[2]) + ((pix1[6] - pix2[6]) << 16);
+uint32_t a3 = (pix1[3] - pix2[3]) + ((pix1[7] - pix2[7]) << 16);
+HADAMARD4( tmp[i][0], tmp[i][1], tmp[i][2], tmp[i][3], a0,a1,a2,a3 );
+}
+sink(tmp);
+}
+
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-- 
2.20.1



Re: [PATCH] arm: [MVE intrinsics] Another fix for moves of tuples (PR target/118131)

2025-01-06 Thread Christophe Lyon
ping?

On Fri, 20 Dec 2024 at 23:53, Christophe Lyon
 wrote:
>
> Commit r15-6389-g670df03e5294a3 only partially fixed support for moves
> of large modes: despite the introduction of V2x* and V4x* modes in
> r15-6245-g4f4e13dd235b to support MVE tuples, we still need to support
> TI, OI and XI modes, which appear for instance in gcc.dg/pr100887.c.
>
> The problem was noticed when running the testsuite with
> -mthumb/-march=armv8.1-m.main+mve.fp+fp.dp/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto
> where several tests would ICE in output_move_neon.
>
> gcc/ChangeLog:
>
> PR target/118131
> * config/arm/arm.h (VALID_MVE_STRUCT_MODE): Accept TI, OI and XI
> modes again.
> ---
>  gcc/config/arm/arm.h | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index b2044db938b..4246b34b6e0 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -1128,7 +1128,10 @@ extern const int arm_arch_cde_coproc_bits[];
> || (MODE) == CImode || (MODE) == XImode)
>
>  #define VALID_MVE_STRUCT_MODE(MODE)\
> -  ((MODE) == V2x16QImode   \
> +  ((MODE) == TImode\
> +   || (MODE) == OImode \
> +   || (MODE) == XImode \
> +   || (MODE) == V2x16QImode\
> || (MODE) == V2x8HImode \
> || (MODE) == V2x4SImode \
> || (MODE) == V2x8HFmode \
> --
> 2.34.1
>


Re: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use cache and look back further [PR111422]

2025-01-06 Thread Richard Biener
On Tue, Dec 31, 2024 at 2:04 PM Tamar Christina  wrote:
>
> > -Original Message-
> > From: Richard Biener 
> > Sent: Wednesday, November 20, 2024 11:28 AM
> > To: Andrew Pinski 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use
> > cache and look back further [PR111422]
> >
> > On Sat, Nov 16, 2024 at 5:25 AM Andrew Pinski 
> > wrote:
> > >
> > > After fixing loop-im to do the correct overflow rewriting
> > > for pointer types too. We end up with code like:
> > > ```
> > > _9 = (unsigned long) &g;
> > > _84 = _9 + 18446744073709551615;
> > > _11 = _42 + _84;
> > > _44 = (signed char *) _11;
> > > ...
> > > *_44 = 10;
> > > g ={v} {CLOBBER(eos)};
> > > ...
> > > n[0] = &f;
> > > *_44 = 8;
> > > g ={v} {CLOBBER(eos)};
> > > ```
> > >
> > > Which was not being recongized by the scope conflicts code.
> > > This was because it only handled one level walk backs rather than 
> > > multiple ones.
> > > This fixes the issue by having a cache which records all references to 
> > > addresses
> > > of stack variables.
> > >
> > > Unlike the previous patch, this only records and looks at addresses of 
> > > stack
> > variables.
> > > The cache uses a bitmap and uses the index as the bit to look at.
> > >
> > > PR middle-end/117426
> > > PR middle-end/111422
> > > gcc/ChangeLog:
> > >
> > > * cfgexpand.cc (struct vars_ssa_cache): New class.
> > > (vars_ssa_cache::vars_ssa_cache): New constructor.
> > > (vars_ssa_cache::~vars_ssa_cache): New deconstructor.
> > > (vars_ssa_cache::create): New method.
> > > (vars_ssa_cache::exists): New method.
> > > (vars_ssa_cache::add_one): New method.
> > > (vars_ssa_cache::update): New method.
> > > (vars_ssa_cache::dump): New method.
> > > (add_scope_conflicts_2): Factor mostly out to
> > > vars_ssa_cache::operator(). New cache argument.
> > > Walk the bitmap cache for the stack variables addresses.
> > > (vars_ssa_cache::operator()): New method factored out from
> > > add_scope_conflicts_2. Rewrite to be a full walk of all operands
> > > and use a worklist.
> > > (add_scope_conflicts_1): Add cache new argument for the addr 
> > > cache.
> > > Just call add_scope_conflicts_2 for the phi result instead of 
> > > calling
> > > for the uses and don't call walk_stmt_load_store_addr_ops for 
> > > phis.
> > > Update call to add_scope_conflicts_2 to add cache argument.
> > > (add_scope_conflicts): Add cache argument and update calls to
> > > add_scope_conflicts_1.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.dg/torture/pr117426-1.c: New test.
> > >
> > > Signed-off-by: Andrew Pinski 
> > > ---
> > >  gcc/cfgexpand.cc  | 292 +++---
> > >  gcc/testsuite/gcc.dg/torture/pr117426-1.c |  53 
> > >  2 files changed, 308 insertions(+), 37 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/torture/pr117426-1.c
> > >
> > > diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
> > > index b88e8827667..841d3c1254e 100644
> > > --- a/gcc/cfgexpand.cc
> > > +++ b/gcc/cfgexpand.cc
> > > @@ -585,35 +585,243 @@ visit_conflict (gimple *, tree op, tree, void 
> > > *data)
> > >return false;
> > >  }
> > >
> > > -/* Helper function for add_scope_conflicts_1.  For USE on
> > > -   a stmt, if it is a SSA_NAME and in its SSA_NAME_DEF_STMT is known to 
> > > be
> > > -   based on some ADDR_EXPR, invoke VISIT on that ADDR_EXPR.  */
> > > +/* A cache for ssa name to address of stack variables.
> > > +   When taking into account if a ssa name refers to an
> > > +   address of a stack variable, we need to walk the
> > > +   expressions backwards to find the addresses. This
> > > +   cache is there so we don't need to walk the expressions
> > > +   all the time.  */
> > > +struct vars_ssa_cache
> > > +{
> > > +private:
> > > +  /* Currently an entry is a bitmap of all of the known stack variables
> > > + addresses that are referenced by the ssa name.
> > > + When the bitmap is the nullptr, then there is no cache.
> > > + Currently only empty bitmaps are shared.
> > > + The reason for why empty cache is not just a null is so we know the
> > > + cache for an entry is filled in.  */
> > > +  struct entry
> > > +  {
> > > +bitmap bmap = nullptr;
> > > +  };
> > > +  entry *vars_ssa_caches;
> > > +public:
> > >
> > > -static inline void
> > > -add_scope_conflicts_2 (tree use, bitmap work,
> > > -  walk_stmt_load_store_addr_fn visit)
> > > +  vars_ssa_cache();
> > > +  ~vars_ssa_cache();
> > > +  const_bitmap operator() (tree name);
> > > +  void dump (FILE *file);
> > > +
> > > +private:
> > > +  /* Can't copy. */
> > > +  vars_ssa_cache(const vars_ssa_cache&) = delete;
> > > +  vars_ssa_cache(vars_ssa_cache&&) = delete;
> > > +
> > > +  /* The shared empty bitmap.  */

Re: [PATCH] or1k: add .note.GNU-stack section on linux

2025-01-06 Thread Jeff Law




On 1/6/25 10:02 AM, Stafford Horne wrote:

On Mon, Jan 06, 2025 at 07:37:56AM -0700, Jeff Law wrote:



On 1/6/25 6:01 AM, Stafford Horne wrote:

In the OpenRISC build we get the following warning:

  ld: warning: __modsi3_s.o: missing .note.GNU-stack section implies 
executable stack
  ld: NOTE: This behaviour is deprecated and will be removed in a future 
version of the linker

Fix this by adding a .note.GNU-stack to indicate the stack does not need to be
executable for the lib1funcs.

Note, this is also needed for the upcoming glibc 2.41.

libgcc/
* config/or1k/lib1funcs.S: Add .note.GNU-stack section on linux.

OK for the trunk.  You've got write privs, correct?


Thanks,

Yes I should still have access.  Also, I would like to add to the release-14
branch.  Is there anything special I need to do to add the patch there?

Nothing really special on the release branches.

jeff


Re: [Ping, Fortran, Patch, PR116669, v1] Fix ICE in deallocation of derived types having cyclic dependencies

2025-01-06 Thread Jerry D

On 1/6/25 2:06 AM, Andre Vehreschild wrote:

Hi all,

pinging attached rebased patch.

Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline?

- Andre

On Thu, 12 Dec 2024 14:50:13 +0100
Andre Vehreschild  wrote:


Hi all,

attached patch improves analysis of cycles in derived types, i.e. type
dependencies ala:

type(T)
   type(T2), allocatable :: c
end type

type(T2)
   type(T), allocatable :: t
end type

are now detected and deallocating an object that is of any of the types
now no longer crashes the compiler because of an endless recursion. To
accomplish this, I stored the symbols of the types seen in a C++ set
and checked if a component's type is already present in there. When a
type has such an indirect self-reference, it gets marked by setting its
symbol_attribute::recursive flag. Later steps then can make use of it.

Furthermore are _deallocate members of the vtab populated when a type
has the recursive and the alloc_comp flag set.

Bootstraps and regtests ok on x86_64-pc-linux-gnu / F41. Ok for
mainline?


This one is good to go.

Jerry



Note: The patch was developed on top of my coarray patch, but should
apply with delta on a regular trunk w/o issues.

Regards,
Andre
--
Andre Vehreschild * Email: vehre ad gcc dot gnu dot org



--
Andre Vehreschild * Email: vehre ad gmx dot de




Re: [-Ping-, Fortran, Patch, PR116669, v3] Fix ICE in deallocation of derived types having cyclic dependencies

2025-01-06 Thread Jerry D

On 1/6/25 6:21 AM, Andre Vehreschild wrote:

Hi all,

during looking for something completely different, I figured, that gcc does not
use std::set internally, but its implementation of hash_set. I therefore
adapted the patch to use it. Nothing more changed.

Still regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline?



... and this tweak is OK. Proceed.

Jerry


Regards,
Andre


On Mon, 6 Jan 2025 11:06:46 +0100
Andre Vehreschild  wrote:


Hi all,

pinging attached rebased patch.

Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline?

- Andre

On Thu, 12 Dec 2024 14:50:13 +0100
Andre Vehreschild  wrote:


Hi all,

attached patch improves analysis of cycles in derived types, i.e. type
dependencies ala:

type(T)
   type(T2), allocatable :: c
end type

type(T2)
   type(T), allocatable :: t
end type

are now detected and deallocating an object that is of any of the types
now no longer crashes the compiler because of an endless recursion. To
accomplish this, I stored the symbols of the types seen in a C++ set
and checked if a component's type is already present in there. When a
type has such an indirect self-reference, it gets marked by setting its
symbol_attribute::recursive flag. Later steps then can make use of it.

Furthermore are _deallocate members of the vtab populated when a type
has the recursive and the alloc_comp flag set.

Bootstraps and regtests ok on x86_64-pc-linux-gnu / F41. Ok for
mainline?

Note: The patch was developed on top of my coarray patch, but should
apply with delta on a regular trunk w/o issues.

Regards,
Andre
--
Andre Vehreschild * Email: vehre ad gcc dot gnu dot org



--
Andre Vehreschild * Email: vehre ad gmx dot de



--
Andre Vehreschild * Email: vehre ad gmx dot de




Re: [Ping, Fortran, Patch, PR114612, v1] Fix missing deep-copy for allocatable components of derived types having cycles.

2025-01-06 Thread Jerry D

On 1/6/25 2:08 AM, Andre Vehreschild wrote:

Hi all,

attached patch has been rebased to latest trunk. Just pinging!

Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline?

- Andre

On Fri, 13 Dec 2024 12:10:58 +0100
Andre Vehreschild  wrote:


Hi all,

attached patch fixes deep-copying (or rather its former absence) for
allocatable components of derived types having cyclic dependencies.

Regtested ok on x86_64-pc-linux-gnu / F41. Ok for mainline?



Yes, OK.

Thanks.

Jerry


Regards,
Andre
--
Andre Vehreschild * Email: vehre ad gmx dot de



--
Andre Vehreschild * Email: vehre ad gmx dot de




[PATCH v3] aarch64: remove extra XTN in vector concatenation

2025-01-06 Thread Akram Ahmad
Hi Richard,

Thanks for the feedback. I've copied in the resulting patch here- if
this is okay, please could it be committed on my behalf? The patch
continues below.

Many thanks,

Akram

---

GIMPLE code which performs a narrowing truncation on the result of a
vector concatenation currently results in an unnecessary XTN being
emitted following a UZP1 to concate the operands. In cases such as this,
UZP1 should instead use a smaller arrangement specifier to replace the
XTN instruction. This is seen in cases such as in this GIMPLE example:

int32x2_t foo (svint64_t a, svint64_t b)
{
  vector(2) int vect__2.8;
  long int _1;
  long int _3;
  vector(2) long int _12;

   [local count: 1073741824]:
  _1 = svaddv_s64 ({ -1, 0, 0, 0, 0, 0, 0, 0, ... }, a_6(D));
  _3 = svaddv_s64 ({ -1, 0, 0, 0, 0, 0, 0, 0, ... }, b_7(D));
  _12 = {_1, _3};
  vect__2.8_13 = (vector(2) int) _12;
  return vect__2.8_13;

}

Original assembly generated:

bar:
ptrue   p3.b, all
uaddv   d0, p3, z0.d
uaddv   d1, p3, z1.d
uzp1v0.2d, v0.2d, v1.2d
xtn v0.2s, v0.2d
ret

This patch therefore defines the *aarch64_trunc_concat insn which
truncates the concatenation result, rather than concatenating the
truncated operands (such as in *aarch64_narrow_trunc), resulting
in the following optimised assembly being emitted:

bar:
ptrue   p3.b, all
uaddv   d0, p3, z0.d
uaddv   d1, p3, z1.d
uzp1v0.2s, v0.2s, v1.2s
ret

This patch passes all regression tests on aarch64 with no new failures.
A supporting test for this optimisation is also written and passes.

OK for master? I do not have commit rights so I cannot push the patch
myself.

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md: (*aarch64_trunc_concat)
  new insn definition.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/truncated_concatenation_1.c: new test
  for the above example and other modes covered by insn
  definitions.
---
 gcc/config/aarch64/aarch64-simd.md| 16 ++
 .../aarch64/sve/truncated_concatenation_1.c   | 32 +++
 2 files changed, 48 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c

diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index cfe95bd4c31..6c129d6c4a8 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1872,6 +1872,22 @@
   [(set_attr "type" "neon_permute")]
 )
 
+(define_insn "*aarch64_trunc_concat"
+  [(set (match_operand: 0 "register_operand" "=w")
+   (truncate:
+ (vec_concat:VQN
+   (match_operand: 1 "register_operand" "w")
+   (match_operand: 2 "register_operand" "w"]
+  "TARGET_SIMD"
+{
+  if (!BYTES_BIG_ENDIAN)
+return "uzp1\\t%0., %1., %2.";
+  else
+return "uzp1\\t%0., %2., %1.";
+}
+  [(set_attr "type" "neon_permute")]
+)
+
 ;; Packing doubles.
 
 (define_expand "vec_pack_trunc_"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c
new file mode 100644
index 000..95577a1a9ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/truncated_concatenation_1.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -Wall -march=armv8.2-a+sve" } */
+
+#include 
+#include 
+
+int8x8_t f1 (int16x4_t a, int16x4_t b) {
+int8x8_t ab = vdup_n_s8 (0);
+int16x8_t ab_concat = vcombine_s16 (a, b);
+ab = vmovn_s16 (ab_concat);
+return ab;
+}
+
+int16x4_t f2 (int32x2_t a, int32x2_t b) {
+int16x4_t ab = vdup_n_s16 (0);
+int32x4_t ab_concat = vcombine_s32 (a, b);
+ab = vmovn_s32 (ab_concat);
+return ab;
+}
+
+int32x2_t f3 (svint64_t a, svint64_t b) {
+int32x2_t ab = vdup_n_s32 (0);
+ab = vset_lane_s32 ((int)svaddv_s64 (svptrue_b64 (), a), ab, 0);
+ab = vset_lane_s32 ((int)svaddv_s64 (svptrue_b64 (), b), ab, 1);
+return ab;
+}
+
+/* { dg-final { scan-assembler-not {\txtn\t} } }*/
+/* { dg-final { scan-assembler-not {\tfcvtn\t} } }*/
+/* { dg-final { scan-assembler-times {\tuzp1\tv[0-9]+\.8b, v[0-9]+\.8b, 
v[0-9]+\.8b} 1 } }*/
+/* { dg-final { scan-assembler-times {\tuzp1\tv[0-9]+\.4h, v[0-9]+\.4h, 
v[0-9]+\.4h} 1 } }*/
+/* { dg-final { scan-assembler-times {\tuzp1\tv[0-9]+\.2s, v[0-9]+\.2s, 
v[0-9]+\.2s} 1 } }*/
\ No newline at end of file
-- 
2.34.1



Re: [PATCH] or1k: add .note.GNU-stack section on linux

2025-01-06 Thread Stafford Horne
On Mon, Jan 06, 2025 at 10:03:50AM -0700, Jeff Law wrote:
> 
> 
> On 1/6/25 10:02 AM, Stafford Horne wrote:
> > On Mon, Jan 06, 2025 at 07:37:56AM -0700, Jeff Law wrote:
> > > 
> > > 
> > > On 1/6/25 6:01 AM, Stafford Horne wrote:
> > > > In the OpenRISC build we get the following warning:
> > > > 
> > > >   ld: warning: __modsi3_s.o: missing .note.GNU-stack section 
> > > > implies executable stack
> > > >   ld: NOTE: This behaviour is deprecated and will be removed in a 
> > > > future version of the linker
> > > > 
> > > > Fix this by adding a .note.GNU-stack to indicate the stack does not 
> > > > need to be
> > > > executable for the lib1funcs.
> > > > 
> > > > Note, this is also needed for the upcoming glibc 2.41.
> > > > 
> > > > libgcc/
> > > > * config/or1k/lib1funcs.S: Add .note.GNU-stack section on linux.
> > > OK for the trunk.  You've got write privs, correct?
> > 
> > Thanks,
> > 
> > Yes I should still have access.  Also, I would like to add to the release-14
> > branch.  Is there anything special I need to do to add the patch there?
> Nothing really special on the release branches.

Thanks, I have pushed the patch to trunk and releases/gcc-14.

-Stafford



RE: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use cache and look back further [PR111422]

2025-01-06 Thread Tamar Christina
> -Original Message-
> From: Tamar Christina
> Sent: Tuesday, December 31, 2024 1:04 PM
> To: Richard Biener ; Andrew Pinski
> 
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use
> cache and look back further [PR111422]
> 
> > -Original Message-
> > From: Richard Biener 
> > Sent: Wednesday, November 20, 2024 11:28 AM
> > To: Andrew Pinski 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use
> > cache and look back further [PR111422]
> >
> > On Sat, Nov 16, 2024 at 5:25 AM Andrew Pinski 
> > wrote:
> > >
> > > After fixing loop-im to do the correct overflow rewriting
> > > for pointer types too. We end up with code like:
> > > ```
> > > _9 = (unsigned long) &g;
> > > _84 = _9 + 18446744073709551615;
> > > _11 = _42 + _84;
> > > _44 = (signed char *) _11;
> > > ...
> > > *_44 = 10;
> > > g ={v} {CLOBBER(eos)};
> > > ...
> > > n[0] = &f;
> > > *_44 = 8;
> > > g ={v} {CLOBBER(eos)};
> > > ```
> > >
> > > Which was not being recongized by the scope conflicts code.
> > > This was because it only handled one level walk backs rather than multiple
> ones.
> > > This fixes the issue by having a cache which records all references to 
> > > addresses
> > > of stack variables.
> > >
> > > Unlike the previous patch, this only records and looks at addresses of 
> > > stack
> > variables.
> > > The cache uses a bitmap and uses the index as the bit to look at.
> > >
> > > PR middle-end/117426
> > > PR middle-end/111422
> > > gcc/ChangeLog:
> > >
> > > * cfgexpand.cc (struct vars_ssa_cache): New class.
> > > (vars_ssa_cache::vars_ssa_cache): New constructor.
> > > (vars_ssa_cache::~vars_ssa_cache): New deconstructor.
> > > (vars_ssa_cache::create): New method.
> > > (vars_ssa_cache::exists): New method.
> > > (vars_ssa_cache::add_one): New method.
> > > (vars_ssa_cache::update): New method.
> > > (vars_ssa_cache::dump): New method.
> > > (add_scope_conflicts_2): Factor mostly out to
> > > vars_ssa_cache::operator(). New cache argument.
> > > Walk the bitmap cache for the stack variables addresses.
> > > (vars_ssa_cache::operator()): New method factored out from
> > > add_scope_conflicts_2. Rewrite to be a full walk of all operands
> > > and use a worklist.
> > > (add_scope_conflicts_1): Add cache new argument for the addr 
> > > cache.
> > > Just call add_scope_conflicts_2 for the phi result instead of 
> > > calling
> > > for the uses and don't call walk_stmt_load_store_addr_ops for 
> > > phis.
> > > Update call to add_scope_conflicts_2 to add cache argument.
> > > (add_scope_conflicts): Add cache argument and update calls to
> > > add_scope_conflicts_1.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.dg/torture/pr117426-1.c: New test.
> > >
> > > Signed-off-by: Andrew Pinski 
> > > ---
> > >  gcc/cfgexpand.cc  | 292 +++---
> > >  gcc/testsuite/gcc.dg/torture/pr117426-1.c |  53 
> > >  2 files changed, 308 insertions(+), 37 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/torture/pr117426-1.c
> > >
> > > diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
> > > index b88e8827667..841d3c1254e 100644
> > > --- a/gcc/cfgexpand.cc
> > > +++ b/gcc/cfgexpand.cc
> > > @@ -585,35 +585,243 @@ visit_conflict (gimple *, tree op, tree, void 
> > > *data)
> > >return false;
> > >  }
> > >
> > > -/* Helper function for add_scope_conflicts_1.  For USE on
> > > -   a stmt, if it is a SSA_NAME and in its SSA_NAME_DEF_STMT is known to 
> > > be
> > > -   based on some ADDR_EXPR, invoke VISIT on that ADDR_EXPR.  */
> > > +/* A cache for ssa name to address of stack variables.
> > > +   When taking into account if a ssa name refers to an
> > > +   address of a stack variable, we need to walk the
> > > +   expressions backwards to find the addresses. This
> > > +   cache is there so we don't need to walk the expressions
> > > +   all the time.  */
> > > +struct vars_ssa_cache
> > > +{
> > > +private:
> > > +  /* Currently an entry is a bitmap of all of the known stack variables
> > > + addresses that are referenced by the ssa name.
> > > + When the bitmap is the nullptr, then there is no cache.
> > > + Currently only empty bitmaps are shared.
> > > + The reason for why empty cache is not just a null is so we know the
> > > + cache for an entry is filled in.  */
> > > +  struct entry
> > > +  {
> > > +bitmap bmap = nullptr;
> > > +  };
> > > +  entry *vars_ssa_caches;
> > > +public:
> > >
> > > -static inline void
> > > -add_scope_conflicts_2 (tree use, bitmap work,
> > > -  walk_stmt_load_store_addr_fn visit)
> > > +  vars_ssa_cache();
> > > +  ~vars_ssa_cache();
> > > +  const_bitmap operator() (tree name);
> > > +  void dum

[COMMITTED 18/30] ada: Fix crash on Depends contract with homonym functions

2025-01-06 Thread Marc Poulhiès
From: Piotr Trojanek 

When resolving names in flow contracts, we refine the ordinary analysis by
knowing that an overloaded name must refer to an abstract state and not a
function. However, when all overloadings refer to function, we shouldn't
crash, but instead let the error to be diagnosed later.

gcc/ada/ChangeLog:

* sem_prag.adb (Resolve_State): Continue ordinary processing.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_prag.adb | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
index 081716a4027..1841a0b79ad 100644
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -33804,11 +33804,7 @@ package body Sem_Prag is
State := Homonym (State);
 end loop;
 
---  A function can never act as a state. If the homonym chain does
---  not contain a corresponding state, then something went wrong in
---  the overloading mechanism.
-
-raise Program_Error;
+--  A function can never act as a state; it will be diagnosed later
  end if;
   end if;
end Resolve_State;
-- 
2.43.0



[COMMITTED 12/30] ada: Ada version used to compile runtime is constant

2025-01-06 Thread Marc Poulhiès
From: Piotr Trojanek 

Code cleanup.

gcc/ada/ChangeLog:

* opt.ads (Ada_Version_Runtime): Now a constant, since it cannot
and should never be modified.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/opt.ads | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/opt.ads b/gcc/ada/opt.ads
index aea52f3ad69..a09a1ccffa6 100644
--- a/gcc/ada/opt.ads
+++ b/gcc/ada/opt.ads
@@ -127,7 +127,7 @@ package Opt is
--  remains set to Ada_Version_Default). This is used in the rare cases
--  (notably pragma Obsolescent) where we want the explicit version set.
 
-   Ada_Version_Runtime : Ada_Version_Type := Ada_With_All_Extensions;
+   Ada_Version_Runtime : constant Ada_Version_Type := Ada_With_All_Extensions;
--  GNAT
--  Ada version used to compile the runtime. Used to set Ada_Version (but
--  not Ada_Version_Explicit) when compiling predefined or internal units.
-- 
2.43.0



[COMMITTED 13/30] ada: Avoid null-exclusion checks for Node_Field_Table

2025-01-06 Thread Marc Poulhiès
From: Piotr Trojanek 

By generating the type of Node_Field_Table with a "not null" qualifier
we check the null exclusion of its elements only once, at the object
declaration.

Tiny performance improvement for the debug builds (because in production
builds checks are disabled anyway); semantics is unaffected.

gcc/ada/ChangeLog:

* gen_il-gen.adb (Put_Tables): Add "not null" to the generated code.
* rtsfind.adb (Cstring_Ptr): Same for table with predefined RE_Id
error messages.
* impunit.adb (Aunit_Record): Same for array of alternative unit names.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gen_il-gen.adb | 2 +-
 gcc/ada/impunit.adb| 2 +-
 gcc/ada/rtsfind.adb| 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/gen_il-gen.adb b/gcc/ada/gen_il-gen.adb
index c200abc0223..bac17ebc356 100644
--- a/gcc/ada/gen_il-gen.adb
+++ b/gcc/ada/gen_il-gen.adb
@@ -2214,7 +2214,7 @@ package body Gen_IL.Gen is
   Field_Enum_Type_Name & "_Index range <>) of " &
   Field_Enum_Type_Name & ";" & LF);
  Put (S, "type " & Field_Enum_Type_Name &
-  "_Array_Ref is access constant " & Field_Enum_Type_Name &
+  "_Array_Ref is not null access constant " & Field_Enum_Type_Name 
&
   "_Array;" & LF);
  Put (S, "subtype A is " & Field_Enum_Type_Name & "_Array;" & LF);
  --  Short name to make allocators below more readable
diff --git a/gcc/ada/impunit.adb b/gcc/ada/impunit.adb
index 47a5459792e..f8771772b52 100644
--- a/gcc/ada/impunit.adb
+++ b/gcc/ada/impunit.adb
@@ -649,7 +649,7 @@ package body Impunit is
 
type Aunit_Record is record
   Fname : String (1 .. 6);
-  Aname : String_Ptr_Const;
+  Aname : not null String_Ptr_Const;
end record;
 
--  Array of alternative unit names
diff --git a/gcc/ada/rtsfind.adb b/gcc/ada/rtsfind.adb
index 6697184fae0..16deb82973c 100644
--- a/gcc/ada/rtsfind.adb
+++ b/gcc/ada/rtsfind.adb
@@ -153,7 +153,7 @@ package body Rtsfind is
 
--packed component size of 43 is not supported
 
-   type CString_Ptr is access constant String;
+   type CString_Ptr is not null access constant String;
 
type PRE_Id_Entry is record
   Str : CString_Ptr;
-- 
2.43.0



[COMMITTED 04/30] ada: Fix incorrect incomplete type error

2025-01-06 Thread Marc Poulhiès
From: Viljar Indus 

In Ada 2005 even if the formal is using a tagged limited type
then the type should not be considered incomplete.

gcc/ada/ChangeLog:

* sem_ch6.adb (Analyze_Subprogram_Body_Helper):
Exchange_Limited_Views also in Ada 2005.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch6.adb | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
index 2f09284ebda..ab12f22ad51 100644
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -4536,8 +4536,10 @@ package body Sem_Ch6 is
   --  may now appear in parameter and result profiles. Since the analysis
   --  of a subprogram body may use the parameter and result profile of the
   --  spec, swap any limited views with their non-limited counterpart.
+  --
+  --  Note that the non-limited view should also be exchanged in Ada 2005.
 
-  if Ada_Version >= Ada_2012 and then Present (Spec_Id) then
+  if Ada_Version >= Ada_2005 and then Present (Spec_Id) then
  Exch_Views := Exchange_Limited_Views (Spec_Id);
   end if;
 
-- 
2.43.0



[COMMITTED 20/30] ada: Fix assertion failure on 'Old in post-condition with -gnat2022

2025-01-06 Thread Marc Poulhiès
From: Eric Botcazou 

It comes from a small oversight in the updated implementation for Ada 2022.

gcc/ada/ChangeLog:

PR ada/117956
* sem_util.adb (Is_Known_On_Entry): Be prepared for constants coming
from a renaming declaration.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index 595d3d10d0f..ffc631a7bee 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -31188,8 +31188,17 @@ package body Sem_Util is
--  removal, e.g. by validity checks.
 
if not Comes_From_Source (Obj) then
-  return
-Is_Known_On_Entry (Expression (Parent (Obj)));
+  declare
+ D : constant Node_Id :=
+   Declaration_Node (Obj);
+  begin
+ if Nkind (D) = N_Object_Renaming_Declaration
+ then
+return Is_Known_On_Entry (Name (D));
+ else
+return Is_Known_On_Entry (Expression (D));
+ end if;
+  end;
end if;
 
--  return False if not "all views are constant".
-- 
2.43.0



[COMMITTED 08/30] ada: Elide copy for calls as default values of nonlimited by-reference components

2025-01-06 Thread Marc Poulhiès
From: Eric Botcazou 

This prevents a temporary from being created on the primary stack to hold
the result of the function calls before it is copied to the object being
elaborated in the nonlimited by-reference case.

That's already not done in the nonlimited non-by-reference case and there is
no reason to do it in the former case either.  The main issue are the calls
to Remove_Side_Effects in Expand_Ctrl_Function_Call (controlled case only)
and in Expand_N_Assignment_Statement, which serve various purposes including
very technical ones beside removing side effects.

The change is therefore very conservative and only removes the copy in the
case of a naked function call for the time being.

gcc/ada/ChangeLog:

* einfo.ads (Returns_By_Ref): Fix description.
* exp_ch3.adb (Build_Record_Init_Proc.Build_Assignment): Do not
adjust the component manually (if need be), set No_Finalize_Actions
instead of No_Ctrl_Actions for this purpose.  Do not adjust when
the expression is a naked function call.
* exp_ch5.adb (Make_Tag_Ctrl_Assignment): Document the quirks of
the function.  Assert that the LHS of the assignment does not have
side effects and replace calls to Duplicate_Subexpr_No_Checks with
calls to New_Copy_Tree.  Rename local variable Asn to New_N.
(Expand_N_Assignment_Statement): In the tagged or controlled record
case, do remove side effects from both operands on entry.  Remove
them in the controlled record case, except if the RHS is a function
call and the assignment has the No_Ctrl_Actions flag set.
* exp_ch6.adb (Expand_Ctrl_Function_Call): Bail out when the parent
node is an assignment statement with the No_Ctrl_Actions flag set.
* sem_util.adb (Statically_Different): Return True for a function
call that does not return its result by reference.
* sinfo.ads (No_Ctrl_Actions): Adjust description and add a note for
the code generator.
(No_Finalize_Actions): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/einfo.ads|   4 +-
 gcc/ada/exp_ch3.adb  |  80 +++-
 gcc/ada/exp_ch5.adb  | 143 ++-
 gcc/ada/exp_ch6.adb  |  21 +--
 gcc/ada/sem_util.adb |  28 +++--
 gcc/ada/sinfo.ads|  22 ---
 6 files changed, 180 insertions(+), 118 deletions(-)

diff --git a/gcc/ada/einfo.ads b/gcc/ada/einfo.ads
index d283358c0c0..1946e68b3c7 100644
--- a/gcc/ada/einfo.ads
+++ b/gcc/ada/einfo.ads
@@ -4278,8 +4278,8 @@ package Einfo is
 
 --Returns_By_Ref
 --   Defined in subprogram type entities and functions. Set if a function
---   (or an access-to-function type) returns a result by reference, either
---   because the result is built in place, or its type is by-reference.
+--   (or a function type) returns a result by reference, either because the
+--   result is built in place or its type is limited in Ada 95.
 
 --Reverse_Bit_Order [base type only]
 --   Defined in all record type entities. Set if entity has a Bit_Order
diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
index 9419d5d2fac..65d8eb7c433 100644
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -2618,11 +2618,10 @@ package body Exp_Ch3 is
  Default_Loc : constant Source_Ptr := Sloc (Default);
  Typ : constant Entity_Id  := Underlying_Type (Etype (Id));
 
- Adj_Call : Node_Id;
- Exp  : Node_Id;
- Exp_Q: Node_Id;
- Lhs  : Node_Id;
- Res  : List_Id;
+ Exp   : Node_Id;
+ Exp_Q : Node_Id;
+ Lhs   : Node_Id;
+ Res   : List_Id;
 
   begin
  Lhs :=
@@ -2677,57 +2676,48 @@ package body Exp_Ch3 is
  Name   => Lhs,
  Expression => Exp));
 
- Set_No_Ctrl_Actions (First (Res));
-
  Exp_Q := Unqualify (Exp);
 
- --  Adjust the tag if tagged (because of possible view conversions).
- --  Suppress the tag adjustment when not Tagged_Type_Expansion because
- --  tags are represented implicitly in objects, and when the record is
- --  initialized with a raise expression.
-
- if Is_Tagged_Type (Typ)
-   and then Tagged_Type_Expansion
-   and then Nkind (Exp_Q) /= N_Raise_Expression
- then
---  Get the relevant type for the call to
---  Make_Tag_Assignment_From_Type, which, for concurrent types is
---  their corresponding record.
-
-declare
-   T : Entity_Id := Underlying_Type (Typ);
-begin
-   if Ekind (T) in E_Protected_Type | E_Task_Type then
-  T := Corresponding_Record_Type (T);
-   end if;
-
-   Append_To (Res,
- Make_Tag_Assignment_From_Type
-   (Default_Loc,
-New_Copy_Tree (Lhs, Ne

[COMMITTED 09/30] ada: C++ exception hierarchies: adjust for gnat-llvm

2025-01-06 Thread Marc Poulhiès
From: Alexandre Oliva 

gnat-llvm doesn't support C++ imports, so arrange for the GNAT.CPP*
units to be omitted from gnat-llvm builds.

Drop G++ exception interoperability support from raise-gcc.c, so as to
not require the GNAT.CPP* units that define some of the required
symbols.

Co-Authored-By: Olivier Hainque 

gcc/ada/ChangeLog:

* Makefile.rtl (LLVM_BUILD): Define based on LLVM_CONFIG.
(GNATRTL_NONTASKING_OBJS): Make g-cpp, g-cppstd, and g-cstyin
conditional on -gcc or -arm EH, and on no LLVM_BUILD.
* raise-gcc.c (GXX_EH_INTEROP): Define as 0 on gnat-llvm or
CERT, and 1 otherwise.
(__gnat_get_cxx_dependent_exception) Omit on !GXX_EH_INTEROP.
(__gnat_maybe_get_cxx_dependent_exception): Likewise.
(__gnat_get_cxx_exception_type_info): Likewise.
(__gnat_obtain_caught_object): Likewise.
(is_handled_by): Omit eid parameter and G++ interop on
!GXX_EH_INTEROP.  Adjust callers.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/Makefile.rtl | 23 ++-
 gcc/ada/raise-gcc.c  | 40 +---
 2 files changed, 47 insertions(+), 16 deletions(-)

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
index cd112d3c744..34bc5c80431 100644
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -26,6 +26,12 @@ ifndef ADAC
 ADAC=$(CC)
 endif
 
+ifeq ($(LLVM_CONFIG),)
+LLVM_BUILD := $(shell $(ADAC) -v | grep ^llvm-gcc)
+else
+LLVM_BUILD := llvm-gcc
+endif
+
 # Objects needed only for tasking
 GNATRTL_TASKING_OBJS= \
   a-dispat$(objext) \
@@ -419,9 +425,6 @@ GNATRTL_NONTASKING_OBJS= \
   g-comlin$(objext) \
   g-comver$(objext) \
   g-crc32$(objext) \
-  g-cpp$(objext) \
-  g-cppstd$(objext) \
-  g-cstyin$(objext) \
   g-ctrl_c$(objext) \
   g-curexc$(objext) \
   g-debpoo$(objext) \
@@ -3152,7 +3155,12 @@ ifeq ($(EH_MECHANISM),-gcc)
 s-excmac.ads
 # include 
@@ -975,6 +991,8 @@ __gnat_exception_language_is_ada (_Unwind_Exception *except)
   return (exception_class_eq (except, GNAT_EXCEPTION_CLASS));
 }
 
+#if GXX_EH_INTEROP
+
 /* Check whether *THROWN_PTR of EXCEPT_TYPEINFO is to be caught by a
CHOICE_TYPEINFO handler under LANG convention.
Implemented by GNAT.CPP_Exception.Convert_Caught_Object.  */
@@ -1099,20 +1117,17 @@ __gnat_obtain_caught_object (int *success_p, void 
**thrown_ptr_p,
 }
 }
 
+#endif /* GXX_EH_INTEROP */
+
 /* Return how CHOICE matches PROPAGATED_EXCEPTION.  */
 
 static enum action_kind
 is_handled_by (Exception_Id choice,
-#ifndef CERT
+#if GXX_EH_INTEROP
   Exception_Id *eid,
-#endif
+#endif /* GXX_EH_INTEROP */
   _Unwind_Exception *propagated_exception)
 {
-#ifndef CERT
-  char lang;
-  bool primary;
-#endif
-
   /* All others choice match everything.  */
   if (choice == GNAT_ALL_OTHERS)
 return handler;
@@ -1144,7 +1159,10 @@ is_handled_by (Exception_Id choice,
   )
 return handler;
 
-#ifndef CERT
+#if GXX_EH_INTEROP
+  char lang;
+  bool primary;
+
   /* C++ exception occurrences with exact (C) or base (B) type matching.  */
   if (((primary = exception_class_eq (propagated_exception,
  CXX_EXCEPTION_CLASS))
@@ -1171,7 +1189,7 @@ is_handled_by (Exception_Id choice,
  return handler;
}
 }
-#endif
+#endif /* GXX_EH_INTEROP */
 
   return nothing;
 }
@@ -1259,9 +1277,9 @@ get_action_description_for (_Unwind_Ptr ip,
= (Exception_Id) get_ttype_entry_for (region, ar_filter);
 
  act = is_handled_by (choice,
-#ifndef CERT
+#if GXX_EH_INTEROP
   eid,
-#endif
+#endif /* GXX_EH_INTEROP */
   uw_exception);
   if (act != nothing)
 {
-- 
2.43.0



[COMMITTED 03/30] ada: Fix finalization issue introduced by previous change

2025-01-06 Thread Marc Poulhiès
From: Eric Botcazou 

When detecting calls to subprograms specified for aspects of a type, the
entity denoted by the aspects must go through Ultimate_Alias, since that
of the name of the calls did the same.

gcc/ada/ChangeLog:

* exp_ch6.adb (Expand_Call_Helper): Call Ultimate_Alias for the
detection of calls to subprograms specified for Constant_Indexing.
* exp_util.adb (Is_Indexed_Container): Likewise.
(Is_Iterated_Container): Likewise for Default_Iterator.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch6.adb  | 5 +++--
 gcc/ada/exp_util.adb | 4 ++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
index c65ea91cb13..e6e5d50dbbf 100644
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -5331,16 +5331,17 @@ package body Exp_Ch6 is
 
   elsif Nkind (Call_Node) = N_Function_Call
 and then Nkind (Parent (Call_Node)) = N_Function_Call
+and then Is_Entity_Name (Name (Parent (Call_Node)))
   then
  declare
 Aspect : constant Node_Id :=
   Find_Value_Of_Aspect
 (Etype (Call_Node), Aspect_Constant_Indexing);
+Subp   : constant Entity_Id := Entity (Name (Parent (Call_Node)));
 
  begin
 if Present (Aspect)
-  and then Is_Entity_Name (Name (Parent (Call_Node)))
-  and then Entity (Name (Parent (Call_Node))) = Entity (Aspect)
+  and then Subp = Ultimate_Alias (Entity (Aspect))
 then
--  Resolution is now finished, make sure we don't start
--  analysis again because of the duplication.
diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
index a9424b95880..25f9f077174 100644
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -8945,7 +8945,7 @@ package body Exp_Util is
  Aspect := Find_Value_Of_Aspect (Typ, Aspect_Constant_Indexing);
 
  if Present (Aspect) then
-Index := Entity (Aspect);
+Index := Ultimate_Alias (Entity (Aspect));
 
 --  Examine the statements following the container object and
 --  look for a call to the default indexing routine where the
@@ -9030,7 +9030,7 @@ package body Exp_Util is
  Aspect := Find_Value_Of_Aspect (Typ, Aspect_Default_Iterator);
 
  if Present (Aspect) then
-Iter := Entity (Aspect);
+Iter := Ultimate_Alias (Entity (Aspect));
 
 --  Examine the statements following the container object and
 --  look for a call to the default iterate routine where the
-- 
2.43.0



[COMMITTED 21/30] ada: Fix memory leak when failing to initialize newly allocated memory

2025-01-06 Thread Marc Poulhiès
From: Eric Botcazou 

This makes the compiler generate cleanup code to deallocate the memory when
the evaluation of the expression of an allocator raises an exception, if the
expression is a call to a function that may raise, i.e. is not declared with
the No_Raise aspect/pragma.  This can also be disabled by means of -gnatdQ.

gcc/ada/ChangeLog:

* debug.adb (dQ): Document usage.
* exp_ch4.ads (Build_Cleanup_For_Allocator): New declaration.
* exp_ch4.adb (Build_Cleanup_For_Allocator): New procedure.
(Expand_Allocator_Expression): Build a cleanup to deallocate the
memory when the evaluation of the expression raises an exception.
* exp_ch6.adb (Make_Build_In_Place_Call_In_Allocator): Likewise.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Do not generate the
detachment if the deallocation is for the cleanup of an allocator.
* gen_il-fields.ads (Opt_Field_Enum): Add For_Allocator.
* gen_il-gen-gen_nodes.adb (N_Free_Statement): Likewise.
* sinfo.ads (For_Allocator): Document usage on N_Free_Statement.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/debug.adb|   5 +-
 gcc/ada/exp_ch4.adb  | 123 +--
 gcc/ada/exp_ch4.ads  |   9 +++
 gcc/ada/exp_ch6.adb  |  16 +++-
 gcc/ada/exp_util.adb |   5 ++
 gcc/ada/gen_il-fields.ads|   1 +
 gcc/ada/gen_il-gen-gen_nodes.adb |   3 +-
 gcc/ada/sinfo.ads|   5 ++
 8 files changed, 140 insertions(+), 27 deletions(-)

diff --git a/gcc/ada/debug.adb b/gcc/ada/debug.adb
index 2d0c32b0f09..c4b6d035e5c 100644
--- a/gcc/ada/debug.adb
+++ b/gcc/ada/debug.adb
@@ -74,7 +74,7 @@ package body Debug is
--  dN   No file name information in exception messages
--  dO   Output immediate error messages
--  dP   Do not check for controlled objects in preelaborable packages
-   --  dQ
+   --  dQ   Do not generate cleanups for qualified expressions of allocators
--  dR   Bypass check for correct version of s-rpc
--  dS   Never convert numbers to machine numbers in Sem_Eval
--  dT   Convert to machine numbers only for constant declarations
@@ -640,6 +640,9 @@ package body Debug is
--   in preelaborable packages, but this restriction is a huge pain,
--   especially in the predefined library units.
 
+   --  dQ   Do not generate cleanups to deallocate the memory in case qualified
+   --   expressions of allocators raise an exception.
+
--  dR   Bypass the check for a proper version of s-rpc being present
--   to use the -gnatz? switch. This allows debugging of the use
--   of stubs generation without needing to have GLADE (or some
diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
index 18656ea24fd..75d79019f80 100644
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -437,6 +437,37 @@ package body Exp_Ch4 is
  return;
end Build_Boolean_Array_Proc_Call;
 
+   -
+   -- Build_Cleanup_For_Allocator --
+   -
+
+   function Build_Cleanup_For_Allocator
+ (Loc : Source_Ptr;
+  Obj_Id  : Entity_Id;
+  Pool: Entity_Id;
+  Actions : List_Id) return Node_Id
+   is
+  Free_Stmt : constant Node_Id :=
+Make_Free_Statement (Loc, New_Occurrence_Of (Obj_Id, Loc));
+
+   begin
+  Set_For_Allocator (Free_Stmt);
+  Set_Storage_Pool  (Free_Stmt, Pool);
+
+  return
+Make_Block_Statement (Loc,
+  Handled_Statement_Sequence =>
+Make_Handled_Sequence_Of_Statements (Loc,
+  Statements => Actions,
+  Exception_Handlers => New_List (
+Make_Exception_Handler (Loc,
+  Exception_Choices => New_List (
+Make_Others_Choice (Loc)),
+  Statements=> New_List (
+Free_Stmt,
+Make_Raise_Statement (Loc));
+   end Build_Cleanup_For_Allocator;
+
---
-- Build_Eq_Call --
---
@@ -574,7 +605,12 @@ package body Exp_Ch4 is
   T  : constant Entity_Id  := Entity (Indic);
   PtrT   : constant Entity_Id  := Etype (N);
   DesigT : constant Entity_Id  := Designated_Type (PtrT);
+  Pool   : constant Node_Id:= Storage_Pool (N);
   Special_Return : constant Boolean:= For_Special_Return_Object (N);
+  Special_Pool   : constant Boolean:=
+Present (Pool)
+  and then
+(Is_RTE (Pool, RE_RS_Pool) or else Is_RTE (Pool, RE_SS_Pool));
   Static_Match   : constant Boolean:=
 not Is_Constrained (DesigT)
   or else Subtypes_Statically_Match (T, DesigT);
@@ -586,8 +622,7 @@ package body Exp_Ch4 is
   --  of Exp into the newly allocated memory.
 
   procedure Build_Explicit_Assignment (Temp : Entity_Id;

[COMMITTED 27/30] ada: Fix incorrect RM reference in s-imagef.adb

2025-01-06 Thread Marc Poulhiès
From: Bob Duff 

gcc/ada/ChangeLog:

* libgnat/s-imagef.adb (Set_Image_Integer):
Change "RM A.3.10" to be "RM A.10.9".

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/s-imagef.adb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/libgnat/s-imagef.adb b/gcc/ada/libgnat/s-imagef.adb
index 56992064fb1..06cbed11369 100644
--- a/gcc/ada/libgnat/s-imagef.adb
+++ b/gcc/ada/libgnat/s-imagef.adb
@@ -115,7 +115,7 @@ package body System.Image_F is
-- Q * 10**(-D)
 
--  This value can be written to the output file or to the result string
-   --  according to the format described in RM A.3.10. The details of this
+   --  according to the format described in RM A.10.9. The details of this
--  operation are omitted here.
 
--  A 64-bit value can represent all integers with 18 decimal digits, but
-- 
2.43.0



[COMMITTED 26/30] ada: Reduce footprint of C++ exception interoperation support

2025-01-06 Thread Marc Poulhiès
From: Alexandre Oliva 

The initial C++ base-type exception interoperation support change
brought all of GNAT.CPP* along with raise-gcc, because of
[__gnat_]Convert_Caught_Object.  Move that private but pragma-exported
function to GNAT.CPP.Std.Type_Info, so that it can rely on the C++
virtual/dispatch calls that justified the introduction of the Ada
wrapper type, to avoid emulating virtual calls in C or bringing in a
dependency on the C++ compiler and runtime.

Drop the CharPtr package instantiation, that brought a huge amount of
unnecessary code, and use string and storage primitives instead, using
the strcmp builtin directly for the C string compares.

Move the conversion to Ada String in Name to the wrapper interface in
GNAT.CPP.Std, adjusting the private internal type to shave off a few
more bytes from the only unit that raise-gcc will still need.

Finally, disable heap finalization for Type_Info_Ptr, to avoid
dragging in all of the finalization code.  Thank to Eric Botcazou for
the suggestion.

gcc/ada/ChangeLog:

* libgnat/g-cppexc.adb (Convert_Caught_Object): Move...
* libgnat/g-cstyin.adb (Convert_Caught_Object):  ... here.
Use object call notation.
(strcmp): New.
(Char_Arr, CharPtr, Char_Pointer, To_chars_ptr): Drop.  Do not
import Interfaces.C.Pointers.
(To_Pointer): Convert from System.Address.
(Name_Starts_With_Asterisk): Rename local variable.
(Name_Past_Asterisk): Rewrite with System.Address and strcmp.
Import System.Storage_Elements.
(Equals): Use strcmp.
(Before): Fix logic error.  Use strcmp.
(Name): Move conversion to String...
* libgnat/g-cppstd.adb (Name): ... here.  Import
Interfaces.C.Strings.
* libgnat/g-cppstd.ads (Type_Info_Ptr): Disable heap
finalization.
* libgnat/g-cstyin.ads (Name): Change return type.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/g-cppexc.adb | 40 --
 gcc/ada/libgnat/g-cppstd.adb |  3 +-
 gcc/ada/libgnat/g-cppstd.ads |  4 +-
 gcc/ada/libgnat/g-cstyin.adb | 80 ++--
 gcc/ada/libgnat/g-cstyin.ads |  2 +-
 5 files changed, 65 insertions(+), 64 deletions(-)

diff --git a/gcc/ada/libgnat/g-cppexc.adb b/gcc/ada/libgnat/g-cppexc.adb
index 11022880670..bad748fdfe3 100644
--- a/gcc/ada/libgnat/g-cppexc.adb
+++ b/gcc/ada/libgnat/g-cppexc.adb
@@ -267,44 +267,4 @@ package body GNAT.CPP_Exceptions is
 
end Get_Type_Info;
 
-   function Convert_Caught_Object (Choice, Except : Type_Info_Ptr;
-   Thrown : in out Address;
-   Lang   : Character)
-   return   Interfaces.C.C_bool;
-   pragma Export (Cpp, Convert_Caught_Object, "__gnat_convert_caught_object");
-   --  Convert the exception object at Thrown, under Lang convention,
-   --  from type Except to type Choice, adjusting Thrown as needed and
-   --  returning True, or returning False in case the conversion fails.
-
-   ---
-   -- Convert_Caught_Object --
-   ---
-
-   function Convert_Caught_Object (Choice, Except : Type_Info_Ptr;
-   Thrown : in out Address;
-   Lang   : Character)
-   return   Interfaces.C.C_bool is
-   begin
-  if Equals (Choice, Except) then
- return C_bool'(True);
-  end if;
-
-  if Lang = 'B' then
- if Is_Pointer_P (Except) then
-declare
-   Thrown_Indirect : Address;
-   for Thrown_Indirect'Address use Thrown;
-begin
-   Thrown := Thrown_Indirect;
-end;
- end if;
-
- if Do_Catch (Choice, Except, Thrown, 1) then
-return C_bool'(True);
- end if;
-  end if;
-
-  return C_bool'(False);
-   end Convert_Caught_Object;
-
 end GNAT.CPP_Exceptions;
diff --git a/gcc/ada/libgnat/g-cppstd.adb b/gcc/ada/libgnat/g-cppstd.adb
index 000dd474c5c..8cb64edaffe 100644
--- a/gcc/ada/libgnat/g-cppstd.adb
+++ b/gcc/ada/libgnat/g-cppstd.adb
@@ -34,6 +34,7 @@
 
 with GNAT.CPP.Std.Type_Info;
 with Ada.Unchecked_Conversion;
+with Interfaces.C.Strings; use Interfaces.C.Strings;
 
 package body GNAT.CPP.Std is
--
@@ -53,7 +54,7 @@ package body GNAT.CPP.Std is
 
function Name (this : Type_Info_Ptr)
   return String
-   is (this.all.Name);
+   is (Value (this.all.Name));
 
---
--  Before  ---
diff --git a/gcc/ada/libgnat/g-cppstd.ads b/gcc/ada/libgnat/g-cppstd.ads
index 63ef03e43dd..be8907c4f77 100644
--- a/gcc/ada/libgnat/g-cppstd.ads
+++ b/gcc/ada/libgnat/g-cppstd.ads
@@ -50,7 +50,8 @@ package GNAT.CPP.Std is
function Name (this : Type_Info_Ptr)
   --  return Interfaces.C.Strings.chars_ptr;
 

[COMMITTED 05/30] ada: Use inheritance in Gen_IL

2025-01-06 Thread Marc Poulhiès
From: Bob Duff 

In Gen_IL, detect cases where fields could be inherited from
an abstract type instead of being defined in each of two or more
descendants of that type. Raise Illegal when that is the case,
except in specific cases called out as exceptions to this rule.

For every such case, either update the types declared in
Gen_Nodes and Gen_Entities to use inheritance, or add the case
to the list of exceptions where we do not want to use inheritance.

gcc/ada/ChangeLog:

* gen_il-internals.ads: Split Fields field into two fields
Imm_Fields and Fields.
* gen_il-gen.adb: Modify the field-inheritance
algorithm to inherit at each level of the type hierarchy,
rather than just inheriting into concrete types.
For example, if C is a concrete type derived from B,
which is in turn derived from A, we now set the Fields
of B to include those of A. (We had always set the Fields
of C to include those of A and B, and we still do that.)
(Compute_Fields_For_One_Type): Detect cases where a given
field is declared for all descendants of a given abstract
type, in which case we should consider declaring it in
the abstract type, and inheriting it in those descendants.
(Exception_To_Inheritance_Rule): These are the cases where
we could inherit, but we don't want to.
* gen_il-gen-gen_nodes.adb: Move fields up the type hierarchy,
so they are inherited instead of being defined separately.
* gen_il-gen-gen_entities.adb: Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gen_il-gen-gen_entities.adb |  18 +-
 gcc/ada/gen_il-gen-gen_nodes.adb| 291 
 gcc/ada/gen_il-gen.adb  | 270 +++---
 gcc/ada/gen_il-internals.ads|   6 +-
 4 files changed, 289 insertions(+), 296 deletions(-)

diff --git a/gcc/ada/gen_il-gen-gen_entities.adb 
b/gcc/ada/gen_il-gen-gen_entities.adb
index f887e0c3c99..8cf66b2611d 100644
--- a/gcc/ada/gen_il-gen-gen_entities.adb
+++ b/gcc/ada/gen_il-gen-gen_entities.adb
@@ -302,7 +302,8 @@ begin -- Gen_IL.Gen.Gen_Entities
--  but not getters; the Ekind is modified before any such getters are
--  called.
 
-   Ab (Exception_Or_Object_Kind, Entity_Kind);
+   Ab (Exception_Or_Object_Kind, Entity_Kind,
+   (Sm (Esize, Uint)));
 
Ab (Object_Kind, Exception_Or_Object_Kind,
(Sm (Current_Value, Node_Id),
@@ -313,7 +314,6 @@ begin -- Gen_IL.Gen.Gen_Entities
 Sm (Component_Clause, Node_Id),
 Sm (Corresponding_Record_Component, Node_Id),
 Sm (Entry_Formal, Node_Id),
-Sm (Esize, Uint),
 Sm (Interface_Name, Node_Id),
 Sm (Normalized_First_Bit, Uint),
 Sm (Normalized_Position, Uint),
@@ -334,7 +334,6 @@ begin -- Gen_IL.Gen.Gen_Entities
Ab (Allocatable_Kind, Object_Kind,
(Sm (Activation_Record_Component, Node_Id),
 Sm (Alignment, Unat),
-Sm (Esize, Uint),
 Sm (Finalization_Master_Node, Node_Id),
 Sm (Interface_Name, Node_Id),
 Sm (Is_Finalized_Transient, Flag),
@@ -403,7 +402,6 @@ begin -- Gen_IL.Gen.Gen_Entities
 Sm (Default_Expr_Function, Node_Id),
 Sm (Default_Value, Node_Id),
 Sm (Entry_Component, Node_Id),
-Sm (Esize, Uint),
 Sm (Extra_Accessibility, Node_Id),
 Sm (Extra_Constrained, Node_Id),
 Sm (Extra_Formal, Node_Id),
@@ -433,8 +431,7 @@ begin -- Gen_IL.Gen.Gen_Entities
 
Ab (Formal_Object_Kind, Object_Kind,
--  Generic formal objects are also objects
-   (Sm (Entry_Component, Node_Id),
-Sm (Esize, Uint)));
+   (Sm (Entry_Component, Node_Id)));
 
Cc (E_Generic_In_Out_Parameter, Formal_Object_Kind,
--  A generic in out parameter, created by the use of a generic in out
@@ -993,7 +990,8 @@ begin -- Gen_IL.Gen.Gen_Entities
 Sm (Static_Call_Helper, Node_Id),
 Sm (SPARK_Pragma, Node_Id),
 Sm (SPARK_Pragma_Inherited, Flag),
-Sm (Subps_Index, Unat)));
+Sm (Subps_Index, Unat),
+Sm (LSP_Subprogram, Node_Id)));
 
Cc (E_Function, Subprogram_Kind,
--  A function, created by a function declaration or a function body
@@ -1020,7 +1018,6 @@ begin -- Gen_IL.Gen.Gen_Entities
 Sm (Is_Predicate_Function, Flag),
 Sm (Is_Primitive_Wrapper, Flag),
 Sm (Is_Private_Primitive, Flag),
-Sm (LSP_Subprogram, Node_Id),
 Sm (Mechanism, Mechanism_Type),
 Sm (Next_Inlined_Subprogram, Node_Id),
 Sm (Original_Protected_Subprogram, Node_Id),
@@ -1039,8 +1036,7 @@ begin -- Gen_IL.Gen.Gen_Entities
--  defined concatenation operator created whenever an array is 
declared.
--  We do not make normal derived operators explicit in the tree, but 
the
--  concatenation operators are made explicit.
-   (Sm (Extra_Accessibility_Of_Result, Node_Id),
-Sm (LSP_Subpro

[COMMITTED 23/30] ada: Small housekeeping work in Exp_Aggr

2025-01-06 Thread Marc Poulhiès
From: Eric Botcazou 

This moves a few declarations around and tweaks a few comments.

gcc/ada/ChangeLog:

* exp_aggr.adb (Case_Table_Type): Fix reference in comment.
(In_Place_Assign_OK): Move declaration around.
(Is_Build_In_Place_Aggregate_Return): Likewise and adjust.
(Expand_Array_Aggregate): Streamline for the sake of consistency.
(Aggr_Assignment_OK_For_Backend): Remove reference to Gigi/gcc.
(Backend_Processing_Possible): Likewise.
(Expand_Array_Aggregate): Add comment.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_aggr.adb | 41 +++--
 1 file changed, 19 insertions(+), 22 deletions(-)

diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb
index dae9d4e5009..fb5cfc6d1f0 100644
--- a/gcc/ada/exp_aggr.adb
+++ b/gcc/ada/exp_aggr.adb
@@ -85,7 +85,7 @@ package body Exp_Aggr is
end record;
 
type Case_Table_Type is array (Nat range <>) of Case_Bounds;
-   --  Table type used by Check_Case_Choices procedure
+   --  Table type used by Sort_Case_Table procedure
 
function Get_Base_Object (N : Node_Id) return Entity_Id;
--  Return the base object, i.e. the outermost prefix object, that N refers
@@ -106,6 +106,17 @@ package body Exp_Aggr is
--  Comp_Typ of aggregate N. Init_Expr denotes the initialization
--  expression of the component. All generated code is added to Stmts.
 
+   function In_Place_Assign_OK
+ (N : Node_Id;
+  Target_Object : Entity_Id := Empty) return Boolean;
+   --  Predicate to determine whether an aggregate assignment can be done in
+   --  place, because none of the new values can depend on the components of
+   --  the target of the assignment.
+
+   function Is_Build_In_Place_Aggregate_Return (N : Node_Id) return Boolean;
+   --  Return True if N is a simple return whose expression needs to be built
+   --  in place in the return object, assuming the expression is an aggregate.
+
function Is_Static_Dispatch_Table_Aggregate (N : Node_Id) return Boolean;
--  Returns true if N is an aggregate used to initialize the components
--  of a statically allocated dispatch table.
@@ -162,14 +173,6 @@ package body Exp_Aggr is
-- Local subprograms for Record Aggregate Expansion --
--
 
-   function Is_Build_In_Place_Aggregate_Return (N : Node_Id) return Boolean;
-   --  Return True if N is a simple return whose expression needs to be built
-   --  in place in the return object, assuming the expression is an aggregate,
-   --  possibly qualified or a dependent expression of a conditional expression
-   --  (and possibly recursively). Such qualified and conditional expressions
-   --  are transparent for this purpose since an enclosing return is propagated
-   --  resp. distributed into these expressions by the expander.
-
function Build_Record_Aggr_Code
  (N   : Node_Id;
   Typ : Entity_Id;
@@ -210,13 +213,6 @@ package body Exp_Aggr is
--  defaults. An aggregate for a type with mutable components must be
--  expanded into individual assignments.
 
-   function In_Place_Assign_OK
- (N : Node_Id;
-  Target_Object : Entity_Id := Empty) return Boolean;
-   --  Predicate to determine whether an aggregate assignment can be done in
-   --  place, because none of the new values can depend on the components of
-   --  the target of the assignment.
-
procedure Initialize_Discriminants (N : Node_Id; Typ : Entity_Id);
--  If the type of the aggregate is a type extension with renamed discrimi-
--  nants, we must initialize the hidden discriminants of the parent.
@@ -310,8 +306,7 @@ package body Exp_Aggr is
--  these are cases we handle in there.
 
procedure Expand_Array_Aggregate (N : Node_Id);
-   --  This is the top-level routine for array aggregate expansion.
-   --  N is the N_Aggregate node to be expanded.
+   --  This is the top-level routine for array aggregate expansion
 
procedure Expand_Delta_Array_Aggregate (N : Node_Id; Deltas : List_Id);
--  This is the top-level routine for delta array aggregate expansion
@@ -351,8 +346,8 @@ package body Exp_Aggr is
-- Aggr_Assignment_OK_For_Backend --

 
-   --  Back-end processing by Gigi/gcc is possible only if all the following
-   --  conditions are met:
+   --  Back-end processing is possible only if all the following conditions
+   --  are met:
 
--1. N consists of a single OTHERS choice, possibly recursively, or
--   of a single choice, possibly recursively, if it is surrounded by
@@ -806,8 +801,8 @@ package body Exp_Aggr is
-- Backend_Processing_Possible --
-
 
-   --  Backend processing by Gigi/gcc is possible only if all the following
-   --  conditions are met:
+   --  Back-end processing is possible only if all the following conditions
+   --  are met:
 
  

[COMMITTED 22/30] ada: cleanup documentation for shift and rotate

2025-01-06 Thread Marc Poulhiès
From: Bob Duff 

Documentation updated.

gcc/ada/ChangeLog:

* sinfo.ads (Shift_Count_OK): Update comments.
(Is_Power_Of_2_For_Shift): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sinfo.ads | 34 --
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/gcc/ada/sinfo.ads b/gcc/ada/sinfo.ads
index 3db084ef391..2b82ab345d6 100644
--- a/gcc/ada/sinfo.ads
+++ b/gcc/ada/sinfo.ads
@@ -1812,16 +1812,15 @@ package Sinfo is
--A flag set on an aggregate that uses parentheses as delimiters
 
--  Is_Power_Of_2_For_Shift
-   --A flag present only in N_Op_Expon nodes. It is set when the
-   --exponentiation is of the form 2 ** N, where the type of N is an
-   --unsigned integral subtype whose size does not exceed the size of
-   --Standard_Integer (i.e. a type that can be safely converted to
-   --Natural), and the exponentiation appears as the right operand of an
-   --integer multiplication or an integer division where the dividend is
-   --unsigned. It is also required that overflow checking is off for both
-   --the exponentiation and the multiply/divide node. If this set of
-   --conditions holds, and the flag is set, then the division or
-   --multiplication can be (and is) converted to a shift.
+   --Present in N_Op_Expon nodes. It is set when the exponentiation is of
+   --the form 2 ** N, where the type of N is an integer subtype whose size
+   --does not exceed the size of Standard.Integer (i.e. a type that can be
+   --safely converted to Natural), and the exponentiation appears as the
+   --right operand of an integer multiplication or an integer division
+   --where the dividend is unsigned. It is also required that overflow
+   --checking is off for both the exponentiation and the multiply/divide
+   --node. If this set of conditions holds, and the flag is set, then the
+   --division or multiplication is converted to a shift.
 
--  Is_Preelaborable_Call
--Present in call marker nodes. Set when the related call is non-static
@@ -2291,7 +2290,8 @@ package Sinfo is
--known to be in range, i.e. is in the range from zero to word length
--minus one. If this flag is not set, then the shift count may be
--outside this range, i.e. larger than the word length, and the code
-   --must ensure that such shift counts give the appropriate result.
+   --generated by the back end must ensure that such shift counts give the
+   --appropriate result.
 
--  Source_Type
--Used in an N_Validate_Unchecked_Conversion node to point to the
@@ -7738,15 +7738,13 @@ package Sinfo is
   --  operator names. Note that for a given shift operation, one node
   --  covers all possible types, as for normal operators.
 
-  --  Note: it is perfectly permissible for the expander to generate
-  --  shift operation nodes directly, in which case they will be analyzed
-  --  and parsed in the usual manner.
-
   --  Sprint syntax: shift-function-name!(expr, count)
 
-  --  Note: the Left_Opnd field holds the first argument (the value to
-  --  be shifted). The Right_Opnd field holds the second argument (the
-  --  shift count). The Chars field is the name of the intrinsic function.
+  --  The Left_Opnd field holds the first argument (the value to be
+  --  shifted). The Right_Opnd field holds the second argument (the shift
+  --  count). The Chars field is the name of the intrinsic function.
+  --  The back end is responsible for generating correct code in the case
+  --  of large shift counts (in which case Shift_Count_OK will be False).
 
   --  N_Op_Rotate_Left
   --  Sloc points to the function name
-- 
2.43.0



[COMMITTED 17/30] ada: Crash in prefix notation with access to class-wide object

2025-01-06 Thread Marc Poulhiès
From: Javier Miranda 

The compiler crashes analyzing a prefix notation call when its
prefix is an access to a class-wide object, an actual parameter
is missing, and the sources are compiled with language extensions
(-gnatX) and full errors (-gnatf).

gcc/ada/ChangeLog:

* sem_ch4.adb (Try_Object_Operation): if no candidate interpretation
matches the context, redo the same analysis with Report_Error True
to report the error.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch4.adb | 29 +++--
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/gcc/ada/sem_ch4.adb b/gcc/ada/sem_ch4.adb
index 18b3a4fc22f..94e43429814 100644
--- a/gcc/ada/sem_ch4.adb
+++ b/gcc/ada/sem_ch4.adb
@@ -10498,6 +10498,11 @@ package body Sem_Ch4 is
   Set_Etype (Subprog, Any_Type);
   Set_Parent (New_Call_Node, Parent (Node_To_Replace));
 
+  --  Perform the analysis searching for a candidate with Report_Error
+  --  set to False (see above); if no candidate interpretation matches
+  --  the context, this analysis will be redone with Report_Error set
+  --  to True to provide additional information.
+
   if not Is_Overloaded (Obj) then
  Try_One_Prefix_Interpretation (Obj_Type);
 
@@ -10537,18 +10542,22 @@ package body Sem_Ch4 is
 
  if All_Errors_Mode then
 Report_Error := True;
-if Try_Primitive_Operation
- (Call_Node   => New_Call_Node,
-  Node_To_Replace => Node_To_Replace)
 
-  or else
-Try_Class_Wide_Operation
-  (Call_Node   => New_Call_Node,
-   Node_To_Replace => Node_To_Replace)
-then
-   null;
-end if;
+if not Is_Overloaded (Obj) then
+   Try_One_Prefix_Interpretation (Obj_Type);
 
+else
+   declare
+  I  : Interp_Index;
+  It : Interp;
+   begin
+  Get_First_Interp (Obj, I, It);
+  while Present (It.Nam) loop
+ Try_One_Prefix_Interpretation (It.Typ);
+ Get_Next_Interp (I, It);
+  end loop;
+   end;
+end if;
  else
 Analyze_One_Call
   (N  => New_Call_Node,
-- 
2.43.0



[COMMITTED 14/30] ada: Remove level attribute from Rules in the SARIF report

2025-01-06 Thread Marc Poulhiès
From: Viljar Indus 

A Rule object in the SARIF report does not have a level
attribute. Result objects are the elements in the SARIF
reprot that have a level attribute that ultimately determines
the level of each diagnostic object.

Rules can have a defaultConfiguration attribute which has a level
attribute that can be overridden in multiple ways. This can make
the overall report more complex than it needs to be. It is simpler
to remove the attribute from rules where it does not really matter
and add it back in under the defaultConfiguration when there is an
explicit need for it.

gcc/ada/ChangeLog:

* diagnostics-sarif_emitter.adb (Print_Rule): Remove printing
of the level attribute since it does not match the SARIF schema.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/diagnostics-sarif_emitter.adb | 6 --
 1 file changed, 6 deletions(-)

diff --git a/gcc/ada/diagnostics-sarif_emitter.adb 
b/gcc/ada/diagnostics-sarif_emitter.adb
index b6035c2970d..f0be97d1a1e 100644
--- a/gcc/ada/diagnostics-sarif_emitter.adb
+++ b/gcc/ada/diagnostics-sarif_emitter.adb
@@ -229,12 +229,10 @@ package body Diagnostics.SARIF_Emitter is
procedure Print_Rule (Diag : Diagnostic_Type);
--  Print a rule node that consists of the following attributes:
--  * ruleId
-   --  * level
--  * name
--
--  {
--"id": ,
-   --"level": ,
--"name": 
--  },
 
@@ -1006,10 +1004,6 @@ package body Diagnostics.SARIF_Emitter is
   Write_Char (',');
   NL_And_Indent;
 
-  Write_String_Attribute ("level", Kind_To_String (Diag));
-  Write_Char (',');
-  NL_And_Indent;
-
   if Human_Id = null then
  Write_String_Attribute ("name", "Uncategorized_Diagnostic");
   else
-- 
2.43.0



[COMMITTED 10/30] ada: Cleanup preanalysis of static expressions

2025-01-06 Thread Marc Poulhiès
From: Javier Miranda 

Complete previous patch; required to avoid regressions in GNATProve.

gcc/ada/ChangeLog:

* sem_ch6.adb (Analyze_Expression_Function): Set the parent of
the new node to be the parent of the original to get the proper
context, which is needed for complete error reporting and for
semantic analysis. Patch suggested by Eric Botcazou.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch6.adb | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
index ab12f22ad51..2e619d89444 100644
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -427,6 +427,12 @@ package body Sem_Ch6 is
  --  As elsewhere, we do not emit freeze nodes within a generic unit.
 
  if not Inside_A_Generic then
+--  Set the parent of the new node to be the parent of the original
+--  to get the proper context, which is needed for complete error
+--  reporting and for semantic analysis.
+
+Set_Parent (New_Body, Parent (N));
+
 Freeze_Expr_Types
   (Def_Id => Def_Id,
Typ=> Typ,
-- 
2.43.0



[COMMITTED 19/30] ada: Declare that the new argument may not be used

2025-01-06 Thread Marc Poulhiès
From: Tonu Naks 

gcc/ada/ChangeLog:

* adaint.c (__gnat_locate_exec_on_path): modify function signature

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/adaint.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
index 0459956ff5b..b906ac092ef 100644
--- a/gcc/ada/adaint.c
+++ b/gcc/ada/adaint.c
@@ -3079,7 +3079,8 @@ __gnat_locate_exec (char *exec_name, char *path_val)
 /* Locate an executable using the Systems default PATH.  */
 
 char *
-__gnat_locate_exec_on_path (char *exec_name, int current_dir_on_windows)
+__gnat_locate_exec_on_path (char *exec_name,
+   int current_dir_on_windows ATTRIBUTE_UNUSED)
 {
   char *apath_val;
 
@@ -3110,10 +3111,6 @@ __gnat_locate_exec_on_path (char *exec_name, int 
current_dir_on_windows)
   }
 
 #else
-  /* Tell the compiler that we are not going to use this parameter
- on non-windows platforms. */
-  (void)current_dir_on_windows;
-
   const char *path_val = getenv ("PATH");
 
   /* If PATH is not defined, proceed with __gnat_locate_exec anyway, so we can
-- 
2.43.0



[PATCH v3 5/6] c++/modules: Add testcase for fixed ICE [PR116568]

2025-01-06 Thread Nathaniel Shead
This ICE was fixed by ensuring that the lambdas had
LAMBDA_EXPR_EXTRA_SCOPE properly set.

PR c++/116568

gcc/testsuite/ChangeLog:

* g++.dg/modules/lambda-8.h: New test.
* g++.dg/modules/lambda-8_a.H: New test.
* g++.dg/modules/lambda-8_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/testsuite/g++.dg/modules/lambda-8.h   | 7 +++
 gcc/testsuite/g++.dg/modules/lambda-8_a.H | 5 +
 gcc/testsuite/g++.dg/modules/lambda-8_b.C | 5 +
 3 files changed, 17 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-8.h
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-8_a.H
 create mode 100644 gcc/testsuite/g++.dg/modules/lambda-8_b.C

diff --git a/gcc/testsuite/g++.dg/modules/lambda-8.h 
b/gcc/testsuite/g++.dg/modules/lambda-8.h
new file mode 100644
index 000..0c66f053b20
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/lambda-8.h
@@ -0,0 +1,7 @@
+template  struct S {
+  template  static constexpr auto x = []{};
+  template  using t = decltype([]{});
+};
+
+inline auto x = S::x;
+using t = S::t;
diff --git a/gcc/testsuite/g++.dg/modules/lambda-8_a.H 
b/gcc/testsuite/g++.dg/modules/lambda-8_a.H
new file mode 100644
index 000..d20958ee140
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/lambda-8_a.H
@@ -0,0 +1,5 @@
+// PR c++/116568
+// { dg-additional-options "-fmodules-ts -std=c++20" }
+// { dg-module-cmi {} }
+
+#include "lambda-8.h"
diff --git a/gcc/testsuite/g++.dg/modules/lambda-8_b.C 
b/gcc/testsuite/g++.dg/modules/lambda-8_b.C
new file mode 100644
index 000..05ea4afd8c1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/lambda-8_b.C
@@ -0,0 +1,5 @@
+// PR c++/116568
+// { dg-additional-options "-fmodules-ts -std=c++20" }
+
+#include "lambda-8.h"
+import "lambda-8_a.H";
-- 
2.47.0



Re: [Ping, Fortran, Patch, PR114612, v1] Fix missing deep-copy for allocatable components of derived types having cycles.

2025-01-06 Thread Andre Vehreschild
Hi all,

attached patch has been rebased to latest trunk. Just pinging!

Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline?

- Andre

On Fri, 13 Dec 2024 12:10:58 +0100
Andre Vehreschild  wrote:

> Hi all,
>
> attached patch fixes deep-copying (or rather its former absence) for
> allocatable components of derived types having cyclic dependencies.
>
> Regtested ok on x86_64-pc-linux-gnu / F41. Ok for mainline?
>
> Regards,
>   Andre
> --
> Andre Vehreschild * Email: vehre ad gmx dot de


--
Andre Vehreschild * Email: vehre ad gmx dot de
From 13b7fa360ccd246bb6dc93c0fc177b85d12f578f Mon Sep 17 00:00:00 2001
From: Andre Vehreschild 
Date: Fri, 13 Dec 2024 12:07:01 +0100
Subject: [PATCH] Fortran: Ensure deep copy of allocatable components in cylic
 types [PR114612]

gcc/fortran/ChangeLog:

	PR fortran/114612

	* trans-array.cc (structure_alloc_comps): Ensure deep copy is
	also done for types having cycles.

gcc/testsuite/ChangeLog:

	* gfortran.dg/alloc_comp_deep_copy_4.f03: New test.
---
 gcc/fortran/trans-array.cc|  7 ++---
 .../gfortran.dg/alloc_comp_deep_copy_4.f03| 29 +++
 2 files changed, 32 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/alloc_comp_deep_copy_4.f03

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 67e139664c2..764e51989cb 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -10588,10 +10588,9 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl, tree dest,
 	   false, false, NULL_TREE, NULL_TREE);
 	  gfc_add_expr_to_block (&fnblock, tmp);
 	}
-	  else if ((c->attr.allocatable)
-		&& !c->attr.proc_pointer && !same_type
-		&& (!(cmp_has_alloc_comps && c->as) || c->attr.codimension
-			|| caf_in_coarray (caf_mode)))
+	  else if (c->attr.allocatable && !c->attr.proc_pointer
+		   && (!(cmp_has_alloc_comps && c->as) || c->attr.codimension
+		   || caf_in_coarray (caf_mode)))
 	{
 	  rank = c->as ? c->as->rank : 0;
 	  if (c->attr.codimension)
diff --git a/gcc/testsuite/gfortran.dg/alloc_comp_deep_copy_4.f03 b/gcc/testsuite/gfortran.dg/alloc_comp_deep_copy_4.f03
new file mode 100644
index 000..3c445be032f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/alloc_comp_deep_copy_4.f03
@@ -0,0 +1,29 @@
+!{ dg-do run }
+!
+! Contributed Vladimir Terzi  
+! Check that deep-copy for b=a works.
+
+program pr114672
+type node
+integer::val
+type(node),allocatable::next
+end type
+
+type(node)::a,b
+
+allocate(a%next)
+a%val=1
+a%next%val=2
+!print*,a%val,a%next%val
+b=a
+b%val=3
+b%next%val=4
+if (loc(b) == loc(a)) stop 1
+if (loc(b%next) == loc(a%next)) stop 2
+!print*,a%val,a%next%val
+deallocate(b%next)
+if (.NOT. allocated(a%next)) stop 3
+!print*,a%val,a%next%val
+deallocate(a%next)
+end
+
--
2.47.1



Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

2025-01-06 Thread Qing Zhao


> On Jan 4, 2025, at 12:58, Jeff Law  wrote:
> 
> 
> 
> On 1/3/25 10:30 AM, Qing Zhao wrote:
>>> On Jan 3, 2025, at 11:41, Richard Biener  wrote:
>>> 
>>> 
>>> 
 Am 03.01.2025 um 16:22 schrieb Jeff Law :
 
 So this is an implementation of an idea I had a few years back and 
 prototyped last spring to fix pr92539.
 
 pr92539 is a false positive Warray-bounds warning triggered by loop 
 unrolling.  The warning is in code that will never execute, but none of 
 the optimizers clean things up well enough or early enough in the pipeline 
 to avoid the warning.
 
 To optimize away the code we can take advantage of the fact that we're 
 comparing a value to a bogus pointer.  So for example an EQ comparison 
 against &"aa" + 3 is always false and a NE comparison against that would 
 always be true.
>>> 
>>> Forming the pointer invokes UB, the comparison is neither false nor true.  
>>> I’d say this is a classical case for path isolation on the pointer forming, 
>>> not only discarding either the true or false path of the compare.
>>> 
>>> The path isolation pass is probably too late in this case?
>> Path isolation pass is After bounds pass.
>> From my understanding,  path isolation pass only insert TRAP on the 
>> problematic path.  The false positive warning on that path still there.  
>> Please see:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106762
> Correct. But we have choices here.  Forming the address doesn't actually 
> cause user visible UB, though it is UB in the IL.
> 
> Given it doesn't cause user visible UB, we could insert the trap *before* the 
> UB inducing statement.  That would then make the statement unreachable and 
> it'd get removed avoiding the false positive diagnostic.

Yes, that’s a good idea. 
However, in order to distinguish a user visible UB and a UB in the IL that is 
introduced purely by compiler, we might need some new marking in the IR?

Qing
> 
> But path-isolation is too late in the gimple pipeline to really help us here.
> 
> 
>> My work of adding -fdiagnostics-explain-harder also handled path isolation 
>> pass.
>> Should the loop unrolling pass need to be adjusted to avoid forming the 
>> invalid pointers?
> It it can reasonably be adjusted, it'd probably be good.  It's an area I'm 
> not terribly familiar with.
> 
> jeff




Re: [PATCH] or1k: add .note.GNU-stack section on linux

2025-01-06 Thread Jeff Law




On 1/6/25 6:01 AM, Stafford Horne wrote:

In the OpenRISC build we get the following warning:

 ld: warning: __modsi3_s.o: missing .note.GNU-stack section implies 
executable stack
 ld: NOTE: This behaviour is deprecated and will be removed in a future 
version of the linker

Fix this by adding a .note.GNU-stack to indicate the stack does not need to be
executable for the lib1funcs.

Note, this is also needed for the upcoming glibc 2.41.

libgcc/
* config/or1k/lib1funcs.S: Add .note.GNU-stack section on linux.

OK for the trunk.  You've got write privs, correct?

jeff




[PATCH] c: Restore warning for incomplete structures declared in parameter list [PR117866]

2025-01-06 Thread Martin Uecker


Happy new year! Please consider the following patch.

Bootstrapped and regression tested on x86_64.


c: Restore warning for incomplete structures declared in parameter list 
[PR117866]

In C23 mode the warning about declaring structures and union in
parameter lists was removed, because it is possible to redeclare
a compatible type elsewhere.  This is not the case for incomplete types,
so restore the warning for those types.

PR c/117886

gcc/c/ChangeLog:
* c-decl.cc (get_parm_info): Change condition for warning.

gcc/testsuite/ChangeLog:
* gcc.dg/pr117866.c: New test.
* gcc.dg/struct-pr118007.c: Adapt.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index f60b2a54a17..56d13884182 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -8677,7 +8677,7 @@ get_parm_info (bool ellipsis, tree expr)
  if (b->id)
{
  /* The %s will be one of 'struct', 'union', or 'enum'.  */
- if (!flag_isoc23)
+ if (!flag_isoc23 || !COMPLETE_TYPE_P (decl))
warning_at (b->locus, 0,
"%<%s %E%> declared inside parameter list"
" will not be visible outside of this 
definition or"
diff --git a/gcc/testsuite/gcc.dg/pr117866.c b/gcc/testsuite/gcc.dg/pr117866.c
new file mode 100644
index 000..3caf707791d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr117866.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23" } */
+
+void convert(struct fractpoint *pt);   /* { dg-warning "declared inside 
parameter list" } */
+
diff --git a/gcc/testsuite/gcc.dg/strub-pr118007.c 
b/gcc/testsuite/gcc.dg/strub-pr118007.c
index 6c24cad6529..51f48245b3f 100644
--- a/gcc/testsuite/gcc.dg/strub-pr118007.c
+++ b/gcc/testsuite/gcc.dg/strub-pr118007.c
@@ -2,4 +2,4 @@
 /* { dg-do compile } */
 /* { dg-options "-fstrub=all -O2" } */
 
-void rb_ec_error_print(struct rb_execution_context_struct *volatile) {}
+void rb_ec_error_print(struct rb_execution_context_struct *volatile) {}
/* { dg-warning "declared inside parameter list" } */



Re: [PATCH v2 1/4] testsuite: RISC-V: Add effective target for E ABI variant

2025-01-06 Thread Jeff Law




On 1/4/25 11:01 AM, Dimitar Dimitrov wrote:

Add new effective target check for either ILP32E or ILP64E ABI variants.

Initial implementation only checks for RV32E or RV64E ISA, which in turn
implies that ILP32E/ILP64E ABI is used.  The RV32I+ILP32E and
RV64I+ILP64E combinations are not yet caught by the check, but they
do not seem to be widely used currently.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_riscv_abi_e):
New procedure.
All 4 patches in this series are fine.  I do fear that trying to keep 
the E variants "clean" will be a long term maintenance task though. 
Hell, we're struggling with keeping the existing stuff clean.


But IMHO that's not a good reason to reject, especially with someone 
that's planning to test this regularly (and presumably send fixes 
regularly).


Jeff




Re: [ping][PATCH] testsuite/118127: Pass fortran tests on ppc64le for IEEE128 long doubles

2025-01-06 Thread Jakub Jelinek
On Mon, Jan 06, 2025 at 11:01:18AM -0500, Siddhesh Poyarekar wrote:
> Ping!
> 
> On 2024-12-19 08:16, Siddhesh Poyarekar wrote:
> > Denormal behaviour is well defined for IEEE128 long doubles, so don't
> > XFAIL some gfortran tests on ppc64le when configured with the IEEE128
> > long double ABI.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR testsuite/118127
> > * gfortran.dg/default_format_2.f90: Don't xfail for
> > ppc_ieee128_ok.
> > * gfortran.dg/default_format_denormal_2.f90: Likewise.
> > * gfortran.dg/large_real_kind_form_io_2.f90: Likewise.
> > 
> > Signed-off-by: Siddhesh Poyarekar 

That looks incorrect to me.
ppc_ieee128_ok just means that one can use the __ieee128 type (and only if
-mfloat128 option is passed).
What the tests care is whether real(16) is IEEE128 or IBM128.
That is dependent on what glibc gcc has been configured against, with what
configure options and whether -mabi=ieeelongdouble or -mabi=ibmlongdouble
options were used.

The long_double_ibm128 and long_double_ieee128 effective targets would
be slightly closer to what you want, but they actually test whether
that is the case when one uses dg-add-options long_double_ibm128 or
dg-add-options long_double_ieee128.

For the xfail, guess you want a test which will check what is the default...

So, e.g. (for powerpc* only) try to compile
integer function foo ()
  integer, parameter :: kl = selected_real_kind (precision (0.0_8) + 1)
  foo = precision (0.0_kl)
end
and see if it returns 33 (that is IEEE quad) or 31 (IBM double double).

Jakub



Re: [PATCH] or1k: add .note.GNU-stack section on linux

2025-01-06 Thread Stafford Horne
On Mon, Jan 06, 2025 at 07:37:56AM -0700, Jeff Law wrote:
> 
> 
> On 1/6/25 6:01 AM, Stafford Horne wrote:
> > In the OpenRISC build we get the following warning:
> > 
> >  ld: warning: __modsi3_s.o: missing .note.GNU-stack section implies 
> > executable stack
> >  ld: NOTE: This behaviour is deprecated and will be removed in a future 
> > version of the linker
> > 
> > Fix this by adding a .note.GNU-stack to indicate the stack does not need to 
> > be
> > executable for the lib1funcs.
> > 
> > Note, this is also needed for the upcoming glibc 2.41.
> > 
> > libgcc/
> > * config/or1k/lib1funcs.S: Add .note.GNU-stack section on linux.
> OK for the trunk.  You've got write privs, correct?

Thanks,

Yes I should still have access.  Also, I would like to add to the release-14
branch.  Is there anything special I need to do to add the patch there?

-Stafford



Re: [-Ping-, Fortran, Patch, PR116669, v3] Fix ICE in deallocation of derived types having cyclic dependencies

2025-01-06 Thread Andre Vehreschild
Hi all,

during looking for something completely different, I figured, that gcc does not
use std::set internally, but its implementation of hash_set. I therefore
adapted the patch to use it. Nothing more changed.

Still regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline?

Regards,
Andre


On Mon, 6 Jan 2025 11:06:46 +0100
Andre Vehreschild  wrote:

> Hi all,
>
> pinging attached rebased patch.
>
> Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline?
>
> - Andre
>
> On Thu, 12 Dec 2024 14:50:13 +0100
> Andre Vehreschild  wrote:
>
> > Hi all,
> >
> > attached patch improves analysis of cycles in derived types, i.e. type
> > dependencies ala:
> >
> > type(T)
> >   type(T2), allocatable :: c
> > end type
> >
> > type(T2)
> >   type(T), allocatable :: t
> > end type
> >
> > are now detected and deallocating an object that is of any of the types
> > now no longer crashes the compiler because of an endless recursion. To
> > accomplish this, I stored the symbols of the types seen in a C++ set
> > and checked if a component's type is already present in there. When a
> > type has such an indirect self-reference, it gets marked by setting its
> > symbol_attribute::recursive flag. Later steps then can make use of it.
> >
> > Furthermore are _deallocate members of the vtab populated when a type
> > has the recursive and the alloc_comp flag set.
> >
> > Bootstraps and regtests ok on x86_64-pc-linux-gnu / F41. Ok for
> > mainline?
> >
> > Note: The patch was developed on top of my coarray patch, but should
> > apply with delta on a regular trunk w/o issues.
> >
> > Regards,
> > Andre
> > --
> > Andre Vehreschild * Email: vehre ad gcc dot gnu dot org
>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de


--
Andre Vehreschild * Email: vehre ad gmx dot de
From e12066f328cb7952e28996775739592c1978c066 Mon Sep 17 00:00:00 2001
From: Andre Vehreschild 
Date: Mon, 9 Dec 2024 14:56:27 +0100
Subject: [PATCH] Fortran: Extend cylic type detection for deallocate
 [PR116669]

Using cycles in derived/class types lead to the compiler doing a endless
recursion in several locations, when the cycle was not immediate.
An immediate cyclic dependency is present in, for example T T::comp.
Cylcic dependencies of the form T T2::comp; T2 T::comp2; are now
detected and the recursive bit in the derived type's attr is set.

gcc/fortran/ChangeLog:

	PR fortran/116669

	* class.cc (gfc_find_derived_vtab): Use attr to determine cyclic
	type dependendies.
	* expr.cc (gfc_has_default_initializer): Prevent endless
	recursion by storing already visited derived types.
	* resolve.cc (resolve_cyclic_derived_type): Determine if a type
	is used in its hierarchy in a cyclic way.
	(resolve_fl_derived0): Call resolve_cyclic_derived_type.
	(resolve_fl_derived): Ensure vtab is generated when cyclic
	derived types have allocatable components.
	* trans-array.cc (structure_alloc_comps): Prevent endless loop
	for derived type cycles.
	* trans-expr.cc (gfc_get_ultimate_alloc_ptr_comps_caf_token):
	Off topic, just prevent memory leaks.

gcc/testsuite/ChangeLog:

	* gfortran.dg/class_array_15.f03: Freeing more memory.
	* gfortran.dg/recursive_alloc_comp_6.f90: New test.
---
 gcc/fortran/class.cc  | 19 +-
 gcc/fortran/expr.cc   | 38 
 gcc/fortran/resolve.cc| 58 +--
 gcc/fortran/trans-array.cc| 25 +---
 gcc/fortran/trans-expr.cc | 10 +++-
 gcc/testsuite/gfortran.dg/class_array_15.f03  |  2 +-
 .../gfortran.dg/recursive_alloc_comp_6.f90| 28 +
 7 files changed, 136 insertions(+), 44 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/recursive_alloc_comp_6.f90

diff --git a/gcc/fortran/class.cc b/gcc/fortran/class.cc
index e0dd571cd68..3e0dce1b54d 100644
--- a/gcc/fortran/class.cc
+++ b/gcc/fortran/class.cc
@@ -2507,20 +2507,6 @@ gfc_find_derived_vtab (gfc_symbol *derived)
 	{
 	  gfc_component *c;
 	  gfc_symbol *parent = NULL, *parent_vtab = NULL;
-	  bool rdt = false;
-
-	  /* Is this a derived type with recursive allocatable
-		 components?  */
-	  c = (derived->attr.unlimited_polymorphic
-		   || derived->attr.abstract) ?
-		  NULL : derived->components;
-	  for (; c; c= c->next)
-		if (c->ts.type == BT_DERIVED
-		&& c->ts.u.derived == derived)
-		  {
-		rdt = true;
-		break;
-		  }

 	  gfc_get_symbol (name, ns, &vtype);
 	  if (!gfc_add_flavor (&vtype->attr, FL_DERIVED, NULL,
@@ -2703,9 +2689,8 @@ gfc_find_derived_vtab (gfc_symbol *derived)
 	  c->attr.access = ACCESS_PRIVATE;
 	  c->tb = XCNEW (gfc_typebound_proc);
 	  c->tb->ppc = 1;
-	  if (derived->attr.unlimited_polymorphic
-		  || derived->attr.abstract
-		  || !rdt)
+	  if (derived->attr.unlimited_polymorphic || derived->attr.abstract
+		  || !derived->attr.recursive)
 		c->initializer = gfc_get_null_expr (NULL);
 	  else
 		{
diff --git a/gcc/fort

Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

2025-01-06 Thread Jeff Law




On 1/6/25 7:11 AM, Qing Zhao wrote:


Given it doesn't cause user visible UB, we could insert the trap *before* the 
UB inducing statement.  That would then make the statement unreachable and it'd 
get removed avoiding the false positive diagnostic.


Yes, that’s a good idea.
However, in order to distinguish a user visible UB and a UB in the IL that is 
introduced purely by compiler, we might need some new marking in the IR?
I don't think we've ever really tackled that question; the closest I can 
think of would be things like integer overflow which we try to avoid 
allowing the compiler to introduce.  If we take the integer overflow as 
the model, then that would say we should be tackling this during loop 
unrolling.


jeff



[PATCH] Do not call cp_parser_omp_dispatch directly in cp_parser_pragma

2025-01-06 Thread Paul-Antoine Arras
This is a followup to
ed49709acda OpenMP: C++ front-end support for dispatch + adjust_args.

The call to cp_parser_omp_dispatch only belongs in cp_parser_omp_construct. In
cp_parser_pragma, handle PRAGMA_OMP_DISPATCH by calling cp_parser_omp_construct.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_pragma): Replace call to cp_parser_omp_dispatch
with cp_parser_omp_construct.
---
 gcc/cp/parser.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index f548dc31c2b..7434c1d91ba 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -53060,7 +53060,7 @@ cp_parser_pragma (cp_parser *parser, enum 
pragma_context context, bool *if_p)
   break;
 
 case PRAGMA_OMP_DISPATCH:
-  cp_parser_omp_dispatch (parser, pragma_tok);
+  cp_parser_omp_construct (parser, pragma_tok, if_p);
   return true;
 
 case PRAGMA_IVDEP:
-- 
2.45.2



Re: [PATCH] Only apply adjust_args in OpenMP dispatch if variant substitution occurs

2025-01-06 Thread Paul-Antoine Arras
Apologies, I forgot to add the testcase. Please find attached an updated 
patch.


On 06/01/2025 17:12, Paul-Antoine Arras wrote:

This is a followup to
084ea8ad584 OpenMP: middle-end support for dispatch + adjust_args.

This patch fixes a bug that caused arguments in an OpenMP dispatch call to be
modified even when no variant substitution occurred.

gcc/ChangeLog:

* gimplify.cc (gimplify_call_expr): Create variable
variant_substituted_p to control whether adjust_args applies.
---
  gcc/gimplify.cc | 13 +
  1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index bd324be926a..251d581f44c 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -3857,7 +3857,8 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, bool 
want_value)
enum gimplify_status ret;
int i, nargs;
gcall *call;
-  bool builtin_va_start_p = false, omp_dispatch_p = false;
+  bool builtin_va_start_p = false, omp_dispatch_p = false,
+   variant_substituted_p = false;
location_t loc = EXPR_LOCATION (*expr_p);
  
gcc_assert (TREE_CODE (*expr_p) == CALL_EXPR);

@@ -4035,7 +4036,10 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, 
bool want_value)
  {
tree variant = omp_resolve_declare_variant (fndecl);
if (variant != fndecl)
-   CALL_EXPR_FN (*expr_p) = build1 (ADDR_EXPR, fnptrtype, variant);
+   {
+ CALL_EXPR_FN (*expr_p) = build1 (ADDR_EXPR, fnptrtype, variant);
+ variant_substituted_p = true;
+   }
  }
  
/* There is a sequence point before the call, so any side effects in

@@ -4325,8 +4329,9 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, bool 
want_value)
}
}
  
-			  if ((need_device_ptr && !is_device_ptr)

- || (need_device_addr && !has_device_addr))
+ if (variant_substituted_p
+ && ((need_device_ptr && !is_device_ptr)
+ || (need_device_addr && !has_device_addr)))
{
  if (dispatch_device_num == NULL_TREE)
{



--
PAFrom 4d58719cb35b379950767bec8da0446570bd735e Mon Sep 17 00:00:00 2001
From: Paul-Antoine Arras 
Date: Mon, 6 Jan 2025 17:00:10 +0100
Subject: [PATCH] Only apply adjust_args in OpenMP dispatch if variant
 substitution occurs

This is a followup to
084ea8ad584 OpenMP: middle-end support for dispatch + adjust_args.

This patch fixes a bug that caused arguments in an OpenMP dispatch call to be
modified even when no variant substitution occurred.

gcc/ChangeLog:

	* gimplify.cc (gimplify_call_expr): Create variable
	variant_substituted_p to control whether adjust_args applies.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/adjust-args-4.c: New test.
---
 gcc/gimplify.cc   | 13 ++
 .../c-c++-common/gomp/adjust-args-4.c | 24 +++
 2 files changed, 33 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/adjust-args-4.c

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index bd324be926a..251d581f44c 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -3857,7 +3857,8 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, bool want_value)
   enum gimplify_status ret;
   int i, nargs;
   gcall *call;
-  bool builtin_va_start_p = false, omp_dispatch_p = false;
+  bool builtin_va_start_p = false, omp_dispatch_p = false,
+   variant_substituted_p = false;
   location_t loc = EXPR_LOCATION (*expr_p);
 
   gcc_assert (TREE_CODE (*expr_p) == CALL_EXPR);
@@ -4035,7 +4036,10 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, bool want_value)
 {
   tree variant = omp_resolve_declare_variant (fndecl);
   if (variant != fndecl)
-	CALL_EXPR_FN (*expr_p) = build1 (ADDR_EXPR, fnptrtype, variant);
+	{
+	  CALL_EXPR_FN (*expr_p) = build1 (ADDR_EXPR, fnptrtype, variant);
+	  variant_substituted_p = true;
+	}
 }
 
   /* There is a sequence point before the call, so any side effects in
@@ -4325,8 +4329,9 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, bool want_value)
 }
 			}
 
-			  if ((need_device_ptr && !is_device_ptr)
-			  || (need_device_addr && !has_device_addr))
+			  if (variant_substituted_p
+			  && ((need_device_ptr && !is_device_ptr)
+  || (need_device_addr && !has_device_addr)))
 			{
 			  if (dispatch_device_num == NULL_TREE)
 {
diff --git a/gcc/testsuite/c-c++-common/gomp/adjust-args-4.c b/gcc/testsuite/c-c++-common/gomp/adjust-args-4.c
new file mode 100644
index 000..377932e1b9c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/adjust-args-4.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdump-tree-gimple" } */
+
+/* Ensure that adjust_args is only applied when variant substitution happens. */
+
+void h(int *);
+void f(int *);

Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

2025-01-06 Thread Jeff Law




On 1/6/25 9:01 AM, Richard Biener wrote:


Note unrolling doesn't introduce UB - it makes conditional UB
"obvious". 
That's fair and it's how I often view these kinds of things when they 
pop out via jump threading.




So unless the condition guarding the UB unrolling exposes is visibly
false to the compiler but we fail
to exploit that (missed optimization) there's not much that we can do.
I think "folding" away the UB
like what Jeff proposes trades false negatives for the false positive
diagnostics.
I hadn't looked at it in that way, but I can certainly see that point of 
view.  Of course the counter to that is all kinds of optimizations can 
do the same.





That said - I think for these unroller exposed cases of apparent false
positives we should improve
the path pruning in the unroller itself.  For the other cases the path
diagnostic might help clarify
that the UB happens on the 'n-th' iteration of the loop when some
additional condition is true/false.

Agreed.



So in the end Jeff - I think your patch isn't a good approach for the
issue at hand.

Understood.  It was an idea, I don't mind setting it aside.

Jeff



Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

2025-01-06 Thread Qing Zhao


> On Jan 6, 2025, at 11:01, Richard Biener  wrote:
> 
> On Mon, Jan 6, 2025 at 3:43 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On Jan 6, 2025, at 09:21, Jeff Law  wrote:
>>> 
>>> 
>>> 
>>> On 1/6/25 7:11 AM, Qing Zhao wrote:
> 
> Given it doesn't cause user visible UB, we could insert the trap *before* 
> the UB inducing statement.  That would then make the statement 
> unreachable and it'd get removed avoiding the false positive diagnostic.
 Yes, that’s a good idea.
 However, in order to distinguish a user visible UB and a UB in the IL that 
 is introduced purely by compiler, we might need some new marking in the IR?
>>> I don't think we've ever really tackled that question; the closest I can 
>>> think of would be things like integer overflow which we try to avoid 
>>> allowing the compiler to introduce.  If we take the integer overflow as the 
>>> model, then that would say we should be tackling this during loop unrolling.
>> 
>> UB that is introduced by compiler transformation is one important cause of 
>> false positive warnings.
>> 
>> There are two approaches to tackle this problem from my understanding:
>> 
>> 1. Avoid generating such UB from the beginning. i.e, for every compiler 
>> transformation that might introduce such UB, we should add check to avoid 
>> generating it.
>> 
>> 2. Marking the IR portion that were generated by compiler transformations, 
>> then check whether the UB is compiler generated when issue static checker 
>> warnings.
>> 
>> Are there other approaches?
> 
> Note unrolling doesn't introduce UB - it makes conditional UB
> "obvious”.

So, you mean this is the same issue as PR109071 (and PR85788, PR88771, etc), 
i.e, the compiler optimization make the conditional UB that’s originally in the 
source code “obvious” after code duplication? 

(I need to study the testing case in PR92539 more carefully to make sure this 
is the case...)

If so, then the claimed false positive warning in PR92539 actually is a real 
bug in the original source code,  and my patch that introduced the new option 
“--fdiagnostics-details” should also include loop unrolling to provide more 
details on the warning introduced by loop unrolling. 


>  Note -Warray-bounds wants to
> diagnose UB, so doing path isolation and removing the UB would make
> -Warray-bounds useless.
> 
> So unless the condition guarding the UB unrolling exposes is visibly
> false to the compiler but we fail
> to exploit that (missed optimization) there's not much that we can do.
> I think "folding" away the UB
> like what Jeff proposes trades false negatives for the false positive
> diagnostics.
> 
> Note the unroller knows UB that effectively bounds the number of
> iterations, even on conditional
> paths and it uses this to limit the number of copies _and_ to prune
> unreachable paths (exploiting
> UB, avoiding diagnostics).  But one of the limitations is that it only
> prunes paths in the last unrolled
> copy which can be insufficient (ISTR some PR where I noticed this).
> 
> That said - I think for these unroller exposed cases of apparent false
> positives we should improve
> the path pruning in the unroller itself.  For the other cases the path
> diagnostic might help clarify
> that the UB happens on the 'n-th' iteration of the loop when some
> additional condition is true/false.

So, the “other cases” refer to the situation similar as PR109071, i.e, 
“conditional UB” in the original source code is made obvious after loop 
unrolling? 
Yes, for such cases, the new option I have been trying to add, 
“-fdiagnostic-details” should be able to track and provide more details on the 
conditions that lead to the UB. 
Is this understanding correct?

thanks.

Qing
> 
> So in the end Jeff - I think your patch isn't a good approach for the
> issue at hand.
> 
> Richard.
> 
>> The above is very rough and initial idea at this moment.
>> 
>> Qing
>> 
>>> 
>>> jeff




Re: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2025-01-06 Thread Jennifer Schmitz


> On 19 Dec 2024, at 14:10, Jennifer Schmitz  wrote:
> 
> 
> 
>> On 19 Dec 2024, at 11:14, Richard Sandiford  
>> wrote:
>> 
>> External email: Use caution opening links or attachments
>> 
>> 
>> Jennifer Schmitz  writes:
>>> @@ -8834,22 +8834,7 @@ vectorizable_store (vec_info *vinfo,
>>> {
>>>   if (costing_p)
>>> {
>>> -   /* Only need vector extracting when there are more
>>> -  than one stores.  */
>>> -   if (nstores > 1)
>>> - inside_cost
>>> -   += record_stmt_cost (cost_vec, 1, vec_to_scalar,
>>> -stmt_info, slp_node,
>>> -0, vect_body);
>>> -   /* Take a single lane vector type store as scalar
>>> -  store to avoid ICE like 110776.  */
>>> -   if (VECTOR_TYPE_P (ltype)
>>> -   && known_ne (TYPE_VECTOR_SUBPARTS (ltype), 1U))
>>> - n_adjacent_stores++;
>>> -   else
>>> - inside_cost
>>> -   += record_stmt_cost (cost_vec, 1, scalar_store,
>>> -stmt_info, 0, vect_body);
>>> +   n_adjacent_stores++;
>>>   continue;
>>> }
>>>   tree newref, newoff;
>>> @@ -8905,9 +8890,26 @@ vectorizable_store (vec_info *vinfo,
>>>  if (costing_p)
>>> {
>>>   if (n_adjacent_stores > 0)
>>> - vect_get_store_cost (vinfo, stmt_info, slp_node, 
>>> n_adjacent_stores,
>>> -  alignment_support_scheme, misalignment,
>>> -  &inside_cost, cost_vec);
>>> + {
>>> +   /* Take a single lane vector type store as scalar
>>> +  store to avoid ICE like 110776.  */
>>> +   if (VECTOR_TYPE_P (ltype)
>>> +   && known_ne (TYPE_VECTOR_SUBPARTS (ltype), 1U))
>> 
>> Sorry to ask, since it's pre-existing, but could you change this to
>> maybe_ne while you're there?  nunits==1+1X should be treated as a vector
>> rather than a scalar.
> Sure, I made the change (see patch below) and re-validated on aarch64.
> 
> It would also be good to check for performance regressions, now that we have 
> a patch to test:
> I will run SPEC2017 with -mcpu=generic and -mcpu=native on Grace, but we 
> would appreciate help with benchmarking on other platforms.
> Tamar, would you still be willing to test the patch on other platforms?
> 
> If there are no other changes necessary and assuming there are no performance 
> regressions, I was planning to commit the patch in January after returning 
> from christmas break.
> 
> In the meantime I wish everyone happy holidays.
> Jennifer
On Grace, the patch has no non-noise impact on performance for SPEC2017 with 
-mcpu=generic and -mcpu=native. I also re-validated on aarch64 today, no 
regression.
Do you advise to run additional performance tests or is the patch ready to be 
pushed to trunk?
Thanks, Jennifer
> 
> This patch removes the AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS tunable and
> use_new_vector_costs entry in aarch64-tuning-flags.def and makes the
> AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS paths in the backend the
> default. To that end, the function aarch64_use_new_vector_costs_p and its uses
> were removed. To prevent costing vec_to_scalar operations with 0, as
> described in
> https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665481.html,
> we adjusted vectorizable_store such that the variable n_adjacent_stores
> also covers vec_to_scalar operations. This way vec_to_scalar operations
> are not costed individually, but as a group.
> As suggested by Richard Sandiford, the "known_ne" in the multilane-check
> was replaced by "maybe_ne" in order to treat nunits==1+1X as a vector
> rather than a scalar.
> 
> Two tests were adjusted due to changes in codegen. In both cases, the
> old code performed loop unrolling once, but the new code does not:
> Example from gcc.target/aarch64/sve/strided_load_2.c (compiled with
> -O2 -ftree-vectorize -march=armv8.2-a+sve -mtune=generic 
> -moverride=tune=none):
> f_int64_t_32:
>cbz w3, .L92
>mov x4, 0
>uxtwx3, w3
> +   cntdx5
> +   whilelo p7.d, xzr, x3
> +   mov z29.s, w5
>mov z31.s, w2
> -   whilelo p6.d, xzr, x3
> -   mov x2, x3
> -   index   z30.s, #0, #1
> -   uqdecd  x2
> -   ptrue   p5.b, all
> -   whilelo p7.d, xzr, x2
> +   index   z30.d, #0, #1
> +   ptrue   p6.b, all
>.p2align 3,,7
> .L94:
> -   ld1dz27.d, p7/z, [x0, #1, mul vl]
> -   ld1dz28.d, p6/z, [x0]
> -   movprfx z29, z31
> -   mul z29.s, p5/m, z29.s, z30.s
> -   incwx4
> -   uunpklo z0.d, z29.s
> -   uunpkhi z29.d, z29.s
> -   ld1dz25.d, p6/z, [x1, z0.d, lsl 3]
>

Re: [PATCH 1/2] Alpha: Add memory clobbers to `builtin_longjmp' expansion

2025-01-06 Thread Jeff Law




On 1/5/25 9:40 AM, Maciej W. Rozycki wrote:

Add the same memory clobbers to `builtin_longjmp' for Alpha as with
commit 41439bf6a647 ("builtins.c (expand_builtin_longjmp): Added two
memory clobbers."), to prevent instructions that access memory via the
frame or stack pointer from being moved across the write to the frame
pointer.

gcc/
* config/alpha/alpha.md (builtin_longjmp): Add memory clobbers.

OK
jeff



Re: [PATCH 2/2] Alpha: Restore frame pointer last in `builtin_longjmp' [PR64242]

2025-01-06 Thread Jeff Law




On 1/5/25 9:40 AM, Maciej W. Rozycki wrote:

Add similar arrangements to `builtin_longjmp' for Alpha as with commit
71b144289c1c ("re PR middle-end/64242 (Longjmp expansion incorrect)")
and commit 511ed59d0b04 ("Fix PR64242 - Longjmp expansion incorrect"),
so as to restore the frame pointer last, so that accesses to a local
buffer supplied can still be fulfilled with memory accesses via the
original frame pointer, fixing:

FAIL: gcc.c-torture/execute/pr64242.c   -O0  execution test
FAIL: gcc.c-torture/execute/pr64242.c   -O1  execution test
FAIL: gcc.c-torture/execute/pr64242.c   -O2  execution test
FAIL: gcc.c-torture/execute/pr64242.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/pr64242.c   -Os  execution test
FAIL: gcc.c-torture/execute/pr64242.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  execution test
FAIL: gcc.c-torture/execute/pr64242.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  execution test

and adding no regressions in `alpha-linux-gnu' testing.

gcc/
PR middle-end/64242
* gcc/config/alpha/alpha.md (`builtin_longjmp'): Restore frame
pointer last.  Add frame clobber and schedule blockage.

OK.

jeff



Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

2025-01-06 Thread Qing Zhao


> On Jan 6, 2025, at 09:21, Jeff Law  wrote:
> 
> 
> 
> On 1/6/25 7:11 AM, Qing Zhao wrote:
>>> 
>>> Given it doesn't cause user visible UB, we could insert the trap *before* 
>>> the UB inducing statement.  That would then make the statement unreachable 
>>> and it'd get removed avoiding the false positive diagnostic.
>> Yes, that’s a good idea.
>> However, in order to distinguish a user visible UB and a UB in the IL that 
>> is introduced purely by compiler, we might need some new marking in the IR?
> I don't think we've ever really tackled that question; the closest I can 
> think of would be things like integer overflow which we try to avoid allowing 
> the compiler to introduce.  If we take the integer overflow as the model, 
> then that would say we should be tackling this during loop unrolling.

UB that is introduced by compiler transformation is one important cause of 
false positive warnings. 

There are two approaches to tackle this problem from my understanding:

1. Avoid generating such UB from the beginning. i.e, for every compiler 
transformation that might introduce such UB, we should add check to avoid 
generating it.

2. Marking the IR portion that were generated by compiler transformations, then 
check whether the UB is compiler generated when issue static checker warnings. 

Are there other approaches?

The above is very rough and initial idea at this moment.

Qing

> 
> jeff
> 



Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

2025-01-06 Thread Richard Biener
On Mon, Jan 6, 2025 at 3:43 PM Qing Zhao  wrote:
>
>
>
> > On Jan 6, 2025, at 09:21, Jeff Law  wrote:
> >
> >
> >
> > On 1/6/25 7:11 AM, Qing Zhao wrote:
> >>>
> >>> Given it doesn't cause user visible UB, we could insert the trap *before* 
> >>> the UB inducing statement.  That would then make the statement 
> >>> unreachable and it'd get removed avoiding the false positive diagnostic.
> >> Yes, that’s a good idea.
> >> However, in order to distinguish a user visible UB and a UB in the IL that 
> >> is introduced purely by compiler, we might need some new marking in the IR?
> > I don't think we've ever really tackled that question; the closest I can 
> > think of would be things like integer overflow which we try to avoid 
> > allowing the compiler to introduce.  If we take the integer overflow as the 
> > model, then that would say we should be tackling this during loop unrolling.
>
> UB that is introduced by compiler transformation is one important cause of 
> false positive warnings.
>
> There are two approaches to tackle this problem from my understanding:
>
> 1. Avoid generating such UB from the beginning. i.e, for every compiler 
> transformation that might introduce such UB, we should add check to avoid 
> generating it.
>
> 2. Marking the IR portion that were generated by compiler transformations, 
> then check whether the UB is compiler generated when issue static checker 
> warnings.
>
> Are there other approaches?

Note unrolling doesn't introduce UB - it makes conditional UB
"obvious".  Note -Warray-bounds wants to
diagnose UB, so doing path isolation and removing the UB would make
-Warray-bounds useless.

So unless the condition guarding the UB unrolling exposes is visibly
false to the compiler but we fail
to exploit that (missed optimization) there's not much that we can do.
I think "folding" away the UB
like what Jeff proposes trades false negatives for the false positive
diagnostics.

Note the unroller knows UB that effectively bounds the number of
iterations, even on conditional
paths and it uses this to limit the number of copies _and_ to prune
unreachable paths (exploiting
UB, avoiding diagnostics).  But one of the limitations is that it only
prunes paths in the last unrolled
copy which can be insufficient (ISTR some PR where I noticed this).

That said - I think for these unroller exposed cases of apparent false
positives we should improve
the path pruning in the unroller itself.  For the other cases the path
diagnostic might help clarify
that the UB happens on the 'n-th' iteration of the loop when some
additional condition is true/false.

So in the end Jeff - I think your patch isn't a good approach for the
issue at hand.

Richard.

> The above is very rough and initial idea at this moment.
>
> Qing
>
> >
> > jeff
> >
>


[ping][PATCH] testsuite/118127: Pass fortran tests on ppc64le for IEEE128 long doubles

2025-01-06 Thread Siddhesh Poyarekar

Ping!

On 2024-12-19 08:16, Siddhesh Poyarekar wrote:

Denormal behaviour is well defined for IEEE128 long doubles, so don't
XFAIL some gfortran tests on ppc64le when configured with the IEEE128
long double ABI.

gcc/testsuite/ChangeLog:

PR testsuite/118127
* gfortran.dg/default_format_2.f90: Don't xfail for
ppc_ieee128_ok.
* gfortran.dg/default_format_denormal_2.f90: Likewise.
* gfortran.dg/large_real_kind_form_io_2.f90: Likewise.

Signed-off-by: Siddhesh Poyarekar 
---
  gcc/testsuite/gfortran.dg/default_format_2.f90  | 2 +-
  gcc/testsuite/gfortran.dg/default_format_denormal_2.f90 | 2 +-
  gcc/testsuite/gfortran.dg/large_real_kind_form_io_2.f90 | 2 +-
  3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/default_format_2.f90 
b/gcc/testsuite/gfortran.dg/default_format_2.f90
index 5ad7b3a6429..e76dce33b35 100644
--- a/gcc/testsuite/gfortran.dg/default_format_2.f90
+++ b/gcc/testsuite/gfortran.dg/default_format_2.f90
@@ -1,4 +1,4 @@
-! { dg-do run { xfail powerpc*-apple-darwin* powerpc*-*-linux* } }
+! { dg-do run { xfail { { powerpc*-apple-darwin* || powerpc*-*-linux* } && { ! 
ppc_ieee128_ok } } } }
  ! { dg-require-effective-target fortran_large_real }
  ! Test XFAILed on these platforms because the system's printf() lacks
  ! proper support for denormalized long doubles. See PR24685
diff --git a/gcc/testsuite/gfortran.dg/default_format_denormal_2.f90 
b/gcc/testsuite/gfortran.dg/default_format_denormal_2.f90
index e9ccf5e8f61..1872396764d 100644
--- a/gcc/testsuite/gfortran.dg/default_format_denormal_2.f90
+++ b/gcc/testsuite/gfortran.dg/default_format_denormal_2.f90
@@ -1,4 +1,4 @@
-! { dg-do run { xfail powerpc*-*-* } }
+! { dg-do run { xfail { powerpc*-*-* && { ! ppc_ieee128_ok } } } }
  ! { dg-require-effective-target fortran_large_real }
  ! Test XFAILed on this platform because the system's printf() lacks
  ! proper support for denormalized long doubles. See PR24685
diff --git a/gcc/testsuite/gfortran.dg/large_real_kind_form_io_2.f90 
b/gcc/testsuite/gfortran.dg/large_real_kind_form_io_2.f90
index 34b8aec462c..3a4a9e1b078 100644
--- a/gcc/testsuite/gfortran.dg/large_real_kind_form_io_2.f90
+++ b/gcc/testsuite/gfortran.dg/large_real_kind_form_io_2.f90
@@ -1,4 +1,4 @@
-! { dg-do run { xfail powerpc*-apple-darwin* powerpc*-*-linux* } }
+! { dg-do run { xfail { { powerpc*-apple-darwin* || powerpc*-*-linux* } && { ! 
ppc_ieee128_ok } } } }
  ! Test XFAILed on these platforms because the system's printf() lacks
  ! proper support for denormalized long doubles. See PR24685
  ! { dg-require-effective-target fortran_large_real }




[PATCH] Only apply adjust_args in OpenMP dispatch if variant substitution occurs

2025-01-06 Thread Paul-Antoine Arras
This is a followup to
084ea8ad584 OpenMP: middle-end support for dispatch + adjust_args.

This patch fixes a bug that caused arguments in an OpenMP dispatch call to be
modified even when no variant substitution occurred.

gcc/ChangeLog:

* gimplify.cc (gimplify_call_expr): Create variable
variant_substituted_p to control whether adjust_args applies.
---
 gcc/gimplify.cc | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index bd324be926a..251d581f44c 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -3857,7 +3857,8 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, bool 
want_value)
   enum gimplify_status ret;
   int i, nargs;
   gcall *call;
-  bool builtin_va_start_p = false, omp_dispatch_p = false;
+  bool builtin_va_start_p = false, omp_dispatch_p = false,
+   variant_substituted_p = false;
   location_t loc = EXPR_LOCATION (*expr_p);
 
   gcc_assert (TREE_CODE (*expr_p) == CALL_EXPR);
@@ -4035,7 +4036,10 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, 
bool want_value)
 {
   tree variant = omp_resolve_declare_variant (fndecl);
   if (variant != fndecl)
-   CALL_EXPR_FN (*expr_p) = build1 (ADDR_EXPR, fnptrtype, variant);
+   {
+ CALL_EXPR_FN (*expr_p) = build1 (ADDR_EXPR, fnptrtype, variant);
+ variant_substituted_p = true;
+   }
 }
 
   /* There is a sequence point before the call, so any side effects in
@@ -4325,8 +4329,9 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, bool 
want_value)
}
}
 
- if ((need_device_ptr && !is_device_ptr)
- || (need_device_addr && !has_device_addr))
+ if (variant_substituted_p
+ && ((need_device_ptr && !is_device_ptr)
+ || (need_device_addr && !has_device_addr)))
{
  if (dispatch_device_num == NULL_TREE)
{
-- 
2.45.2



Re: [PATCH v2 5/7] IRA+LRA: Let the backend request to split basic blocks

2025-01-06 Thread Richard Sandiford
"Maciej W. Rozycki"  writes:
> The next change for Alpha will produce extra labels and branches in 
> reload, which in turn requires basic blocks to be split at completion. 
> We do this already for functions that can trap, so just extend the 
> arrangement with a flag for the backend to use whenever it finds it 
> necessary.
>
>   gcc/
>   * function.h (struct function): Add 
>   `split_basic_blocks_after_reload' member.
>   * lra.cc (lra): Handle it.
>   * reload1.cc (reload): Likewise.
> ---
> This was approved by Richard Sandiford in v1, so I'll commit it along with 
> 6/7, which relies on it.

Still stands, but I didn't think of this before, so thought I'd ask:

Would it also work to record max_label_num () before RA and then
compare that recorded value with max_label_num () after RA?
Or would that have too many false positives?

Thanks,
Richard

> No change from v1.
> ---
>  gcc/function.h |3 +++
>  gcc/lra.cc |6 --
>  gcc/reload1.cc |6 --
>  3 files changed, 11 insertions(+), 4 deletions(-)
>
> gcc-split-basic-blocks-after-reload.diff
>
> Index: gcc/gcc/function.h
> ===
> --- gcc.orig/gcc/function.h
> +++ gcc/gcc/function.h
> @@ -449,6 +449,9 @@ struct GTY(()) function {
>/* Set for artificial function created for [[assume (cond)]].
>   These should be GIMPLE optimized, but not expanded to RTL.  */
>unsigned int assume_function : 1;
> +
> +  /* Nonzero if reload will have to split basic blocks.  */
> +  unsigned int split_basic_blocks_after_reload : 1;
>  };
>  
>  /* Add the decl D to the local_decls list of FUN.  */
> Index: gcc/gcc/lra.cc
> ===
> --- gcc.orig/gcc/lra.cc
> +++ gcc/gcc/lra.cc
> @@ -2594,8 +2594,10 @@ lra (FILE *f, int verbose)
>  
>inserted_p = fixup_abnormal_edges ();
>  
> -  /* We've possibly turned single trapping insn into multiple ones.  */
> -  if (cfun->can_throw_non_call_exceptions)
> +  /* Split basic blocks if we've possibly turned single trapping insn
> + into multiple ones or otherwise the backend requested to do so.  */
> +  if (cfun->can_throw_non_call_exceptions
> +  || cfun->split_basic_blocks_after_reload)
>  {
>auto_sbitmap blocks (last_basic_block_for_fn (cfun));
>bitmap_ones (blocks);
> Index: gcc/gcc/reload1.cc
> ===
> --- gcc.orig/gcc/reload1.cc
> +++ gcc/gcc/reload1.cc
> @@ -1272,8 +1272,10 @@ reload (rtx_insn *first, int global)
>  
>inserted = fixup_abnormal_edges ();
>  
> -  /* We've possibly turned single trapping insn into multiple ones.  */
> -  if (cfun->can_throw_non_call_exceptions)
> +  /* Split basic blocks if we've possibly turned single trapping insn
> + into multiple ones or otherwise the backend requested to do so.  */
> +  if (cfun->can_throw_non_call_exceptions
> +  || cfun->split_basic_blocks_after_reload)
>  {
>auto_sbitmap blocks (last_basic_block_for_fn (cfun));
>bitmap_ones (blocks);


[PATCH] Accept commas between clauses in OpenMP declare variant

2025-01-06 Thread Paul-Antoine Arras
Add support to the Fortran parser for the new OpenMP syntax that allows a
comma after the directive name and between clauses of declare variant.
The C and C++ parsers already support this syntax so only a new test is added.

gcc/fortran/ChangeLog:

* openmp.cc (gfc_match_omp_declare_variant): Match comma after directive
name and between clauses.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/declare-variant-2.f90: Remove error test for a comma
after the directive name.
* c-c++-common/gomp/adjust-args-5.c: New test.
* gfortran.dg/gomp/adjust-args-11.f90: New test.
---
 gcc/fortran/openmp.cc |  4 ++
 .../c-c++-common/gomp/adjust-args-5.c | 21 +
 .../gfortran.dg/gomp/adjust-args-11.f90   | 45 +++
 .../gfortran.dg/gomp/declare-variant-2.f90|  3 --
 4 files changed, 70 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/adjust-args-5.c
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/adjust-args-11.f90

diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 79c0f1b2e62..9d28dc9 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -6595,6 +6595,10 @@ gfc_match_omp_declare_variant (void)
 
   for (;;)
 {
+  gfc_gobble_whitespace ();
+  gfc_match_char (',');
+  gfc_gobble_whitespace ();
+
   enum clause
   {
match,
diff --git a/gcc/testsuite/c-c++-common/gomp/adjust-args-5.c 
b/gcc/testsuite/c-c++-common/gomp/adjust-args-5.c
new file mode 100644
index 000..863b77458e4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/adjust-args-5.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+
+/* Check that the OpenMP 6 syntax with commas after the directive name and
+   between clauses is supported. */
+
+int f (int a, void *b, float c[2]);
+
+#pragma omp declare variant (f), match (construct={dispatch}), adjust_args 
(nothing: a), adjust_args (need_device_ptr: b, c)
+int f0 (int a, void *b, float c[2]);
+
+int test () {
+  int a;
+  void *b;
+  float c[2];
+  struct {int a;} s;
+
+  #pragma omp dispatch, novariants(0), nocontext(1)
+  s.a = f0 (a, b, c);
+
+  return s.a;
+}
diff --git a/gcc/testsuite/gfortran.dg/gomp/adjust-args-11.f90 
b/gcc/testsuite/gfortran.dg/gomp/adjust-args-11.f90
new file mode 100644
index 000..3b26f1b0868
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/adjust-args-11.f90
@@ -0,0 +1,45 @@
+! { dg-do compile }
+
+! Check that the OpenMP 5.1 syntax with commas between clauses is supported.
+! A comma after the directive name is introduced in 5.2, which is not 
supported 
+! yet.
+
+module main
+  use iso_c_binding, only: c_ptr
+  implicit none
+
+  type :: struct
+integer :: a
+real :: b
+  end type
+
+  interface
+integer function f(a, b, c)
+  import c_ptr
+  integer, intent(in) :: a
+  type(c_ptr), intent(inout) :: b
+  type(c_ptr), intent(out) :: c(:)
+end function
+integer function f0(a, b, c)
+  import c_ptr
+  integer, intent(in) :: a
+  type(c_ptr), intent(inout) :: b
+  type(c_ptr), intent(out) :: c(:)
+  !$omp  declare variant (f), match (construct={dispatch}), &
+  !$omp& adjust_args (nothing: a), adjust_args (need_device_ptr: 
b, c)
+end function
+  end interface
+
+contains
+subroutine test
+  integer :: a
+  type(c_ptr) :: b
+  type(c_ptr) :: c(2)
+  type(struct) :: s
+
+  !!$omp dispatch, nocontext(.false.), novariants(.false.)   ! Not supported 
yet
+  !$omp dispatch nocontext(.false.), novariants(.false.)
+  s%a = f0 (a, b, c)
+
+end subroutine
+end module
diff --git a/gcc/testsuite/gfortran.dg/gomp/declare-variant-2.f90 
b/gcc/testsuite/gfortran.dg/gomp/declare-variant-2.f90
index 62d2cb96fac..beea713efba 100644
--- a/gcc/testsuite/gfortran.dg/gomp/declare-variant-2.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/declare-variant-2.f90
@@ -182,9 +182,6 @@ contains
   subroutine f74 ()
 !$omp declare variant (f1) match(construct={requires}) ! { dg-warning 
"unknown selector 'requires' for context selector set 'construct' at .1." }
   end subroutine
-  subroutine f75 ()
-!$omp declare variant (f1),match(construct={parallel}) ! { dg-error 
"expected 'match' or 'adjust_args' at .1." }
-  end subroutine
   subroutine f76 ()
 !$omp declare variant (f1) 
match(implementation={atomic_default_mem_order("relaxed")}) ! { dg-error 
"expected identifier at .1." }
   end subroutine
-- 
2.45.2



[PATCH] i386: Add br_mispredict_scale in cost table.

2025-01-06 Thread Hongyu Wang
Hi,

For later processors, the pipeline went deeper so the penalty for
untaken branch can be larger than before. Add a new parameter
br_mispredict_scale to describe the penalty, and adopt to
noce_max_ifcvt_seq_cost hook to allow longer sequence to be
converted with cmove.

This improves cpu2017 544 with -Ofast -march=native for 14% on P-core
SPR, and 8% on E-core SRF, no other regression observed.

Bootstrapped & regtested on x86-64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:

* config/i386/i386.cc (ix86_noce_max_ifcvt_seq_cost): Adjust
cost with ix86_tune_cost->br_mispredict_scale.
* config/i386/i386.h (processor_costs): Add br_mispredict_scale.
* config/i386/x86-tune-costs.h: Add new br_mispredict_scale to
all processor_costs, in which icelake_cost/alderlake_cost
with value COSTS_N_INSNS (2) + 3 and other processor with value
COSTS_N_INSNS (2).

gcc/testsuite/ChangeLog:

* gcc.target/i386/cmov12.c: New test.
---
 gcc/config/i386/i386.cc|  8 ++-
 gcc/config/i386/i386.h |  2 ++
 gcc/config/i386/x86-tune-costs.h   | 33 ++
 gcc/testsuite/gcc.target/i386/cmov12.c | 21 
 4 files changed, 63 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/cmov12.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 655335e2f47..11770aa8a50 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -25088,7 +25088,13 @@ ix86_max_noce_ifcvt_seq_cost (edge e)
return param_max_rtl_if_conversion_unpredictable_cost;
 }
 
-  return BRANCH_COST (true, predictable_p) * COSTS_N_INSNS (2);
+  /* For modern machines with deeper pipeline, the penalty for branch
+ misprediction could be higher than before to reset the pipeline
+ slots. Add parameter br_mispredict_scale as a factor to describe
+ the impact of reseting the pipeline.  */
+
+  return BRANCH_COST (true, predictable_p)
+* ix86_tune_cost->br_mispredict_scale;
 }
 
 /* Return true if SEQ is a good candidate as a replacement for the
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 00626d539a9..e8e528c7811 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -232,6 +232,8 @@ struct processor_costs {
   to be unrolled.  */
   const unsigned small_unroll_factor;   /* Unroll factor for small loop to
   be unrolled.  */
+  const int br_mispredict_scale;   /* Branch mispredict scale for ifcvt
+  threshold.  */
 };
 
 extern const struct processor_costs *ix86_cost;
diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index 56a09f12b94..a4a128cd5dd 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -137,6 +137,7 @@ struct processor_costs ix86_size_cost = {/* costs for 
tuning for size */
   NULL,/* Func alignment.  */
   4,   /* Small unroll limit.  */
   2,   /* Small unroll factor.  */
+  COSTS_N_INSNS (2),   /* Branch mispredict scale.  */
 };
 
 /* Processor costs (relative to an add) */
@@ -248,6 +249,7 @@ struct processor_costs i386_cost = {/* 386 specific 
costs */
   "4", /* Func alignment.  */
   4,   /* Small unroll limit.  */
   2,   /* Small unroll factor.  */
+  COSTS_N_INSNS (2),   /* Branch mispredict scale.  */
 };
 
 static stringop_algs i486_memcpy[2] = {
@@ -360,6 +362,7 @@ struct processor_costs i486_cost = {/* 486 specific 
costs */
   "16",/* Func alignment.  */
   4,   /* Small unroll limit.  */
   2,   /* Small unroll factor.  */
+  COSTS_N_INSNS (2),   /* Branch mispredict scale.  */
 };
 
 static stringop_algs pentium_memcpy[2] = {
@@ -470,6 +473,7 @@ struct processor_costs pentium_cost = {
   "16",/* Func alignment.  */
   4,   /* Small unroll limit.  */
   2,   /* Small unroll factor.  */
+  COSTS_N_INSNS (2),   /* Branch mispredict scale.  */
 };
 
 static const
@@ -573,6 +577,7 @@ struct processor_costs lakemont_cost = {
   "16",/* Func alignment.  */
   4,   /* Small unroll limit.  */
   2,   /* Small unroll factor.  */
+  COSTS_N_INSNS (2),   /* Branch mispredict scale.  */
 };
 
 /* PentiumPro has optimized rep instructions for blocks aligned by 8 bytes
@@ -691,6 +696,7 @@ struct processor_costs pentiumpro_cost =

Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]

2025-01-06 Thread Linus Torvalds
On Mon, 6 Jan 2025 at 16:59, Linus Torvalds
 wrote:
>
> There is absolutely no gray area here. It was always buggy, and the
> alpha architecture was always completely and fundamentally
> mis-designed.

Note that I really do want to re-emphasize that while I think it's
kind of interesting that Maciej is trying to make gcc DTRT on alpha,
the non-BWX machines really are a completely broken architecture and
almost entirely unfixable.

Yeah, yeah, Maciej also has patches to avoid all the ldq_u/stq_u
sequences for regular byte accesses into actually using ldl_l / stl_c
sequences, but those instructions take hundreds of cycles and go out
on the bus outside the CPU.

So using actual thread-safe byte ops with ldl_l / stl_c turns those
non-BWX alpha CPU's into something very sad and pointless. You might
as well go full retro and use a 6502 or something.

And even the newer alphas that *had* BWX were designed to still do
byte operations with quadword ops and masking. Yes, byte ops existed,
but they were still very much designed as a "only when you really
can't use the masked word model".

For example, the standard "memset()" library routine for EV6 literally
does exactly the thing that I said would be a compiler bug to do.

Because that's literally the core design of the architecture: it's buggy.

So while I applaud Maciej's efforts, I'm not convinced they are all
that productive. Even with a fixed compiler, it's all broken anyway.

Of course, most of the time, you won't ever see the breakage. It's
there, but hitting it in practice is almost impossible.

The Linux kernel uses the known-broken memcpy and memset library code.
All user space does the same.

They are hopelessly and fundamentally broken, but they work in
practice as long as you don't have concurrent accesses "near-by" in
space-time.

 Linus


Re: [PATCH] [sanitizer] Replace uptr by usize/SIZE_T in interfaces

2025-01-06 Thread Stefan Schulze Frielinghaus
On Mon, Jan 06, 2025 at 10:49:31AM +0100, Jakub Jelinek wrote:
> On Mon, Jan 06, 2025 at 09:55:16AM +0100, Stefan Schulze Frielinghaus wrote:
> > For some targets uptr is mapped to unsigned int and size_t to unsigned
> > long and sizeof(int)==sizeof(long) holds.  Still, these are distinct
> > types and type checking may fail.  Therefore, replace uptr by
> > usize/SIZE_T wherever a size_t is expected.
> > 
> > Part of #116957
> > 
> > Cherry picked from LLVM commit 9a156f6b2b0c892d8713ba907f07f027b24953d8
> > (removed memprof, msan, nsan, and tsan parts).
> > 
> > PR sanitizer/117725
> > libsanitizer/ChangeLog:
> > 
> > * asan/asan_interceptors.cpp: Cherry picked LLVM commit
> > 9a156f6b2b0c892d8713ba907f07f027b24953d8.
> > * asan/asan_interceptors.h: Ditto.
> > * asan/asan_interceptors_memintrinsics.h: Ditto.
> > * asan/sanitizer_common_interceptors.inc: Ditto.
> > * sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc:
> > Ditto.
> > * sanitizer_common/sanitizer_platform_limits_posix.h:
> > Ditto.
> 
> Why have you removed the tsan parts?
> compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
> is in
> libsanitizer/tsan/tsan_interceptors_posix.cpp

Thanks for spotting this!  I've added this back and will send a v2 patch
promptly.  I've also moved/indented the line
PR sanitizer/117725
in all patches accordingly.

Thanks,
Stefan


[PATCH v2] Replace uptr by usize/SIZE_T in interfaces

2025-01-06 Thread Stefan Schulze Frielinghaus
For some targets uptr is mapped to unsigned int and size_t to unsigned
long and sizeof(int)==sizeof(long) holds.  Still, these are distinct
types and type checking may fail.  Therefore, replace uptr by
usize/SIZE_T wherever a size_t is expected.

Part of #116957

Cherry picked from LLVM commit 9a156f6b2b0c892d8713ba907f07f027b24953d8
(removed memprof, msan, and nsan parts).

libsanitizer/ChangeLog:

PR sanitizer/117725
* asan/asan_interceptors.cpp: Cherry picked LLVM commit
9a156f6b2b0c892d8713ba907f07f027b24953d8.
* asan/asan_interceptors.h: Ditto.
* asan/asan_interceptors_memintrinsics.h: Ditto.
* asan/sanitizer_common_interceptors.inc: Ditto.
* sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc:
Ditto.
* sanitizer_common/sanitizer_platform_limits_posix.h:
Ditto.
* tsan/tsan_interceptors_posix.cpp: Ditto.
---
 Added tsan parts.

 libsanitizer/asan/asan_interceptors.cpp   |  6 ++--
 libsanitizer/asan/asan_interceptors.h |  6 ++--
 .../asan/asan_interceptors_memintrinsics.h|  4 +--
 .../sanitizer_common_interceptors.inc | 26 +++---
 ...izer_common_interceptors_memintrinsics.inc | 34 +--
 .../sanitizer_platform_limits_posix.h |  2 +-
 libsanitizer/tsan/tsan_interceptors_posix.cpp |  4 +--
 7 files changed, 41 insertions(+), 41 deletions(-)

diff --git a/libsanitizer/asan/asan_interceptors.cpp 
b/libsanitizer/asan/asan_interceptors.cpp
index c13bcf2382b..48f7636fbcb 100644
--- a/libsanitizer/asan/asan_interceptors.cpp
+++ b/libsanitizer/asan/asan_interceptors.cpp
@@ -85,7 +85,7 @@ int OnExit() {
 // -- Wrappers  {{{1
 using namespace __asan;
 
-DECLARE_REAL_AND_INTERCEPTOR(void *, malloc, uptr)
+DECLARE_REAL_AND_INTERCEPTOR(void *, malloc, usize)
 DECLARE_REAL_AND_INTERCEPTOR(void, free, void *)
 
 #define COMMON_INTERCEPT_FUNCTION_VER(name, ver) \
@@ -529,7 +529,7 @@ DEFINE_REAL(char*, index, const char *string, int c)
 return REAL(strcat)(to, from);
   }
 
-INTERCEPTOR(char*, strncat, char *to, const char *from, uptr size) {
+INTERCEPTOR(char*, strncat, char *to, const char *from, usize size) {
   void *ctx;
   ASAN_INTERCEPTOR_ENTER(ctx, strncat);
   AsanInitFromRtl();
@@ -617,7 +617,7 @@ INTERCEPTOR(char*, __strdup, const char *s) {
 }
 #endif // ASAN_INTERCEPT___STRDUP
 
-INTERCEPTOR(char*, strncpy, char *to, const char *from, uptr size) {
+INTERCEPTOR(char*, strncpy, char *to, const char *from, usize size) {
   void *ctx;
   ASAN_INTERCEPTOR_ENTER(ctx, strncpy);
   AsanInitFromRtl();
diff --git a/libsanitizer/asan/asan_interceptors.h 
b/libsanitizer/asan/asan_interceptors.h
index 652379d39a3..85cde07ca7e 100644
--- a/libsanitizer/asan/asan_interceptors.h
+++ b/libsanitizer/asan/asan_interceptors.h
@@ -129,11 +129,11 @@ void InitializePlatformInterceptors();
 # define ASAN_INTERCEPT_PTHREAD_ATFORK 0
 #endif
 
-DECLARE_REAL(int, memcmp, const void *a1, const void *a2, uptr size)
+DECLARE_REAL(int, memcmp, const void *a1, const void *a2, SIZE_T size)
 DECLARE_REAL(char*, strchr, const char *str, int c)
 DECLARE_REAL(SIZE_T, strlen, const char *s)
-DECLARE_REAL(char*, strncpy, char *to, const char *from, uptr size)
-DECLARE_REAL(uptr, strnlen, const char *s, uptr maxlen)
+DECLARE_REAL(char*, strncpy, char *to, const char *from, SIZE_T size)
+DECLARE_REAL(SIZE_T, strnlen, const char *s, SIZE_T maxlen)
 DECLARE_REAL(char*, strstr, const char *s1, const char *s2)
 
 #  if !SANITIZER_APPLE
diff --git a/libsanitizer/asan/asan_interceptors_memintrinsics.h 
b/libsanitizer/asan/asan_interceptors_memintrinsics.h
index eb44f8f2f72..14727a5d665 100644
--- a/libsanitizer/asan/asan_interceptors_memintrinsics.h
+++ b/libsanitizer/asan/asan_interceptors_memintrinsics.h
@@ -18,8 +18,8 @@
 #include "asan_mapping.h"
 #include "interception/interception.h"
 
-DECLARE_REAL(void *, memcpy, void *to, const void *from, uptr size)
-DECLARE_REAL(void *, memset, void *block, int c, uptr size)
+DECLARE_REAL(void *, memcpy, void *to, const void *from, SIZE_T size)
+DECLARE_REAL(void *, memset, void *block, int c, SIZE_T size)
 
 namespace __asan {
 
diff --git a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc 
b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
index b8627f8557a..542176e6e5a 100644
--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
@@ -445,7 +445,7 @@ INTERCEPTOR(SIZE_T, strnlen, const char *s, SIZE_T maxlen) {
 #endif
 
 #if SANITIZER_INTERCEPT_STRNDUP
-INTERCEPTOR(char*, strndup, const char *s, uptr size) {
+INTERCEPTOR(char*, strndup, const char *s, usize size) {
   void *ctx;
   COMMON_INTERCEPTOR_STRNDUP_IMPL(ctx, s, size);
 }
@@ -455,7 +455,7 @@ INTERCEPTOR(char*, strndup, const char *s, uptr size) {
 #endif // SANITIZER_INTERCEPT_STRNDUP
 
 #if SANITIZER_INTERCEPT___STRNDUP
-INTERCEPTOR(char*, __str

[PATCH] [sanitizer] Fix few size types in memprof (#119114)

2025-01-06 Thread Stefan Schulze Frielinghaus
From: Vitaly Buka 

Fix type in a few related Min() calls.

Follow up to #116957.

Co-authored-by: Stefan Schulze Frielinghaus 

Cherry picked from LLVM commit 6dec33834d1fd89f16e271dde9607c1de9554144
(removed memprof part).

PR sanitizer/117725
libsanitizer/ChangeLog:

* asan/asan_interceptors.cpp: Cherry picked from LLVM commit
6dec33834d1fd89f16e271dde9607c1de9554144.
* sanitizer_common/sanitizer_common_interceptors.inc: Ditto.
---
 libsanitizer/asan/asan_interceptors.cpp| 4 ++--
 .../sanitizer_common/sanitizer_common_interceptors.inc| 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/libsanitizer/asan/asan_interceptors.cpp 
b/libsanitizer/asan/asan_interceptors.cpp
index be214f3ebeb6..e6c95dfb82c3 100644
--- a/libsanitizer/asan/asan_interceptors.cpp
+++ b/libsanitizer/asan/asan_interceptors.cpp
@@ -535,7 +535,7 @@ INTERCEPTOR(char*, strncat, char *to, const char *from, 
usize size) {
   AsanInitFromRtl();
   if (flags()->replace_str) {
 uptr from_length = MaybeRealStrnlen(from, size);
-uptr copy_length = Min(size, from_length + 1);
+uptr copy_length = Min(size, from_length + 1);
 ASAN_READ_RANGE(ctx, from, copy_length);
 uptr to_length = internal_strlen(to);
 ASAN_READ_STRING_OF_LEN(ctx, to, to_length, to_length);
@@ -622,7 +622,7 @@ INTERCEPTOR(char*, strncpy, char *to, const char *from, 
usize size) {
   ASAN_INTERCEPTOR_ENTER(ctx, strncpy);
   AsanInitFromRtl();
   if (flags()->replace_str) {
-uptr from_size = Min(size, MaybeRealStrnlen(from, size) + 1);
+uptr from_size = Min(size, MaybeRealStrnlen(from, size) + 1);
 CHECK_RANGES_OVERLAP("strncpy", to, from_size, from, from_size);
 ASAN_READ_RANGE(ctx, from, from_size);
 ASAN_WRITE_RANGE(ctx, to, size);
diff --git a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc 
b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
index f6b695defce4..99ad3b244d4e 100644
--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
@@ -347,7 +347,7 @@ extern const short *_tolower_tab_;
   uptr copy_length = internal_strnlen(s, size);   \
   char *new_mem = (char *)WRAP(malloc)(copy_length + 1);  \
   if (common_flags()->intercept_strndup) {\
-COMMON_INTERCEPTOR_READ_STRING(ctx, s, Min(size, copy_length + 1));   \
+COMMON_INTERCEPTOR_READ_STRING(ctx, s, Min(size, copy_length + 1)); \
   }   \
   if (new_mem) {  \
 COMMON_INTERCEPTOR_COPY_STRING(ctx, new_mem, s, copy_length); \
-- 
2.47.0



[PATCH 0/4] libsanitizer: Fix PR117725

2025-01-06 Thread Stefan Schulze Frielinghaus
These four cherry picks resolve PR117725, i.e., bootstrap is restored on
s390 -m31.  Ok for mainline?

-- 
2.47.0



[PATCH] [sanitizer] Fix type in some Min() calls (#119248)

2025-01-06 Thread Stefan Schulze Frielinghaus
This is a follow-up to 6dec33834d1fd89f16e271dde9607c1de9554144 and
#116957 and #119114.

Cherry picked from LLVM commit 65a2eb0b1589590ae78cc1e5f05cd004b3b3bec5.

PR sanitizer/117725
libsanitizer/ChangeLog:

* sanitizer_common/sanitizer_common_interceptors.inc: Cherry
picked from LLVM commit 65a2eb0b1589590ae78cc1e5f05cd004b3b3bec5.
---
 .../sanitizer_common_interceptors.inc  | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc 
b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
index 47436a6cd20f..24a8a2d4dc55 100644
--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
@@ -520,14 +520,14 @@ INTERCEPTOR(int, strncmp, const char *s1, const char *s2, 
usize size) {
   void *ctx;
   COMMON_INTERCEPTOR_ENTER(ctx, strncmp, s1, s2, size);
   unsigned char c1 = 0, c2 = 0;
-  uptr i;
+  usize i;
   for (i = 0; i < size; i++) {
 c1 = (unsigned char)s1[i];
 c2 = (unsigned char)s2[i];
 if (c1 != c2 || c1 == '\0') break;
   }
-  uptr i1 = i;
-  uptr i2 = i;
+  usize i1 = i;
+  usize i2 = i;
   if (common_flags()->strict_string_checks) {
 for (; i1 < size && s1[i1]; i1++) {}
 for (; i2 < size && s2[i2]; i2++) {}
@@ -583,14 +583,14 @@ INTERCEPTOR(int, strncasecmp, const char *s1, const char 
*s2, SIZE_T size) {
   void *ctx;
   COMMON_INTERCEPTOR_ENTER(ctx, strncasecmp, s1, s2, size);
   unsigned char c1 = 0, c2 = 0;
-  uptr i;
+  usize i;
   for (i = 0; i < size; i++) {
 c1 = (unsigned char)s1[i];
 c2 = (unsigned char)s2[i];
 if (CharCaseCmp(c1, c2) != 0 || c1 == '\0') break;
   }
-  uptr i1 = i;
-  uptr i2 = i;
+  usize i1 = i;
+  usize i2 = i;
   if (common_flags()->strict_string_checks) {
 for (; i1 < size && s1[i1]; i1++) {}
 for (; i2 < size && s2[i2]; i2++) {}
@@ -851,7 +851,7 @@ int MemcmpInterceptorCommon(void *ctx,
   unsigned char c1 = 0, c2 = 0;
   const unsigned char *s1 = (const unsigned char*)a1;
   const unsigned char *s2 = (const unsigned char*)a2;
-  uptr i;
+  usize i;
   for (i = 0; i < size; i++) {
 c1 = s1[i];
 c2 = s2[i];
-- 
2.47.0



[PATCH] [sanitizer] Replace uptr by usize/SIZE_T in interfaces

2025-01-06 Thread Stefan Schulze Frielinghaus
For some targets uptr is mapped to unsigned int and size_t to unsigned
long and sizeof(int)==sizeof(long) holds.  Still, these are distinct
types and type checking may fail.  Therefore, replace uptr by
usize/SIZE_T wherever a size_t is expected.

Part of #116957

Cherry picked from LLVM commit 9a156f6b2b0c892d8713ba907f07f027b24953d8
(removed memprof, msan, nsan, and tsan parts).

PR sanitizer/117725
libsanitizer/ChangeLog:

* asan/asan_interceptors.cpp: Cherry picked LLVM commit
9a156f6b2b0c892d8713ba907f07f027b24953d8.
* asan/asan_interceptors.h: Ditto.
* asan/asan_interceptors_memintrinsics.h: Ditto.
* asan/sanitizer_common_interceptors.inc: Ditto.
* sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc:
Ditto.
* sanitizer_common/sanitizer_platform_limits_posix.h:
Ditto.
---
 libsanitizer/asan/asan_interceptors.cpp|  6 ++--
 libsanitizer/asan/asan_interceptors.h  |  6 ++--
 .../asan/asan_interceptors_memintrinsics.h|  4 +--
 .../sanitizer_common_interceptors.inc | 26 +++---
 ...izer_common_interceptors_memintrinsics.inc | 34 +--
 .../sanitizer_platform_limits_posix.h |  2 +-
 14 files changed, 65 insertions(+), 66 deletions(-)

diff --git a/libsanitizer/asan/asan_interceptors.cpp 
b/libsanitizer/asan/asan_interceptors.cpp
index 4129ee807612..be214f3ebeb6 100644
--- a/libsanitizer/asan/asan_interceptors.cpp
+++ b/libsanitizer/asan/asan_interceptors.cpp
@@ -85,7 +85,7 @@ int OnExit() {
 // -- Wrappers  {{{1
 using namespace __asan;
 
-DECLARE_REAL_AND_INTERCEPTOR(void *, malloc, uptr)
+DECLARE_REAL_AND_INTERCEPTOR(void *, malloc, usize)
 DECLARE_REAL_AND_INTERCEPTOR(void, free, void *)
 
 #define COMMON_INTERCEPT_FUNCTION_VER(name, ver) \
@@ -529,7 +529,7 @@ DEFINE_REAL(char*, index, const char *string, int c)
 return REAL(strcat)(to, from);
   }
 
-INTERCEPTOR(char*, strncat, char *to, const char *from, uptr size) {
+INTERCEPTOR(char*, strncat, char *to, const char *from, usize size) {
   void *ctx;
   ASAN_INTERCEPTOR_ENTER(ctx, strncat);
   AsanInitFromRtl();
@@ -617,7 +617,7 @@ INTERCEPTOR(char*, __strdup, const char *s) {
 }
 #endif // ASAN_INTERCEPT___STRDUP
 
-INTERCEPTOR(char*, strncpy, char *to, const char *from, uptr size) {
+INTERCEPTOR(char*, strncpy, char *to, const char *from, usize size) {
   void *ctx;
   ASAN_INTERCEPTOR_ENTER(ctx, strncpy);
   AsanInitFromRtl();
diff --git a/libsanitizer/asan/asan_interceptors.h 
b/libsanitizer/asan/asan_interceptors.h
index 826b45f5ada8..3e2386eaf809 100644
--- a/libsanitizer/asan/asan_interceptors.h
+++ b/libsanitizer/asan/asan_interceptors.h
@@ -124,11 +124,11 @@ void InitializePlatformInterceptors();
 # define ASAN_INTERCEPT_PTHREAD_ATFORK 0
 #endif
 
-DECLARE_REAL(int, memcmp, const void *a1, const void *a2, uptr size)
+DECLARE_REAL(int, memcmp, const void *a1, const void *a2, SIZE_T size)
 DECLARE_REAL(char*, strchr, const char *str, int c)
 DECLARE_REAL(SIZE_T, strlen, const char *s)
-DECLARE_REAL(char*, strncpy, char *to, const char *from, uptr size)
-DECLARE_REAL(uptr, strnlen, const char *s, uptr maxlen)
+DECLARE_REAL(char*, strncpy, char *to, const char *from, SIZE_T size)
+DECLARE_REAL(SIZE_T, strnlen, const char *s, SIZE_T maxlen)
 DECLARE_REAL(char*, strstr, const char *s1, const char *s2)
 
 #  if !SANITIZER_APPLE
diff --git a/libsanitizer/asan/asan_interceptors_memintrinsics.h 
b/libsanitizer/asan/asan_interceptors_memintrinsics.h
index eb44f8f2f729..14727a5d665e 100644
--- a/libsanitizer/asan/asan_interceptors_memintrinsics.h
+++ b/libsanitizer/asan/asan_interceptors_memintrinsics.h
@@ -18,8 +18,8 @@
 #include "asan_mapping.h"
 #include "interception/interception.h"
 
-DECLARE_REAL(void *, memcpy, void *to, const void *from, uptr size)
-DECLARE_REAL(void *, memset, void *block, int c, uptr size)
+DECLARE_REAL(void *, memcpy, void *to, const void *from, SIZE_T size)
+DECLARE_REAL(void *, memset, void *block, int c, SIZE_T size)
 
 namespace __asan {
 
diff --git a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc 
b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
index ba3693dbd11f..f6b695defce4 100644
--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
@@ -445,7 +445,7 @@ INTERCEPTOR(SIZE_T, strnlen, const char *s, SIZE_T maxlen) {
 #endif
 
 #if SANITIZER_INTERCEPT_STRNDUP
-INTERCEPTOR(char*, strndup, const char *s, uptr size) {
+INTERCEPTOR(char*, strndup, const char *s, usize size) {
   void *ctx;
   COMMON_INTERCEPTOR_STRNDUP_IMPL(ctx, s, size);
 }
@@ -455,7 +455,7 @@ INTERCEPTOR(char*, strndup, const char *s, uptr size) {
 #endif // SANITIZER_INTERCEPT_STRNDUP
 
 #if SANITIZER_INTERCEPT___STRNDUP
-INTERCEPTOR(char*, __strndup, const char *s, uptr size) {
+INTERCEPTOR(char*, __strndup, const char *s, usize size) {
   void *ctx;
   COMMON_INTERCEP

[PATCH] [sanitizer] Add type __sanitizer::ssize (#116957)

2025-01-06 Thread Stefan Schulze Frielinghaus
Since the sanitizer merge in commit r15-5164-gfa321004f3f628 of GCC
which entails LLVM commit 61a6439f35b6de28ff4aff4450d6fca970292fd5, GCCs
bootstrap is broken on s390 -m31. This is due to commit
ec68dc1ca4d967b599f1202855917d5ec9cae52f which introduces stricter type
checking which is why GCC bootstrap fails with

```
In file included from /gcc/src/libsanitizer/interception/interception.h:18,
 from 
/gcc/src/libsanitizer/interception/interception_type_test.cpp:14:
/gcc/src/libsanitizer/interception/interception_type_test.cpp:30:61: error: 
static assertion failed
   30 | COMPILER_CHECK((__sanitizer::is_same<::SSIZE_T, ::ssize_t>::value));
  |~^~
/gcc/src/libsanitizer/sanitizer_common/sanitizer_internal_defs.h:363:44: note: 
in definition of macro 'COMPILER_CHECK'
  363 | #define COMPILER_CHECK(pred) static_assert(pred, "")
  |^~~~
make[8]: *** [Makefile:469: interception_type_test.lo] Error 1
```

The culprit seems to be that we don't check for equality of type sizes
anymore but rather whether the types are indeed the same. On s390 -m31
we have that `sizeof(int)==sizeof(long)` holds which is why previously
the checks succeeded. They fail now because

```
size_t  => unsigned long
ssize_t => long
ptrdiff_t   => int
::SSIZE_T   => __sanitizer::sptr => int
::PTRDIFF_T => __sanitizer::sptr => int
```

This is fixed by mapping `SSIZE_T` to `long` in the end.

```
#if defined(__s390__) && !defined(__s390x__)
typedef long ssize;
#else
typedef sptr ssize;
#endif

#define SSIZE_T __sanitizer::ssize
```

Cherry picked from LLVM commit ce44640fe29550461120d22b0358e6cac4aed822.

PR sanitizer/117725
libsanitizer/ChangeLog:

* interception/interception.h: Cherry picked from LLVM commit
ce44640fe29550461120d22b0358e6cac4aed822.
* sanitizer_common/sanitizer_internal_defs.h: Ditto.
---
 libsanitizer/interception/interception.h| 2 +-
 libsanitizer/sanitizer_common/sanitizer_internal_defs.h | 6 ++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/libsanitizer/interception/interception.h 
b/libsanitizer/interception/interception.h
index 0580d97edda6..3cb6b446638e 100644
--- a/libsanitizer/interception/interception.h
+++ b/libsanitizer/interception/interception.h
@@ -37,7 +37,7 @@
 #endif
 
 #define SIZE_T __sanitizer::usize
-#define SSIZE_T __sanitizer::sptr
+#define SSIZE_T __sanitizer::ssize
 typedef __sanitizer::sptrPTRDIFF_T;
 typedef __sanitizer::s64 INTMAX_T;
 typedef __sanitizer::u64 UINTMAX_T;
diff --git a/libsanitizer/sanitizer_common/sanitizer_internal_defs.h 
b/libsanitizer/sanitizer_common/sanitizer_internal_defs.h
index 9208b12552ff..fff60c96f632 100644
--- a/libsanitizer/sanitizer_common/sanitizer_internal_defs.h
+++ b/libsanitizer/sanitizer_common/sanitizer_internal_defs.h
@@ -203,6 +203,12 @@ typedef __SIZE_TYPE__ usize;
 typedef uptr usize;
 #endif
 
+#if defined(__s390__) && !defined(__s390x__)
+typedef long ssize;
+#else
+typedef sptr ssize;
+#endif
+
 typedef u64 tid_t;
 
 // --- ATTENTION -
-- 
2.47.0



  1   2   >