Re: [RFC] Prevent the scheduler from moving prefetch instructions when expanding __builtin_prefetch [PR 116713]

2024-09-27 Thread Richard Biener
On Fri, Sep 27, 2024 at 6:27 AM Pietro Monteiro
 wrote:
>
> The prefetch instruction that is emitted by __builtin_prefetch is re-ordered 
> on GCC, but not on clang[0]. GCC's behavior is surprising because when using 
> the builtin you want the instruction to be placed at the exact point where 
> you put it. Moving it around, especially across loads/stores, may end up being 
> a pessimization. Adding a blockage instruction before the prefetch prevents 
> the scheduler from moving it.
>
> [0] https://godbolt.org/z/Ycjr7Tq8b

I think the testcase is quite broken (aka not real-world).  I would also suggest
that a hard scheduling barrier isn't the correct tool (see Oleg's response), but
instead prefetch should properly model a data dependence so it only blocks
code motion for dependent accesses.  On the GIMPLE side disambiguation
happens solely based on pointer analysis, on RTL where prefetch is likely
an UNSPEC I would suggest to model the dependence as a USE/CLOBBER
pair of a MEM.
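
For reference, a minimal sketch of the kind of loop where prefetch
placement matters.  The function name, the prefetch distance and the hint
arguments are illustrative assumptions, not taken from the PR's testcase:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance in elements; a real value would be
   tuned per target.  */
#define PREFETCH_DISTANCE 16

long sum_with_prefetch (const int *a, size_t n)
{
  assert (a != NULL || n == 0);
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    {
      /* Hint that a[i + PREFETCH_DISTANCE] will be read soon (rw = 0)
         with high temporal locality (3).  The guard keeps the address
         expression inside the array.  */
      if (i + PREFETCH_DISTANCE < n)
        __builtin_prefetch (&a[i + PREFETCH_DISTANCE], 0, 3);
      sum += a[i];
    }
  return sum;
}
```

Here the prefetch and the load in the same iteration touch different
elements, so a dependence-based model would presumably still let the
scheduler order them freely, while a blockage would not.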

Richard.

> -- 8< --
>
>
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index 37c7c98e5c..fec751e0d6 100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -1329,7 +1329,12 @@ expand_builtin_prefetch (tree exp)
>create_integer_operand (&ops[1], INTVAL (op1));
>create_integer_operand (&ops[2], INTVAL (op2));
>if (maybe_expand_insn (targetm.code_for_prefetch, 3, ops))
> -   return;
> +{
> +  /* Prevent the prefetch from being moved.  */
> +  rtx_insn *last = get_last_insn ();
> +  emit_insn_before (gen_blockage (), last);
> +  return;
> +}
>  }
>
>/* Don't do anything with direct references to volatile memory, but


[PATCH v2] RISC-V: Improve code generation for select of consecutive constants

2024-09-27 Thread Jovan Vukic
Based on the valuable feedback I received, I decided to implement the patch
in the RTL pipeline. Since a similar optimization already exists in
simplify_binary_operation_1, I chose to generalize my original approach
and place it directly below that code.

The expression (X xor C1) + C2 is simplified to X xor (C1 xor C2) under
the conditions described in the patch. This is a more general optimization,
but it still applies to the RISC-V case, which was my initial goal:

long f1(long x, long y) {
return (x > y) ? 2 : 3;
}


Before the patch, the generated assembly is:

f1(long, long):
sgt a0,a0,a1
xori a0,a0,1
addi a0,a0,2
ret

After the patch, the generated assembly is:

f1(long, long):
sgt a0,a0,a1
xori a0,a0,3
ret


The patch optimizes cases like x LT/GT y ? 2 : 3 (and x GE/LE y ? 3 : 2),
as initially intended. Since this optimization is more general, I noticed
it also optimizes cases like x < CONST ? 3 : 2 when CONST < 0. I’ve added
tests for these cases as well.

A bit of logic behind the patch: The equality A + B == (A ^ B) + 2 * (A & B)
always holds true. This can be simplified to A ^ B if 2 * (A & B) == 0.
In our case, we have A == X ^ C1, B == C2 and X is either 0 or 1.
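
The reasoning above can be checked by brute force over small constants.
This is only a sketch: the function names are made up for illustration,
and uint64_t stands in for the HOST_WIDE_INT values the patch reads with
INTVAL.

```c
#include <assert.h>
#include <stdint.h>

/* The side condition from the patch: adding C2 to X ^ C1 produces no
   carries for either possible value of X (0 or 1).  */
int patch_condition (uint64_t c1, uint64_t c2)
{
  return 2 * (c1 & c2) == 0 && 2 * ((1 ^ c1) & c2) == 0;
}

/* Nonzero if (X ^ C1) + C2 == X ^ (C1 ^ C2) for both X == 0 and X == 1.  */
int fold_is_valid (uint64_t c1, uint64_t c2)
{
  for (uint64_t x = 0; x <= 1; x++)
    if ((x ^ c1) + c2 != (x ^ (c1 ^ c2)))
      return 0;
  return 1;
}

/* Count constant pairs below LIMIT that satisfy the condition but for
   which the fold would change the result.  */
unsigned count_violations (uint64_t limit)
{
  assert (limit <= 4096);   /* keep the exhaustive search small */
  unsigned bad = 0;
  for (uint64_t c1 = 0; c1 < limit; c1++)
    for (uint64_t c2 = 0; c2 < limit; c2++)
      if (patch_condition (c1, c2) && !fold_is_valid (c1, c2))
        bad++;
  return bad;
}
```

For the f1 example, C1 == 1 and C2 == 2 satisfy the condition, which is
why xori 1 followed by addi 2 collapses into xori 3.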

2024-09-27  Jovan Vukic  

PR target/108038

gcc/ChangeLog:

* simplify-rtx.cc (simplify_context::simplify_binary_operation_1): New
simplification.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/slt-1.c: New test.
---
 gcc/simplify-rtx.cc| 12 ++
 gcc/testsuite/gcc.target/riscv/slt-1.c | 59 ++
 2 files changed, 71 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/slt-1.c

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index a20a61c5ddd..e8e60404ef6 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -2994,6 +2994,18 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
simplify_gen_binary (XOR, mode, op1,
 XEXP (op0, 1)));
 
+  /* (plus (xor X C1) C2) is (xor X (C1^C2)) if X is either 0 or 1 and
+2 * ((X ^ C1) & C2) == 0; based on A + B == A ^ B + 2 * (A & B). */
+  if (CONST_SCALAR_INT_P (op1)
+ && GET_CODE (op0) == XOR
+ && CONST_SCALAR_INT_P (XEXP (op0, 1))
+ && nonzero_bits (XEXP (op0, 0), mode) == 1
+ && 2 * (INTVAL (XEXP (op0, 1)) & INTVAL (op1)) == 0
+ && 2 * ((1 ^ INTVAL (XEXP (op0, 1))) & INTVAL (op1)) == 0)
+   return simplify_gen_binary (XOR, mode, XEXP (op0, 0),
+   simplify_gen_binary (XOR, mode, op1,
+XEXP (op0, 1)));
+
   /* Canonicalize (plus (mult (neg B) C) A) to (minus A (mult B C)).  */
   if (!HONOR_SIGN_DEPENDENT_ROUNDING (mode)
  && GET_CODE (op0) == MULT
diff --git a/gcc/testsuite/gcc.target/riscv/slt-1.c b/gcc/testsuite/gcc.target/riscv/slt-1.c
new file mode 100644
index 000..29a64066081
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/slt-1.c
@@ -0,0 +1,59 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+#include 
+
+#define COMPARISON(TYPE, OP, OPN, RESULT_TRUE, RESULT_FALSE) \
+TYPE test_##OPN(TYPE x, TYPE y) { \
+return (x OP y) ? RESULT_TRUE : RESULT_FALSE; \
+}
+
+/* Signed comparisons */
+COMPARISON(int64_t, >, GT1, 2, 3)
+COMPARISON(int64_t, >, GT2, 5, 6)
+
+COMPARISON(int64_t, <, LT1, 2, 3)
+COMPARISON(int64_t, <, LT2, 5, 6)
+
+COMPARISON(int64_t, >=, GE1, 3, 2)
+COMPARISON(int64_t, >=, GE2, 6, 5)
+
+COMPARISON(int64_t, <=, LE1, 3, 2)
+COMPARISON(int64_t, <=, LE2, 6, 5)
+
+/* Unsigned comparisons */
+COMPARISON(uint64_t, >, GTU1, 2, 3)
+COMPARISON(uint64_t, >, GTU2, 5, 6)
+
+COMPARISON(uint64_t, <, LTU1, 2, 3)
+COMPARISON(uint64_t, <, LTU2, 5, 6)
+
+COMPARISON(uint64_t, >=, GEU1, 3, 2)
+COMPARISON(uint64_t, >=, GEU2, 6, 5)
+
+COMPARISON(uint64_t, <=, LEU1, 3, 2)
+COMPARISON(uint64_t, <=, LEU2, 6, 5)
+
+#define COMPARISON_IMM(TYPE, OP, OPN, RESULT_TRUE, RESULT_FALSE) \
+TYPE testIMM_##OPN(TYPE x) { \
+return (x OP -3) ? RESULT_TRUE : RESULT_FALSE; \
+}
+
+/* Signed comparisons with immediate */
+COMPARISON_IMM(int64_t, >, GT1, 3, 2)
+
+COMPARISON_IMM(int64_t, <, LT1, 2, 3)
+
+COMPARISON_IMM(int64_t, >=, GE1, 3, 2)
+
+COMPARISON_IMM(int64_t, <=, LE1, 2, 3)
+
+/* { dg-final {

[PATCH v1] Widening-Mul: Fix one ICE when iterate on phi node

2024-09-27 Thread pan2 . li
From: Pan Li 

We iterate over all phi nodes of a bb to try to match the SAT_* patterns
for scalar integers.  We also remove the phi node when the relevant
pattern matches.

Unfortunately, the iterator has no idea that the phi node was removed,
so it continues to use the freed data and then ICEs as shown below.

[0] psi ptr 0x75216340c000
[0] psi ptr 0x75216340c400
[1] psi ptr 0xa5a5a5a5a5a5a5a5 <=== GC freed pointer.

during GIMPLE pass: widening_mul
tmp.c: In function ‘f’:
tmp.c:45:6: internal compiler error: Segmentation fault
   45 | void f(int rows, int cols) {
  |  ^
0x36e2788 internal_error(char const*, ...)
../../gcc/diagnostic-global-context.cc:517
0x18005f0 crash_signal
../../gcc/toplev.cc:321
0x752163c4531f ???
./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x103ae0e bool is_a_helper::test(gimple*)
../../gcc/gimple.h:1256
0x103f9a5 bool is_a(gimple*)
../../gcc/is-a.h:232
0x103dc78 gphi* as_a(gimple*)
../../gcc/is-a.h:255
0x104f12e gphi_iterator::phi() const
../../gcc/gimple-iterator.h:47
0x1a57bef after_dom_children
../../gcc/tree-ssa-math-opts.cc:6140
0x3344482 dom_walker::walk(basic_block_def*)
../../gcc/domwalk.cc:354
0x1a58601 execute
../../gcc/tree-ssa-math-opts.cc:6312

This patch fixes the iterate-over-a-modified-collection problem
by backing up the next phi in advance.
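
The same pattern, sketched on a plain singly linked list rather than on
GCC's gphi_iterator.  The struct and function names are hypothetical and
only illustrate saving the successor before the current element may be
freed:

```c
#include <assert.h>
#include <stdlib.h>

struct node
{
  int value;
  struct node *next;
};

/* Prepend VALUE to the list and return the new head.  */
struct node *push (struct node *head, int value)
{
  struct node *n = malloc (sizeof *n);
  assert (n != NULL);
  n->value = value;
  n->next = head;
  return n;
}

/* Remove every node holding an even value and return the new head.  */
struct node *remove_even (struct node *head)
{
  struct node **link = &head;
  struct node *cur = head;
  while (cur)
    {
      /* Back up the successor first; CUR may be freed below.  */
      struct node *next = cur->next;
      if (cur->value % 2 == 0)
        {
          *link = next;
          free (cur);
        }
      else
        link = &cur->next;
      cur = next;
    }
  return head;
}

/* Sum of all values in the list.  */
int list_sum (const struct node *head)
{
  int sum = 0;
  for (; head; head = head->next)
    sum += head->value;
  return sum;
}

/* Release every node of the list.  */
void list_free (struct node *head)
{
  while (head)
    {
      struct node *next = head->next;
      free (head);
      head = next;
    }
}
```

Reading cur->next after free (cur) would be the analogue of the freed
psi pointer in the trace above.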

The test suites below pass with this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

PR middle-end/116861

gcc/ChangeLog:

* tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children):
Back up the next psi iterator before removing the phi node.

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr116861-1.c: New test.

Signed-off-by: Pan Li 
---
 gcc/testsuite/gcc.dg/torture/pr116861-1.c | 76 +++
 gcc/tree-ssa-math-opts.cc |  9 ++-
 2 files changed, 83 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr116861-1.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr116861-1.c b/gcc/testsuite/gcc.dg/torture/pr116861-1.c
new file mode 100644
index 000..7dcfe664d89
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr116861-1.c
@@ -0,0 +1,76 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void pm_message(void);
+struct CmdlineInfo {
+  _Bool wantCrop[4];
+  unsigned int margin;
+};
+typedef struct {
+  unsigned int removeSize;
+} CropOp;
+typedef struct {
+  CropOp op[4];
+} CropSet;
+static void divideAllBackgroundIntoBorders(unsigned int const totalSz,
+   _Bool const wantCropSideA,
+   _Bool const wantCropSideB,
+   unsigned int const wantMargin,
+   unsigned int *const sideASzP,
+   unsigned int *const sideBSzP) {
+  unsigned int sideASz, sideBSz;
+  if (wantCropSideA && wantCropSideB)
+  {
+sideASz = totalSz / 2;
+if (wantMargin)
+  sideBSz = totalSz - sideASz;
+  }
+  else if (wantCropSideB)
+  {
+sideBSz = 0;
+  }
+  *sideASzP = sideASz;
+  *sideBSzP = sideBSz;
+}
+static CropOp oneSideCrop(_Bool const wantCrop, unsigned int const borderSz,
+  unsigned int const margin) {
+  CropOp retval;
+  if (wantCrop)
+  {
+if (borderSz >= margin)
+  retval.removeSize = borderSz - margin;
+else
+  retval.removeSize = 0;
+  }
+  return retval;
+}
+struct CmdlineInfo cmdline1;
+void f(int rows, int cols) {
+  struct CmdlineInfo cmdline0 = cmdline1;
+  CropSet crop;
+  struct CmdlineInfo cmdline = cmdline0;
+  CropSet retval;
+  unsigned int leftBorderSz, rghtBorderSz;
+  unsigned int topBorderSz, botBorderSz;
+  divideAllBackgroundIntoBorders(cols, cmdline.wantCrop[0],
+ cmdline.wantCrop[1], cmdline.margin > 0,
+ &leftBorderSz, &rghtBorderSz);
+  divideAllBackgroundIntoBorders(rows, cmdline.wantCrop[2],
+ cmdline.wantCrop[3], cmdline.margin > 0,
+ &topBorderSz, &botBorderSz);
+  retval.op[0] =
+  oneSideCrop(cmdline.wantCrop[0], leftBorderSz, cmdline.margin);
+  retval.op[1] =
+  oneSideCrop(cmdline.wantCrop[1], rghtBorderSz, cmdline.margin);
+  retval.op[2] =
+  oneSideCrop(cmdline.wantCrop[2], topBorderSz, cmdline.margin);
+  retval.op[3] =
+  oneSideCrop(cmdline.wantCrop[3], botBorderSz, cmdline.margin);
+  crop = retval;
+  unsigned int i = 0;
+  for (i = 0; i < 4; ++i)
+  {
+if (crop.op[i].removeSize == 0)
+  pm_message();
+  }
+}
diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
index 8c622514dbd..f1cfe7dfab0 100644
--- a/gcc/tree-ssa-math-opts.cc
+++ b/gcc/tree-ssa-math-opts.cc
@@ -6129,10 +6129,15 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
 
   fma_d

[committed] libgomp.texi: fix formatting; add post-TR13 OpenMP impl. status items

2024-09-27 Thread Tobias Burnus
This commit, r15-3917-g6b7eaec20b046e, updates the .texi file with one
formatting fix (@emph → @code) and adds some items for post-TR13 changes.
(The latter is slightly questionable as the title says TR13, which is the
third and last draft of OpenMP 6.0, scheduled to be released in time for
Supercomputing 2024 in November, and the listed changes are only in the
current internal draft. On the other hand, post-TR13 work is
supposed to be mostly QC tasks and 6.0 is due in around 6 weeks.
Furthermore, when looking at the spec changes for this update, I did
find an important generator bug, causing text omissions in the spec,
which is something I would otherwise probably have encountered only after
the spec release.)

Tobias
commit 6b7eaec20b046eebc771022e460c2206580aef04
Author: Tobias Burnus 
Date:   Fri Sep 27 10:48:09 2024 +0200

libgomp.texi: fix formatting; add post-TR13 OpenMP impl. status items

libgomp/
* libgomp.texi (OpenMP Technical Report 13): Change @emph to @code;
add two post-TR13 OpenMP 6.0 items.

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 22eff1d7b55..b561cb5f3f4 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -476,6 +476,7 @@ Technical Report (TR) 13 is the third preview for OpenMP 6.0.
   specifiers @tab Y @tab
 @item Support for pure directives in Fortran's @code{do concurrent} @tab N @tab
 @item All inarguable clauses take now an optional Boolean argument @tab N @tab
+@item The @code{adjust_args} clause was extended to specify the argument by position
 @item For Fortran, @emph{locator list} can be also function reference with
   data pointer result @tab N @tab
 @item Concept of @emph{assumed-size arrays} in C and C++
@@ -496,7 +497,7 @@ Technical Report (TR) 13 is the third preview for OpenMP 6.0.
   clauses @tab P @tab @code{private} not supported
 @item For Fortran, rejecting polymorphic types in data-mapping clauses
   @tab N @tab not diagnosed (and mostly unsupported)
-@item New @code{taskgraph} construct including @emph{saved} modifier and
+@item New @code{taskgraph} construct including @code{saved} modifier and
   @code{replayable} clause @tab N @tab
 @item @code{default} clause on the @code{target} directive @tab N @tab
 @item Ref-count change for @code{use_device_ptr} and @code{use_device_addr}
@@ -509,6 +510,10 @@ Technical Report (TR) 13 is the third preview for OpenMP 6.0.
 @item New @code{init_complete} clause to the @code{scan} directive
   @tab N @tab
 @item @code{ref} modifier to the @code{map} clause @tab N @tab
+@item New @code{storage} map-type modifier; context-dependent @code{alloc} and
+  @code{release} are aliases. Update to map decay @tab N @tab
+@item Update of the map-type decay for mapping and @code{declare_mapper}
+  @tab N @tab
 @item Change of the @emph{map-type} property from @emph{ultimate} to
   @emph{default} @tab N @tab
 @item @code{self} modifier to @code{map} and @code{self} as
@@ -516,7 +521,6 @@ Technical Report (TR) 13 is the third preview for OpenMP 6.0.
 @item Mapping of @emph{assumed-size arrays} in C, C++ and Fortran
   @tab N @tab
 @item @code{delete} as delete-modifier not as map type @tab N @tab
-@item @code{release} map-type modifier in @code{declare_mapper} @tab N @tab
 @item For Fortran, the @code{automap} modifier to the @code{enter} clause
   of @code{declare_target} @tab N @tab
 @item @code{groupprivate} directive @tab N @tab


Re: [PATCH v3 2/4] tree-optimization/116024 - simplify C1-X cmp C2 for unsigned types

2024-09-27 Thread Richard Biener
On Mon, 23 Sep 2024, Artemiy Volkov wrote:

> Implement a match.pd transformation inverting the sign of X in
> C1 - X cmp C2, where C1 and C2 are integer constants and X is
> of an unsigned type, by observing that:
> 
> (a) If cmp is == or !=, simply move X and C2 to opposite sides of the
> comparison to arrive at X cmp C1 - C2.
> 
> (b) If cmp is <:
>   - C1 - X < C2 means that C1 - X spans the range of 0, 1, ..., C2 - 1;
> - This means that X spans the range of C1 - (C2 - 1),
> C1 - (C2 - 2), ..., C1;
>   - Subtracting C1 - (C2 - 1), X - (C1 - (C2 - 1)) is one of 0, 1,
> ..., C1 - (C1 - (C2 - 1));
> - Simplifying the above, X - (C1 - C2 + 1) is one of 0, 1, ...,
>  C2 - 1;
> - Summarizing, the expression C1 - X < C2 can be transformed
> into X - (C1 - C2 + 1) < C2.
> 
> (c) Similarly, if cmp is <=:
>   - C1 - X <= C2 means that C1 - X is one of 0, 1, ..., C2;
>   - It follows that X is one of C1 - C2, C1 - (C2 - 1), ..., C1;
> - Subtracting C1 - C2, X - (C1 - C2) has range 0, 1, ..., C2;
> - Thus, the expression C1 - X <= C2 can be transformed into
> X - (C1 - C2) <= C2.
> 
> (d) The >= and > cases are negations of (b) and (c), respectively.
> 
> This transformation occasionally saves load-immediate /
> subtraction instructions, e.g. the following statement:
> 
> 300 - (unsigned int)f() < 100;
> 
> now compiles to
> 
> addi    a0,a0,-201
> sltiu   a0,a0,100
> 
> instead of
> 
> li  a5,300
> sub a0,a5,a0
> sltiu   a0,a0,100
> 
> on 32-bit RISC-V.
> 
> Additional examples can be found in the newly added test file.  This
> patch has been bootstrapped and regtested on aarch64, x86_64, and i386,
> and additionally regtested on riscv32.

OK.

Thanks,
Richard.
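
The derivation in (b) and (c) can be confirmed by exhaustive search over
narrow unsigned values.  A sketch only: the helper names are made up, and
masking to BITS bits stands in for a narrow unsigned type.

```c
#include <assert.h>
#include <stdint.h>

/* Count (C1, C2, X) triples of BITS-wide unsigned values for which
   C1 - X < C2 and X - (C1 - C2 + 1) < C2 disagree.  */
unsigned long count_lt_mismatches (unsigned bits)
{
  assert (bits >= 1 && bits <= 8);
  const unsigned mask = (1u << bits) - 1;
  unsigned long bad = 0;
  for (unsigned c1 = 0; c1 <= mask; c1++)
    for (unsigned c2 = 0; c2 <= mask; c2++)
      for (unsigned x = 0; x <= mask; x++)
        {
          unsigned before = (c1 - x) & mask;
          unsigned after = (x - (c1 - c2 + 1)) & mask;
          bad += (before < c2) != (after < c2);
        }
  return bad;
}

/* Same check for C1 - X <= C2 versus X - (C1 - C2) <= C2.  */
unsigned long count_le_mismatches (unsigned bits)
{
  assert (bits >= 1 && bits <= 8);
  const unsigned mask = (1u << bits) - 1;
  unsigned long bad = 0;
  for (unsigned c1 = 0; c1 <= mask; c1++)
    for (unsigned c2 = 0; c2 <= mask; c2++)
      for (unsigned x = 0; x <= mask; x++)
        {
          unsigned before = (c1 - x) & mask;
          unsigned after = (x - (c1 - c2)) & mask;
          bad += (before <= c2) != (after <= c2);
        }
  return bad;
}

/* The statement from the description, before and after the rewrite.  */
int example_before (uint32_t x) { return 300 - x < 100; }
int example_after (uint32_t x) { return x - 201 < 100; }
```

Both counters are expected to return 0; the >= and > forms follow because
they are the negations of < and <=.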

> gcc/ChangeLog:
> 
> PR tree-optimization/116024
> * match.pd: New transformation around integer comparison.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.dg/tree-ssa/pr116024-1.c: New test.
> 
> Signed-off-by: Artemiy Volkov 
> ---
>  gcc/match.pd   | 23 ++-
>  gcc/testsuite/gcc.dg/tree-ssa/pr116024-1.c | 73 ++
>  2 files changed, 95 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr116024-1.c
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 81be0a21462..d0489789527 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -8949,7 +8949,28 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>TYPE_SIGN (TREE_TYPE (@0)));
>  constant_boolean_node (less == ovf_high, type);
>})
> -  (rcmp @1 { res; }))
> +  (rcmp @1 { res; })))
> +/* For unsigned types, transform like so (using < as example):
> +  C1 - X < C2
> +  ==>  C1 - X = { 0, 1, ..., C2 - 1 }
> +  ==>  X = { C1 - (C2 - 1), ..., C1 - 1, C1 }
> +  ==>  X - (C1 - (C2 - 1)) = { 0, 1, ..., C1 - (C1 - (C2 - 1)) }
> +  ==>  X - (C1 - C2 + 1) = { 0, 1, ..., C2 - 1 }
> +  ==>  X - (C1 - C2 + 1) < C2.
> +
> +  Similarly,
> +  C1 - X <= C2 ==> X - (C1 - C2) <= C2;
> +  C1 - X >= C2 ==> X - (C1 - C2 + 1) >= C2;
> +  C1 - X > C2 ==> X - (C1 - C2) > C2.  */
> +   (if (TYPE_UNSIGNED (TREE_TYPE (@1)))
> + (switch
> +   (if (cmp == EQ_EXPR || cmp == NE_EXPR)
> +  (cmp @1 (minus @0 @2)))
> +   (if (cmp == LE_EXPR || cmp == GT_EXPR)
> +  (cmp (plus @1 (minus @2 @0)) @2))
> +   (if (cmp == LT_EXPR || cmp == GE_EXPR)
> +  (cmp (plus @1 (minus @2
> +(plus @0 { build_one_cst (TREE_TYPE (@1)); }))) @2)))
>  
>  /* Canonicalizations of BIT_FIELD_REFs.  */
>  
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1.c
> new file mode 100644
> index 000..48e647dc0c6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1.c
> @@ -0,0 +1,73 @@
> +/* PR tree-optimization/116024 */
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -fdump-tree-forwprop1-details" } */
> +
> +#include 
> +
> +uint32_t f(void);
> +
> +int32_t i2(void)
> +{
> +  uint32_t l = 2;
> +  l = 10 - (uint32_t)f();
> +  return l <= 20; // f() + 10 <= 20 
> +}
> +
> +int32_t i2a(void)
> +{
> +  uint32_t l = 2;
> +  l = 10 - (uint32_t)f();
> +  return l < 30; // f() + 19 < 30 
> +}
> +
> +int32_t i2b(void)
> +{
> +  uint32_t l = 2;
> +  l = 200 - (uint32_t)f();
> +  return l <= 100; // f() - 100 <= 100 
> +}
> +
> +int32_t i2c(void)
> +{
> +  uint32_t l = 2;
> +  l = 300 - (uint32_t)f();
> +  return l < 100; // f() - 201 < 100
> +}
> +
> +int32_t i2d(void)
> +{
> +  uint32_t l = 2;
> +  l = 1000 - (uint32_t)f();
> +  return l >= 2000; // f() + 999 >= 2000
> +}
> +
> +int32_t i2e(void)
> +{
> +  uint32_t l = 2;
> +  l = 1000 - (uint32_t)f();
> +  return l > 3000; // f() + 2000 > 3000
> +}
> +
> +int32_t i2f(void)
> +{
> +  uint32_t l = 2;
> +  l = 2 - (uint32_t)f();
> +  return l >= 1; // f() - 10001 >= 1
> +}
> +
> +int32_t i2g(void)
>

Re: [PATCH v2 3/4] tree-object-size: Handle PHI + CST type offsets

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 20, 2024 at 12:40:28PM -0400, Siddhesh Poyarekar wrote:
> --- a/gcc/tree-object-size.cc
> +++ b/gcc/tree-object-size.cc
> @@ -1473,7 +1473,7 @@ merge_object_sizes (struct object_size_info *osi, tree 
> dest, tree orig)
> with all of its targets are constants.  */
>  
>  static tree
> -try_collapsing_offset (tree op, int object_size_type)
> +try_collapsing_offset (tree op, tree cst, tree_code code, int 
> object_size_type)
>  {
>gcc_assert (!(object_size_type & OST_DYNAMIC));

I believe you were just returning op here if it isn't SSA_NAME.
Now, if it is INTEGER_CST, it will misbehave whenever cst != NULL,
as it will just return op rather than op + cst or op - cst.

> @@ -1485,13 +1485,41 @@ try_collapsing_offset (tree op, int object_size_type)
>switch (gimple_code (stmt))
>  {
>  case GIMPLE_ASSIGN:
> -  /* Peek through casts.  */
> -  if (gimple_assign_rhs_code (stmt) == NOP_EXPR)
> +  switch (gimple_assign_rhs_code (stmt))
>   {
> -   tree ret = try_collapsing_offset (gimple_assign_rhs1 (stmt),
> - object_size_type);
> -   if (TREE_CODE (ret) == INTEGER_CST)
> - return ret;
> +   /* Peek through casts.  */
> + case NOP_EXPR:

That would be CASE_CONVERT:

> + {

Wrong indentation, { should be 2 columns to the left of case, and tree ret
4 columns.

> +   tree ret = try_collapsing_offset (gimple_assign_rhs1 (stmt),
> + NULL_TREE, NOP_EXPR,
> + object_size_type);

This loses cst and code from the caller if it isn't NULL_TREE, NOP_EXPR,
again causing miscompilations.
I'd think it should just pass through cst, code from the caller here
(but then one needs to be extra careful, because cst can have different
type from op because of the recursion on casts).

> +   if (TREE_CODE (ret) == INTEGER_CST)
> + return ret;
> +   break;
> + }
> +   /* Propagate constant offsets to PHI expressions.  */
> + case PLUS_EXPR:
> + case MINUS_EXPR:
> + {
> +   tree rhs1 = gimple_assign_rhs1 (stmt);
> +   tree rhs2 = gimple_assign_rhs2 (stmt);
> +   tree ret = NULL_TREE;
> +

And I think this should just punt on code != NOP_EXPR (or cst != NULL_TREE,
one is enough).  Because if you have
  # _2 = PHI...
  _3 = _2 + 3;
  _4 = 7 - _3;
or something similar, it will handle just the inner operation and not the
outer one.

> +   if (TREE_CODE (rhs1) == INTEGER_CST)
> + ret = try_collapsing_offset (rhs2, rhs1,
> +  gimple_assign_rhs_code (stmt),
> +  object_size_type);
> +   else if (TREE_CODE (rhs2) == INTEGER_CST)
> + ret = try_collapsing_offset (rhs1, rhs2,
> +  gimple_assign_rhs_code (stmt),
> +  object_size_type);
> +
> +   if (ret && TREE_CODE (ret) == INTEGER_CST)
> + return ret;
> +   break;
> + }
> + default:
> +   break;
>   }
>break;
>  case GIMPLE_PHI:
> @@ -1507,14 +1535,20 @@ try_collapsing_offset (tree op, int object_size_type)
> if (TREE_CODE (rhs) != INTEGER_CST)
>   return op;
>  
> +   if (cst)
> + rhs = fold_build2 (code, sizetype,
> +fold_convert (ptrdiff_type_node, rhs),
> +fold_convert (ptrdiff_type_node, cst));

This can be done on wide_int too.  But one needs to think through the cast
cases, what happened if the cast was widening (from signed or unsigned),
what happened if the cast was narrowing.

> +   else
> + rhs = fold_convert (ptrdiff_type_node, rhs);
> +
> /* Note that this is the *opposite* of what we usually do with
>sizes, because the maximum offset estimate here will give us a
>minimum size estimate and vice versa.  */
> -   enum tree_code code = (object_size_type & OST_MINIMUM
> -  ? MAX_EXPR : MIN_EXPR);
> +   enum tree_code selection_code = (object_size_type & OST_MINIMUM
> +? MAX_EXPR : MIN_EXPR);
>  
> -   off = fold_build2 (code, ptrdiff_type_node, off,
> -  fold_convert (ptrdiff_type_node, rhs));
> +   off = fold_build2 (selection_code, ptrdiff_type_node, off, rhs);
>   }
> return fold_convert (sizetype, off);
>   }
> @@ -1558,7 +1592,7 @@ plus_stmt_object_size (struct object_size_info *osi, 
> tree var, gimple *stmt)
>  return false;
>  
>if (!(object_size_type & OST_DYNAMIC) && TREE_CODE (op1) != INTEGER_CST)
> -op1 = try_collapsing_offset (op1, object_size_type);
> +op1 = try_collapsing_offset (op1, NULL_TREE, NO

Re: [PATCH v3 1/4] tree-optimization/116024 - simplify C1-X cmp C2 for UB-on-overflow types

2024-09-27 Thread Richard Biener
On Mon, 23 Sep 2024, Artemiy Volkov wrote:

> Implement a match.pd pattern for C1 - X cmp C2, where C1 and C2 are
> integer constants and X is of a UB-on-overflow type.  The pattern is
> simplified to X rcmp C1 - C2 by moving X and C2 to the other side of the
> comparison (with opposite signs).  If C1 - C2 happens to overflow,
> replace the whole expression with either a constant 0 or a constant 1
> node, depending on the comparison operator and the sign of the overflow.
> 
> This transformation occasionally saves load-immediate /
> subtraction instructions, e.g. the following statement:
> 
> 10 - (int) x <= 9;
> 
> now compiles to
> 
> sgt a0,a0,zero
> 
> instead of
> 
> li  a5,10
> sub a0,a5,a0
> slti    a0,a0,10
> 
> on 32-bit RISC-V.
> 
> Additional examples can be found in the newly added test file. This
> patch has been bootstrapped and regtested on aarch64, x86_64, and
> i386, and additionally regtested on riscv32.  Existing tests were
> adjusted where necessary.

OK.

Thanks,
Richard.
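
As a sanity check of the non-overflowing case, the rewrite of
C1 - X cmp C2 into X rcmp C1 - C2 can be compared against the original
over a small range where neither subtraction can overflow.  The function
name and the range limit are assumptions made for this sketch.

```c
#include <assert.h>

/* Count disagreements between C1 - X cmp C2 and X rcmp C1 - C2 for all
   six comparison operators, with every value in [-RANGE, RANGE].  */
int count_signed_mismatches (int range)
{
  assert (range >= 0 && range <= 100);   /* far away from int overflow */
  int bad = 0;
  for (int c1 = -range; c1 <= range; c1++)
    for (int c2 = -range; c2 <= range; c2++)
      for (int x = -range; x <= range; x++)
        {
          int lhs = c1 - x;    /* left side of the original comparison */
          int rhs = c1 - c2;   /* folded constant of the rewritten one */
          bad += (lhs < c2) != (x > rhs);
          bad += (lhs <= c2) != (x >= rhs);
          bad += (lhs > c2) != (x < rhs);
          bad += (lhs >= c2) != (x <= rhs);
          bad += (lhs == c2) != (x == rhs);
          bad += (lhs != c2) != (x != rhs);
        }
  return bad;
}
```

With C1 == 10 and C2 == 9 this is the 10 - x <= 9 statement from the
description, which becomes x >= 1, that is, x > 0.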

> gcc/ChangeLog:
> 
>   PR tree-optimization/116024
> * match.pd: New transformation around integer comparison.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.dg/tree-ssa/pr116024.c: New test.
> * gcc.dg/pr67089-6.c: Adjust.
> 
> Signed-off-by: Artemiy Volkov 
> ---
>  gcc/match.pd | 26 +
>  gcc/testsuite/gcc.dg/pr67089-6.c |  4 +-
>  gcc/testsuite/gcc.dg/tree-ssa/pr116024.c | 74 
>  3 files changed, 102 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr116024.c
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 940292d0d49..81be0a21462 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -8925,6 +8925,32 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   }
>   (cmp @0 { res; })
>  
> +/* Invert sign of X in comparisons of the form C1 - X CMP C2.  */
> +
> +(for cmp (lt le gt ge eq ne)
> + rcmp (gt ge lt le eq ne)
> +  (simplify
> +   (cmp (minus INTEGER_CST@0 @1) INTEGER_CST@2)
> +/* For UB-on-overflow types, simply switch sides for X and C2
> +   to arrive at X RCMP C1 - C2, handling the case when the latter
> +   expression overflows.  */
> +   (if (!TREE_OVERFLOW (@0) && !TREE_OVERFLOW (@2)
> +   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@1)))
> + (with { tree res = int_const_binop (MINUS_EXPR, @0, @2); }
> +  (if (TREE_OVERFLOW (res))
> + (switch
> +  (if (cmp == NE_EXPR)
> +   { constant_boolean_node (true, type); })
> +  (if (cmp == EQ_EXPR)
> +   { constant_boolean_node (false, type); })
> +  {
> +bool less = cmp == LE_EXPR || cmp == LT_EXPR;
> +bool ovf_high = wi::lt_p (wi::to_wide (@0), 0,
> +  TYPE_SIGN (TREE_TYPE (@0)));
> +constant_boolean_node (less == ovf_high, type);
> +  })
> +  (rcmp @1 { res; }))
> +
>  /* Canonicalizations of BIT_FIELD_REFs.  */
>  
>  (simplify
> diff --git a/gcc/testsuite/gcc.dg/pr67089-6.c 
> b/gcc/testsuite/gcc.dg/pr67089-6.c
> index b59d75b2318..80a33c3f3e2 100644
> --- a/gcc/testsuite/gcc.dg/pr67089-6.c
> +++ b/gcc/testsuite/gcc.dg/pr67089-6.c
> @@ -57,5 +57,5 @@ T (25, unsigned short, 2U - x, if (r > 2U) foo (0))
>  T (26, unsigned char, 2U - x, if (r <= 2U) foo (0))
>  
>  /* { dg-final { scan-tree-dump-times "ADD_OVERFLOW" 16 "widening_mul" { 
> target { i?86-*-* x86_64-*-* } } } } */
> -/* { dg-final { scan-tree-dump-times "SUB_OVERFLOW" 11 "widening_mul" { 
> target { { i?86-*-* x86_64-*-* } && { ! ia32 } } } } } */
> -/* { dg-final { scan-tree-dump-times "SUB_OVERFLOW" 9 "widening_mul" { 
> target { { i?86-*-* x86_64-*-* } && ia32 } } } } */
> +/* { dg-final { scan-tree-dump-times "SUB_OVERFLOW" 9 "widening_mul" { 
> target { { i?86-*-* x86_64-*-* } && { ! ia32 } } } } } */
> +/* { dg-final { scan-tree-dump-times "SUB_OVERFLOW" 7 "widening_mul" { 
> target { { i?86-*-* x86_64-*-* } && ia32 } } } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr116024.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr116024.c
> new file mode 100644
> index 000..0dde9abbf89
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr116024.c
> @@ -0,0 +1,74 @@
> +/* PR tree-optimization/116024 */
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -fdump-tree-forwprop1-details" } */
> +
> +#include 
> +#include 
> +
> +uint32_t f(void);
> +
> +int32_t i1(void)
> +{
> +  int32_t l = 2;
> +  l = 10 - (int32_t)f();
> +  return l <= 9; // f() > 0
> +}
> +
> +int32_t i1a(void)
> +{
> +  int32_t l = 2;
> +  l = 20 - (int32_t)f();
> +  return l <= INT32_MIN; // return 0
> +}
> +
> +int32_t i1b(void)
> +{
> +  int32_t l = 2;
> +  l = 30 - (int32_t)f();
> +  return l <= INT32_MIN + 31; // f() == INT32_MAX
> +}
> +
> +int32_t i1c(void)
> +{
> +  int32_t l = 2;
> +  l = INT32_MAX - 40 - (int32_t)f();
> +  return l <= -38; // f() > INT32_MAX - 3
> +}
> +
> +int32_t i1d(void)
> +{
> +  int32_t l = 2;
> +  l = INT32_MAX - 50 - (int32_t)f();
> +  re

Re: [PATCH v3 4/4] tree-optimization/116024 - simplify some cases of X +- C1 cmp C2

2024-09-27 Thread Richard Biener
On Mon, 23 Sep 2024, Artemiy Volkov wrote:

> Whenever C1 and C2 are integer constants, X is of a wrapping type, and
> cmp is a relational operator, the expression X +- C1 cmp C2 can be
> simplified in the following cases:
> 
> (a) If cmp is <= and C2 -+ C1 == +INF(1), we can transform the initial
> comparison in the following way:
>X +- C1 <= C2
>-INF <= X +- C1 <= C2 (add left hand side which holds for any X, C1)
>-INF -+ C1 <= X <= C2 -+ C1 (add -+C1 to all 3 expressions)
>-INF -+ C1 <= X <= +INF (due to (1))
>-INF -+ C1 <= X (eliminate the right hand side since it holds for any X)
> 
> (b) By analogy, if cmp is >= and C2 -+ C1 == -INF(1), use the following
> sequence of transformations:
> 
>X +- C1 >= C2
>+INF >= X +- C1 >= C2 (add left hand side which holds for any X, C1)
>+INF -+ C1 >= X >= C2 -+ C1 (add -+C1 to all 3 expressions)
>+INF -+ C1 >= X >= -INF (due to (1))
>+INF -+ C1 >= X (eliminate the right hand side since it holds for any X)
> 
> (c) The > and < cases are negations of (a) and (b), respectively.
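
Case (a) with op == plus can be checked exhaustively for a narrow
unsigned type, where -INF is 0 and +INF is the all-ones value.  This
sketch covers only the unsigned variant; the function name and the
BITS parameter are illustrative assumptions.

```c
#include <assert.h>

/* For every C1 pick C2 such that C2 - C1 == +INF, then count the X for
   which X + C1 <= C2 and X >= -INF - C1 disagree.  */
unsigned long count_wrap_mismatches (unsigned bits)
{
  assert (bits >= 1 && bits <= 10);
  const unsigned max = (1u << bits) - 1;    /* +INF; -INF is 0 */
  unsigned long bad = 0;
  for (unsigned c1 = 0; c1 <= max; c1++)
    {
      unsigned c2 = (c1 + max) & max;       /* so that C2 - C1 == +INF */
      unsigned bound = (0u - c1) & max;     /* -INF - C1 */
      for (unsigned x = 0; x <= max; x++)
        {
          unsigned sum = (x + c1) & max;    /* wrapping X + C1 */
          bad += (sum <= c2) != (x >= bound);
        }
    }
  return bad;
}
```

The counter is expected to return 0 for every supported width.
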
> 
> This transformation occasionally saves add / sub instructions,
> for instance the expression
> 
> 3 + (uint32_t)f() < 2
> 
> compiles to
> 
> cmn w0, #4
> cset    w0, ls
> 
> instead of
> 
> add w0, w0, 3
> cmp w0, 2
> cset    w0, ls
> 
> on aarch64.
> 
> Testcases that go together with this patch have been split into two
> separate files, one containing testcases for unsigned variables and the
> other for wrapping signed ones (and thus compiled with -fwrapv).
> Additionally, one aarch64 test has been adjusted since the patch has
> caused the generated code to change from
> 
> cmn w0, #2
> csinc   w0, w1, wzr, cc   (x < -2)
> 
> to
> 
> cmn w0, #3
> csinc   w0, w1, wzr, cs   (x <= -3)
> 
> This patch has been bootstrapped and regtested on aarch64, x86_64, and
> i386, and additionally regtested on riscv32.
> 
> gcc/ChangeLog:
> 
> PR tree-optimization/116024
> * match.pd: New transformation around integer comparison.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.dg/tree-ssa/pr116024-2.c: New test.
> * gcc.dg/tree-ssa/pr116024-2-fwrapv.c: Ditto.
> * gcc.target/aarch64/gtu_to_ltu_cmp_1.c: Adjust.
> 
> Signed-off-by: Artemiy Volkov 
> ---
>  gcc/match.pd  | 44 ++-
>  .../gcc.dg/tree-ssa/pr116024-2-fwrapv.c   | 38 
>  gcc/testsuite/gcc.dg/tree-ssa/pr116024-2.c| 38 
>  .../gcc.target/aarch64/gtu_to_ltu_cmp_1.c |  2 +-
>  4 files changed, 120 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr116024-2-fwrapv.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr116024-2.c
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index bf3b4a2e3fe..3275a69252f 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -8896,6 +8896,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (cmp @0 { TREE_OVERFLOW (res)
>? drop_tree_overflow (res) : res; }
>  (for cmp (lt le gt ge)
> + rcmp (gt ge lt le)
>   (for op (plus minus)
>rop (minus plus)
>(simplify
> @@ -8923,7 +8924,48 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> "X cmp C2 -+ C1"),
>WARN_STRICT_OVERFLOW_COMPARISON);
>   }
> - (cmp @0 { res; })
> + (cmp @0 { res; })
> +/* For wrapping types, simplify the following cases of X +- C1 CMP C2:
> +
> +   (a) If CMP is <= and C2 -+ C1 == +INF (1), simplify to X >= -INF -+ C1
> +   by observing the following:
> +
> + X +- C1 <= C2
> +  ==>  -INF <= X +- C1 <= C2 (add left hand side which holds for any X, C1)
> +  ==>  -INF -+ C1 <= X <= C2 -+ C1 (add -+C1 to all 3 expressions)
> +  ==>  -INF -+ C1 <= X <= +INF (due to (1))
> +  ==>  -INF -+ C1 <= X (eliminate the right hand side since it holds for any 
> X)
> +
> +(b) Similarly, if CMP is >= and C2 -+ C1 == -INF (1):
> +
> + X +- C1 >= C2
> +  ==>  +INF >= X +- C1 >= C2 (add left hand side which holds for any X, C1)
> +  ==>  +INF -+ C1 >= X >= C2 -+ C1 (add -+C1 to all 3 expressions)
> +  ==>  +INF -+ C1 >= X >= -INF (due to (1))
> +  ==>  +INF -+ C1 >= X (eliminate the right hand side since it holds for any 
> X)
> +
> +(c) The > and < cases are negations of (a) and (b), respectively.  */
> +   (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))

This only works for signed types, right?  So add a !TYPE_UNSIGNED check.
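
For the signed case, the reasoning in the quoted comment can be checked by brute force on a small type. The following is only a sketch: wrap8 is a hypothetical helper that models -fwrapv arithmetic on an 8-bit two's-complement integer, and only the PLUS form (X + C1 cmp C2) of cases (a) and (b) is covered.

```c
#include <assert.h>

/* Wrap V into the range of a signed 8-bit two's-complement integer,
   modelling -fwrapv arithmetic on a small type (hypothetical helper).  */
static int
wrap8 (int v)
{
  v &= 0xff;
  return v >= 128 ? v - 256 : v;
}

/* Case (a): with C2 chosen so that C2 - C1 == +INF, count the (X, C1)
   pairs for which X + C1 <= C2 and X >= -INF - C1 disagree.  */
static int
count_case_a_mismatches (void)
{
  int bad = 0;
  for (int c1 = -128; c1 <= 127; c1++)
    {
      int c2 = wrap8 (127 + c1);   /* Ensures C2 - C1 == +INF.  */
      for (int x = -128; x <= 127; x++)
        {
          int before = wrap8 (x + c1) <= c2;
          int after = x >= wrap8 (-128 - c1);
          bad += before != after;
        }
    }
  return bad;
}

/* Case (b): with C2 chosen so that C2 - C1 == -INF, count the (X, C1)
   pairs for which X + C1 >= C2 and X <= +INF - C1 disagree.  */
static int
count_case_b_mismatches (void)
{
  int bad = 0;
  for (int c1 = -128; c1 <= 127; c1++)
    {
      int c2 = wrap8 (-128 + c1);  /* Ensures C2 - C1 == -INF.  */
      for (int x = -128; x <= 127; x++)
        {
          int before = wrap8 (x + c1) >= c2;
          int after = x <= wrap8 (127 - c1);
          bad += before != after;
        }
    }
  return bad;
}

/* Both counts are expected to be zero.  */
static void
check_cases (void)
{
  assert (count_case_a_mismatches () == 0);
  assert (count_case_b_mismatches () == 0);
}
```

Calling check_cases () from a test driver exercises all 2 * 256 * 256 combinations; it says nothing about wider types beyond what the derivation itself argues.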

> + (with
> +   {
> + wide_int max = wi::max_value (TREE_TYPE (@0));
> + wide_int min = wi::min_value (TREE_TYPE (@0));
> +
> + wide_int c2 = rop == PLUS_EXPR
> +   ? wi::add (wi::to_wide (@2), wi::to_wide (@1))
> +   : wi::sub (wi::to_wide (@2), wi::to_wide (@1));
> + }
> + (if (((cmp == LE_EXPR || cmp == GT_EXPR) && wi::eq_p (c2, max))
> + || ((cmp == LT_EXPR || cmp == GE_EXPR) && wi::eq_p (c2, min)))
> +   

Re: [PATCH] aarch64: fix build failure on aarch64-none-elf

2024-09-27 Thread Matthieu Longo

On 2024-09-26 18:41, Andrew Pinski wrote:

On Thu, Sep 26, 2024 at 10:28 AM Andrew Pinski  wrote:


On Thu, Sep 26, 2024 at 10:15 AM Matthieu Longo  wrote:


A previous patch ([1]) introduced a build regression on the aarch64-none-elf
target. The changes were primarily tested on aarch64-unknown-linux-gnu,
so the issue was missed during development.
The includes are slightly different between the two targets, and due to some
include rules ([2]), "aarch64-unwind-def.h" was not found.

[1]: 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=bdf41d627c13bc5f0dc676991f4513daa9d9ae36

[2]: https://gcc.gnu.org/onlinedocs/cpp/Include-Syntax.html

#include "file"
...  It searches for a file named file first in the directory
containing the current file, ...


Can you provide more information on this because I am not sure this is
the right fix.
aarch64-unwind.h and aarch64-unwind-def.h are in the same directory so
an include from aarch64-unwind.h should find aarch64-unwind-def.h that
is in the same directory.
Maybe post what strace shows to see what files are being looked for.


Oh looking into the issue some more, aarch64-unwind.h gets linked to
md-unwind-support.h in the build directory and that is used.
I see there was a new makefile target added for md-unwind-def.h. Maybe
this should be including md-unwind-def.h instead of
aarch64-unwind-def.h.

Thanks,
Andrew Pinski



I attached the trace logs for the failure case (aarch64-none-elf).

In my understanding, what happens here is the following:

1. Symbolic links in //libgcc:

 a. for aarch64-none-elf:

md-unwind-def.h -> /libgcc/config/aarch64/aarch64-unwind-def.h
md-unwind-support.h -> /libgcc/config/aarch64/aarch64-unwind.h

 b. for aarch64-unknown-linux-gnu:

md-unwind-def.h -> /libgcc/config/aarch64/aarch64-unwind-def.h
md-unwind-support.h -> /libgcc/config/aarch64/linux-unwind.h

2. Includes sequence

Note: "->" means "#include "the_file""

 a. for aarch64-none-elf:

unwind-dw2.c
 * -> unwind-dw2.h -> md-unwind-def.h => symbolic link to 
config/aarch64/aarch64-unwind-def.h
 * -> md-unwind-support.h => symbolic link to 
config/aarch64/aarch64-unwind.h -> aarch64-unwind-def.h


 b. for aarch64-unknown-linux-gnu:

unwind-dw2.c
 * -> unwind-dw2.h -> md-unwind-def.h => symbolic link to 
config/aarch64/aarch64-unwind-def.h
 * -> md-unwind-support.h => symbolic link to 
config/aarch64/linux-unwind.h -> config/aarch64/aarch64-unwind.h -> 
aarch64-unwind-def.h


3. My understanding

In the case 2.b, md-unwind-support.h points to 
config/aarch64/linux-unwind.h, which then includes 
config/aarch64/aarch64-unwind.h

According to [1]
> #include "file"
> ...  It searches for a file named file first in the directory
> containing the current file, ...
This means that once config/aarch64/aarch64-unwind.h is found, the 
compiler searches for aarch64-unwind-def.h in the same directory as 
aarch64-unwind.h (i.e. config/aarch64/), hence it finds the file.


In the case 2.a, md-unwind-support.h points to 
config/aarch64/aarch64-unwind.h, which then includes aarch64-unwind-def.h.
However, in this case, the current file directory is not 
config/aarch64/, but //libgcc (the directory of the 
symlink). That's why aarch64-unwind-def.h cannot be found.
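
To make the two cases concrete, here is a small model of the first lookup step as described above. It is only a sketch of the explanation in this message, not of the actual cpp implementation; first_candidate and the "build/" and "src/" path prefixes are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write into OUT the first path that would be tried for
   #include "NAME" when the including file was opened as INCLUDER.
   The directory is taken textually from INCLUDER, so a symlink
   contributes the directory it lives in, not that of its target.  */
static void
first_candidate (const char *includer, const char *name,
                 char *out, size_t out_size)
{
  const char *slash = strrchr (includer, '/');
  if (slash == NULL)
    snprintf (out, out_size, "%s", name);
  else
    snprintf (out, out_size, "%.*s/%s",
              (int) (slash - includer), includer, name);
}

/* Case 2.a opens the header through the symlink in the build directory;
   case 2.b reaches it through its real path under config/aarch64/.  */
static void
check_lookup (void)
{
  char buf[256];

  first_candidate ("build/libgcc/md-unwind-support.h",
                   "aarch64-unwind-def.h", buf, sizeof buf);
  assert (strcmp (buf, "build/libgcc/aarch64-unwind-def.h") == 0);

  first_candidate ("src/libgcc/config/aarch64/aarch64-unwind.h",
                   "aarch64-unwind-def.h", buf, sizeof buf);
  assert (strcmp (buf,
                  "src/libgcc/config/aarch64/aarch64-unwind-def.h") == 0);
}
```

In the first call the candidate lands next to the symlink, where no aarch64-unwind-def.h exists; in the second it lands in config/aarch64/, where it does.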


4. The fix

The fix I proposed provides the full path to aarch64-unwind-def.h, as it 
is done in config/aarch64/linux-unwind.h for 
config/aarch64/aarch64-unwind.h.
Including md-unwind-def.h instead of aarch64-unwind-def.h in 
config/aarch64/aarch64-unwind.h should work, but it makes more sense 
from my point of view to include the target file instead.


Thanks,
Matthieu.

[1]: https://gcc.gnu.org/onlinedocs/cpp/Include-Syntax.html



libgcc/ChangeLog:

 * config/aarch64/aarch64-unwind.h: fix header path.
---
  libgcc/config/aarch64/aarch64-unwind.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgcc/config/aarch64/aarch64-unwind.h 
b/libgcc/config/aarch64/aarch64-unwind.h
index 2b774eb263c..4d36f0b26f7 100644
--- a/libgcc/config/aarch64/aarch64-unwind.h
+++ b/libgcc/config/aarch64/aarch64-unwind.h
@@ -25,7 +25,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
  #if !defined (AARCH64_UNWIND_H) && !defined (__ILP32__)
  #define AARCH64_UNWIND_H

-#include "aarch64-unwind-def.h"
+#include "config/aarch64/aarch64-unwind-def.h"

  #include "ansidecl.h"
  #include 
--
2.46.1

5170  execve("/data/build-dir/gcc-aarch64-none-elf/./gcc/xgcc", 
["/data/build-dir/gcc-aarch64-none"..., "-B/data/build-dir/gcc-aarch64-no"..., 
"-B/data/common-sysroot/aarch64-n"..., "-B/data/common-sysroot/aarch64-n"..., 
"-isystem", "/data/common-sysroot/aarch64-non"..., "-isystem", 
"/data/common-sysroot/aarch64-non"..., "-g", "-O2", "-O2", "-g", "-O2", 
"-DIN_GCC", "-DCROSS_DIRECTORY_STRUCTURE", "-W", "-Wall", 
"-Wno-error=narrowing", "-Wwrite-strings", "-Wcast-qual", 
"-Wstrict-prototypes", "-Wmissing-prototypes", "-Wold-style-definition", 
"-isystem", "./include", "-g", "-DIN_LIBGCC2", "-fbuilding-libgcc", 
"-fno-stack-pr

Re: [PATCH] c++: compile time evaluation of prvalues [PR116416]

2024-09-27 Thread Richard Biener
On Fri, 27 Sep 2024, Jakub Jelinek wrote:

> On Fri, Sep 27, 2024 at 08:16:43AM +0200, Richard Biener wrote:
> > > __attribute__((noinline))
> > > struct ref_proxy f ()
> > > {
> > >struct ref_proxy ptr;
> > >struct ref_proxy D.10036;
> > >struct ref_proxy type;
> > >struct ref_proxy type;
> > >struct qual_option D.10031;
> > >struct ref_proxy D.10030;
> > >struct qual_option inner;
> > >struct variant t;
> > >struct variant D.10026;
> > >struct variant D.10024;
> > >struct inplace_ref D.10023;
> > >struct inplace_ref ptr;
> > >struct ref_proxy D.9898;
> > > 
> > > [local count: 1073741824]:
> > >MEM  [(struct variant *)&D.10024] = {};
> > 
> > Without actually checking it might be that SRA chokes on the above.
> > The IL is basically a huge chain of aggregate copies interspersed
> > with clobbers and occasional scalar inits and I fear that we really
> > only have SRA dealing with this.
> 
> True.
> 
> > Is there any reason to use the char[40] init instead of an aggregate
> > {} init of type variant?
> 
> It is dse1 which introduces that:
> -  D.10137 = {};
> +  MEM  [(struct variant *)&D.10137] = {};
> in particular maybe_trim_constructor_store.
> 
> So, if SRA chokes on it, it better be fixed to deal with that,
> DSE can create that any time.

Agreed, it might be non-trivial though.

> Though, not sure how to differentiate that from the actual C++ zero
> initialization where it is supposed to clear also padding bits if any.
> I think a CONSTRUCTOR flag for that would be best, though e.g. in GIMPLE
> representing such clears including padding bits with
> MEM  [(struct whatever *)&D.whatever] = {};
> might be an option too.  But then how to represent the DSE constructor
> trimming such that it is clear that it still doesn't initialize the padding
> bits?
> Anyway, even if padding bits are zero initialized, if SRA could see that
> nothing really inspects those padding bits, it would be nice to still optimize
> it.
>
> That said, it is
> a union of a struct with 5 pointers (i.e. 40 bytes) and an empty struct
> (1 byte, with padding) followed by size_t which_ field (the = 2 store).
> 
> And, I believe when not constexpr evaluating this, there actually is no
> clearing before the = 2; store,
> void 
> eggs::variants::detail::_storage  option_1, option_2>, true, true>::_storage<2, option_2> (struct _storage * const this, struct index which, struct option_2 & args#0)
> {
>   struct index D.10676;
> 
>   *this = {CLOBBER(bob)};
>   {
> _1 = &this->D.9542;
> 
> eggs::variants::detail::_union  option_1, option_2>, true>::_union<2, option_2> (_1, D.10676, args#0);
> this->_which = 2;
>   }
> }
> and the call in there is just 3 nested inline calls which do some clobbers
> too, take address of something and call another inline and in the end
> it is just a clobber and nothing else.
> 
> So, another thing to look at is whether the CONSTRUCTOR is
> CONSTRUCTOR_NO_CLEARING or not and if that is ok, and/or whether
> the gimplification is correct in that case (say if it would be
> struct S { union U { struct T { void *a, *b, *c, *d, *e; } t; struct V {} v; } u; unsigned long w; };
> void bar (S *);
> 
> void
> foo ()
> {
>   S s = { .u = { .v = {} }, .w = 2 };
>   bar (&s);
> }
> why do we expand it as
>   s = {};
>   s.w = 2;
> when just
>   s.w = 2;
> or maybe
>   s.u.v = {};
>   s.w = 2;
> would be enough.  Because when the large union has just a small member
> (in this case empty struct) active, clearing the whole union is really a
> waste of time.
> 
> > I would suggest to open a bugreport.
> 
> Yes.

I can investigate a bit when there's a testcase showing the issue.

Richard.


Re: [PATCH v3 3/4] tree-optimization/116024 - simplify C1-X cmp C2 for wrapping signed types

2024-09-27 Thread Richard Biener
On Mon, 23 Sep 2024, Artemiy Volkov wrote:

> Implement a match.pd transformation inverting the sign of X in
> C1 - X cmp C2, where C1 and C2 are integer constants and X is
> of a wrapping signed type, by observing that:
> 
> (a) If cmp is == or !=, simply move X and C2 to opposite sides of
> the comparison to arrive at X cmp C1 - C2.
> 
> (b) If cmp is <:
>   - C1 - X < C2 means that C1 - X spans the values of -INF,
> -INF + 1, ..., C2 - 1;
> - Therefore, X is one of C1 - -INF, C1 - (-INF + 1), ...,
> C1 - C2 + 1;
>   - Subtracting (C1 + 1), X - (C1 + 1) is one of - (-INF) - 1,
>   - (-INF) - 2, ..., -C2;
> - Using the fact that - (-INF) - 1 is +INF, derive that
>   X - (C1 + 1) spans the values +INF, +INF - 1, ..., -C2;
> - Thus, the original expression can be simplified to
>   X - (C1 + 1) > -C2 - 1.
> 
> (c) Similarly, C1 - X <= C2 is equivalent to X - (C1 + 1) >= -C2 - 1.
> 
> (d) The >= and > cases are negations of (b) and (c), respectively.
> 
> (e) In all cases, the expression -C2 - 1 can be shortened to
> bit_not (C2).
> 
> This transformation occasionally saves load-immediate /
> subtraction instructions, e.g. the following statement:
> 
> 10 - (int)f() >= 20;
> 
> now compiles to
> 
> addi    a0,a0,-11
> slti    a0,a0,-20
> 
> instead of
> 
> li      a5,10
> sub     a0,a5,a0
> slti    t0,a0,20
> xori    a0,t0,1
> 
> on 32-bit RISC-V when compiled with -fwrapv.
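
The derivation in (b), (c) and (e) boils down to X - (C1 + 1) being the bitwise complement of C1 - X in wrapping arithmetic, and complement reversing the signed order. A brute-force sketch on an 8-bit model, where wrap8 is a hypothetical helper standing in for -fwrapv arithmetic:

```c
#include <assert.h>

/* Wrap V into the range of a signed 8-bit two's-complement integer,
   modelling -fwrapv arithmetic on a small type (hypothetical helper).  */
static int
wrap8 (int v)
{
  v &= 0xff;
  return v >= 128 ? v - 256 : v;
}

/* Count the (X, C1, C2) triples for which C1 - X < C2 disagrees with
   X - (C1 + 1) > ~C2, or C1 - X <= C2 disagrees with
   X - (C1 + 1) >= ~C2.  */
static int
count_mismatches (void)
{
  int bad = 0;
  for (int c1 = -128; c1 <= 127; c1++)
    for (int c2 = -128; c2 <= 127; c2++)
      for (int x = -128; x <= 127; x++)
        {
          int lhs = wrap8 (c1 - x);        /* Original left-hand side.  */
          int rhs = wrap8 (x - (c1 + 1));  /* Transformed left-hand side.  */
          bad += (lhs < c2) != (rhs > ~c2);
          bad += (lhs <= c2) != (rhs >= ~c2);
        }
  return bad;
}

/* The count is expected to be zero; the second assertion spells out
   the identity behind the transformation for one sample input.  */
static void
check_transformation (void)
{
  assert (count_mismatches () == 0);
  assert (wrap8 (7 - (10 + 1)) == ~wrap8 (10 - 7));
}
```

The >= and > cases follow by negating both sides, so they are not checked separately here.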
> 
> Additional examples can be found in the newly added test file.  This
> patch has been bootstrapped and regtested on aarch64, x86_64, and i386,
> and additionally regtested on riscv32.
> 
> gcc/ChangeLog:
> 
> PR tree-optimization/116024
> * match.pd: New transformation around integer comparison.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.dg/tree-ssa/pr116024-1-fwrapv.c: New test.
> 
> Signed-off-by: Artemiy Volkov 
> ---
>  gcc/match.pd  | 20 -
>  .../gcc.dg/tree-ssa/pr116024-1-fwrapv.c   | 73 +++
>  2 files changed, 92 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index d0489789527..bf3b4a2e3fe 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -8970,7 +8970,25 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(cmp (plus @1 (minus @2 @0)) @2))
> (if (cmp == LT_EXPR || cmp == GE_EXPR)
>(cmp (plus @1 (minus @2
> -(plus @0 { build_one_cst (TREE_TYPE (@1)); }))) @2)))
> +(plus @0 { build_one_cst (TREE_TYPE (@1)); }))) @2)))
> +/* For wrapping signed types (-fwrapv), transform like so (using < as 
> example):
> +  C1 - X < C2
> +  ==>  C1 - X = { -INF, -INF + 1, ..., C2 - 1 }
> +  ==>  X = { C1 - (-INF), C1 - (-INF + 1), ..., C1 - C2 + 1 }
> +  ==>  X - (C1 + 1) = { - (-INF) - 1, - (-INF) - 2, ..., -C2 }
> +  ==>  X - (C1 + 1) = { +INF, +INF - 1, ..., -C2 }
> +  ==>  X - (C1 + 1) > -C2 - 1
> +  ==>  X - (C1 + 1) > bit_not (C2)
> +
> +  Similarly,
> +  C1 - X <= C2 ==> X - (C1 + 1) >= bit_not (C2);
> +  C1 - X >= C2 ==> X - (C1 + 1) <= bit_not (C2);
> +  C1 - X > C2 ==> X - (C1 + 1) < bit_not (C2).  */
> +   (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))

You need to add && !TYPE_UNSIGNED (TREE_TYPE (@1)) here, 
TYPE_OVERFLOW_WRAPS is also true for unsigned types.

OK with that change.

Thanks,
Richard.

> + (if (cmp == EQ_EXPR || cmp == NE_EXPR)
> + (cmp @1 (minus @0 @2))
> + (rcmp (minus @1 (plus @0 { build_one_cst (TREE_TYPE (@1)); }))
> +  (bit_not @2
>  
>  /* Canonicalizations of BIT_FIELD_REFs.  */
>  
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c
> new file mode 100644
> index 000..c2bf1d17234
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c
> @@ -0,0 +1,73 @@
> +/* PR tree-optimization/116024 */
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -fdump-tree-forwprop1-details -fwrapv" } */
> +
> +#include 
> +
> +uint32_t f(void);
> +
> +int32_t i2(void)
> +{
> +  int32_t l = 2;
> +  l = 10 - (int32_t)f();
> +  return l <= 20; // f() - 11 >= -21
> +}
> +
> +int32_t i2a(void)
> +{
> +  int32_t l = 2;
> +  l = 10 - (int32_t)f();
> +  return l < 30; // f() - 11 > -31
> +}
> +
> +int32_t i2b(void)
> +{
> +  int32_t l = 2;
> +  l = 200 - (int32_t)f();
> +  return l <= 100; // f() - 201 >= -101
> +}
> +
> +int32_t i2c(void)
> +{
> +  int32_t l = 2;
> +  l = 300 - (int32_t)f();
> +  return l < 100; // f() - 301 > -101
> +}
> +
> +int32_t i2d(void)
> +{
> +  int32_t l = 2;
> +  l = 1000 - (int32_t)f();
> +  return l >= 2000; // f() - 1001 <= -2001
> +}
> +
> +int32_t i2e(void)
> +{
> +  int32_t l = 2;
> +  l = 1000 - (int32_t)f();
> +  return l > 3000; // f() - 1001 < -3001
> +}
> +
> +int32_t i2f(void)
> +{
> +  int32_t l = 2;
> +  l = 2 - (int32_t)f();

Re: [PING] [PATCH v2] gimple ssa: Don't use __builtin_popcount in switch exp transform

2024-09-27 Thread Richard Biener
On Mon, 23 Sep 2024, Filip Kastl wrote:

> Hi,
> 
> I'd like to ping my patch.  You can find it here
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662744.html
> 
> Btw I forgot to include [PR116616] in the subject.  Hope I didn't confuse
> people.  I will take care to include the tag in the git commit message.

OK.

Thanks,
Richard.

> Thanks,
> Filip Kastl
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH v2 2/4] tree-object-size: Fold PHI node offsets with constants [PR116556]

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 20, 2024 at 12:40:27PM -0400, Siddhesh Poyarekar wrote:
> --- a/gcc/tree-object-size.cc
> +++ b/gcc/tree-object-size.cc
> @@ -1468,6 +1468,63 @@ merge_object_sizes (struct object_size_info *osi, tree 
> dest, tree orig)
>return bitmap_bit_p (osi->reexamine, SSA_NAME_VERSION (orig));
>  }
>  
> +/* For constant sizes, try collapsing a non-constant offset to a constant if
> +   possible.  The case handled at the moment is when the offset is a PHI node
> +   with all of its targets are constants.  */
> +
> +static tree
> +try_collapsing_offset (tree op, int object_size_type)
> +{
> +  gcc_assert (!(object_size_type & OST_DYNAMIC));
> +
> +  if (TREE_CODE (op) != SSA_NAME)
> +return op;
> +
> +  gimple *stmt = SSA_NAME_DEF_STMT (op);
> +
> +  switch (gimple_code (stmt))
> +{
> +case GIMPLE_ASSIGN:
> +  /* Peek through casts.  */
> +  if (gimple_assign_rhs_code (stmt) == NOP_EXPR)

Do you really want to handle all casts?  That could be widening, narrowing,
from non-integral types, ...
If only same mode, then gimple_assign_unary_nop_p probably should be used
instead, if any from integral types (same precision, widening, narrowing),
then perhaps CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt))
but verify that INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt)))
before actually recursing.
Note, I think narrowing or widening casts to sizetype are fine, but when
you recurse through, other casts might not be anymore.
Consider
  short int _1;
  unsigned int _2;
  size_t _3;
...
  # _1 = PHI <-10(7), 12(8), 4(9), -42(10)>
  _2 = (unsigned int) _1;
  _3 = (size_t) _2;
If the recursion finds minimum or maximum from the signed short int _1
values (cast to ptrdiff_type_node), pretending that is the minimum or
maximum for _3 is just wrong, as the cast from signed to unsigned will
turn negative values to something larger than the smaller positive values.
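
A plain C sketch of the same example, using the four PHI arguments and the cast chain from above. The function and array names are only illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* The PHI arguments of _1 from the example.  */
static const short phi_args[] = { -10, 12, 4, -42 };
enum { N_PHI_ARGS = sizeof phi_args / sizeof phi_args[0] };

/* Apply the cast chain _2 = (unsigned int) _1; _3 = (size_t) _2.  */
static size_t
cast_chain (short v)
{
  unsigned int u = (unsigned int) v;
  return (size_t) u;
}

/* Minimum computed on the signed short PHI arguments.  */
static short
min_before_casts (void)
{
  short m = phi_args[0];
  for (int i = 1; i < N_PHI_ARGS; i++)
    if (phi_args[i] < m)
      m = phi_args[i];
  return m;
}

/* Minimum of the values _3 can actually take.  */
static size_t
min_after_casts (void)
{
  size_t m = cast_chain (phi_args[0]);
  for (int i = 1; i < N_PHI_ARGS; i++)
    if (cast_chain (phi_args[i]) < m)
      m = cast_chain (phi_args[i]);
  return m;
}

/* The signed minimum is -42, but after the casts the smallest value
   is 4, and the converted -42 is larger than every positive input.  */
static void
check_example (void)
{
  assert (min_before_casts () == -42);
  assert (min_after_casts () == 4);
  assert (cast_chain (-42) > cast_chain (12));
}
```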

Similarly, consider casts from unsigned __int128 -> unsigned short -> size_t
(or signed short in between), what is minimum/maximum in 128-bits (the code
for PHIs actually ignores the upper bits and looks only for signed 64-bits,
but if there is unfolded cast from INTEGER_CST you actually could have even
large value) isn't necessarily minimum after cast to 16-bit (unsigned or
signed).

So, unless you want to take all the possibilities into account, perhaps only
recurse through casts from integer types to integer types with precision
of sizetype?

And perhaps also look for INTEGER_CST type returned from the recursive
call and if it doesn't have sizetype precision, either convert it to
sizetype or ignore.

> + {
> +   tree ret = try_collapsing_offset (gimple_assign_rhs1 (stmt),
> + object_size_type);
> +   if (TREE_CODE (ret) == INTEGER_CST)
> + return ret;
> + }
> +  break;
> +case GIMPLE_PHI:
> + {
> +   tree off = ((object_size_type & OST_MINIMUM)
> +   ? TYPE_MIN_VALUE (ptrdiff_type_node)
> +   : TYPE_MAX_VALUE (ptrdiff_type_node));

Because you only process constants, I wonder whether using
wide_int off and performing the min/max on wide_int wouldn't be
better, no need to create INTEGER_CSTs for all the intermediate
results.
That would be
  unsigned int prec = TYPE_PRECISION (ptrdiff_type_node);
  wide_int off = ((object_size_type & OST_MINIMUM)
  ? wi::to_wide (TYPE_MIN_VALUE (ptrdiff_type_node))
  : wi::to_wide (TYPE_MAX_VALUE (ptrdiff_type_node)));

> +
> +   for (unsigned i = 0; i < gimple_phi_num_args (stmt); i++)
> + {
> +   tree rhs = gimple_phi_arg_def (stmt, i);
> +

I wonder if it wouldn't be useful to recurse here,
  rhs = try_collapsing_offset (rhs, object_size_type);
but guess with some extra counter argument and only allow some small
constant levels of nesting (but also do that for the cast cases).

> +   if (TREE_CODE (rhs) != INTEGER_CST)
> + return op;
> +
> +   /* Note that this is the *opposite* of what we usually do with
> +  sizes, because the maximum offset estimate here will give us a
> +  minimum size estimate and vice versa.  */
> +   enum tree_code code = (object_size_type & OST_MINIMUM
> +  ? MAX_EXPR : MIN_EXPR);
> +
> +   off = fold_build2 (code, ptrdiff_type_node, off,
> +  fold_convert (ptrdiff_type_node, rhs));

And this could be

  wide_int w = wi::to_wide (rhs, prec);
  if (object_size_type & OST_MINIMUM)
off = wi::smax (off, w);
  else
off = wi::smin (off, w);

> + }
> +   return fold_convert (sizetype, off);

  return wide_int_to_tree (sizetype, off);
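
Outside of GCC, the shape of that loop can be sketched in plain C. This is only an analogy: int64_t stands in for ptrdiff_type_node precision, the minimum_size flag stands in for the OST_MINIMUM bit, and collapse_constant_offsets is a hypothetical name, not a function in tree-object-size.cc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fold N constant PHI arguments into one offset without building any
   intermediate objects.  For a minimum size estimate take the maximum
   offset, and vice versa, mirroring the comment in the patch.  */
static int64_t
collapse_constant_offsets (const int64_t *args, size_t n,
                           bool minimum_size)
{
  int64_t off = minimum_size ? INT64_MIN : INT64_MAX;
  for (size_t i = 0; i < n; i++)
    if (minimum_size ? args[i] > off : args[i] < off)
      off = args[i];
  return off;
}

/* With the offsets -10, 12, 4 and -42, a minimum size estimate uses
   the largest offset and a maximum size estimate the smallest.  */
static void
check_collapse (void)
{
  static const int64_t args[] = { -10, 12, 4, -42 };
  assert (collapse_constant_offsets (args, 4, true) == 12);
  assert (collapse_constant_offsets (args, 4, false) == -42);
}
```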

> + }
> +default:
> +  break;
> +}
> +
> +  /* Nothing worked, so return OP untouched.  */
> +  return op;
> +}
>  
>  /* Compute 

[pushed] libgcc, Darwin: Don't build legacy libgcc_s.1 on macOS 14 [PR116809]

2024-09-27 Thread Iain Sandoe
From: Mark Mentovai 

I pushed this for Mark, after testing locally on x86_64-darwin21, 23,
thanks,
Iain

--- 8< ---

d9cafa0c4f0a stopped building libgcc_s.1 on macOS >= 15, in part because
that is required to bootstrap the compiler using the macOS 15 SDK. The
macOS 15 SDK ships in Xcode 16, which also runs on macOS 14. libgcc_s.1
can no longer be built on macOS 14 using Xcode 16 by the same logic that
the previous change disabled it for macOS 15.

PR target/116809

libgcc/ChangeLog:

* config.host: Don't build legacy libgcc_s.1 on macOS 14.

Signed-off-by: Mark Mentovai 
---
 libgcc/config.host | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgcc/config.host b/libgcc/config.host
index 00bd6384c0f..fa001c5e900 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -239,7 +239,7 @@ case ${host} in
   esac
   tmake_file="$tmake_file t-slibgcc-darwin"
   case ${host} in
-x86_64-*-darwin2[0-3]*)
+x86_64-*-darwin2[0-2]*)
   tmake_file="t-darwin-min-11 t-darwin-libgccs1 $tmake_file"
   ;;
 *-*-darwin2*)
-- 
2.39.2 (Apple Git-143)
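
The effect of the pattern change can be sketched with POSIX fnmatch, which uses the same glob syntax as the shell case statement in config.host. The host triplets below are made-up examples; Darwin 23 is the kernel version that corresponds to macOS 14.

```c
#include <assert.h>
#include <fnmatch.h>

/* The case pattern before and after the change.  */
static const char old_pattern[] = "x86_64-*-darwin2[0-3]*";
static const char new_pattern[] = "x86_64-*-darwin2[0-2]*";

/* Return nonzero if HOST matches the shell-style PATTERN.  */
static int
host_matches (const char *pattern, const char *host)
{
  return fnmatch (pattern, host, 0) == 0;
}

/* darwin23 (macOS 14) used to select the legacy libgcc_s.1 build and
   no longer does; darwin22 (macOS 13) is unaffected.  */
static void
check_patterns (void)
{
  assert (host_matches (old_pattern, "x86_64-apple-darwin23.6.0"));
  assert (!host_matches (new_pattern, "x86_64-apple-darwin23.6.0"));
  assert (host_matches (old_pattern, "x86_64-apple-darwin22.6.0"));
  assert (host_matches (new_pattern, "x86_64-apple-darwin22.6.0"));
}
```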



Re: RFC PATCH: contrib/test_summary mode for submitting testsuite results to bunsen

2024-09-27 Thread Iain Sandoe
Hi Frank, all,

> On 26 Sep 2024, at 15:53, Frank Ch. Eigler  wrote:

>> Regarding functionality, perfect enough AFAICT.  I was going to 
>> make a nitpick comment about comments with full sentences and 
>> all that GNU...but better be consistent with the rest of the 
>> file.  Thanks!
> 
> I don't mind addressing even nitpicks, while awaiting word from
> someone who can approve a push to the contrib/ dir.

I also have a version that works on macOS/Darwin (there are one or
two small differences to available tools);  currently, it’s got some things
specific to my test boxes - but could be generalised.  I note this here;
if anyone wants it - then I’m happy to try the generalisation - but short
of time at present.

Iain




[PATCH] c++/modules: Propagate purview/import for templates in duplicate_decls [PR116803]

2024-09-27 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

We need to ensure that, for a declaration in the module purview, the
resulting declaration has PURVIEW_P set and IMPORT_P cleared so that we
understand it might be something requiring exporting.  This is normally
handled for a declaration by set_instantiating_module, but when this
declaration is a redeclaration, duplicate_decls needs to propagate this
to olddecl.

This patch only changes the logic for template declarations, because in
the non-template case the whole contents of olddecl's DECL_LANG_SPECIFIC
is replaced with newdecl's (which includes these flags), so there's
nothing to do.

PR c++/116803

gcc/cp/ChangeLog:

* decl.cc (duplicate_decls): Propagate DECL_MODULE_PURVIEW_P and
DECL_MODULE_IMPORT_P for template redeclarations.

gcc/testsuite/ChangeLog:

* g++.dg/modules/merge-18_a.H: New test.
* g++.dg/modules/merge-18_b.H: New test.
* g++.dg/modules/merge-18_c.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/decl.cc| 10 ++
 gcc/testsuite/g++.dg/modules/merge-18_a.H |  8 
 gcc/testsuite/g++.dg/modules/merge-18_b.H | 13 +
 gcc/testsuite/g++.dg/modules/merge-18_c.C | 10 ++
 4 files changed, 41 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/modules/merge-18_a.H
 create mode 100644 gcc/testsuite/g++.dg/modules/merge-18_b.H
 create mode 100644 gcc/testsuite/g++.dg/modules/merge-18_c.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 5ddb7eafa50..a81a7dd2e9e 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -2528,6 +2528,16 @@ duplicate_decls (tree newdecl, tree olddecl, bool 
hiding, bool was_hidden)
}
}
 
+  /* Propagate purviewness and importingness as with
+set_instantiating_module.  */
+  if (modules_p ())
+   {
+ if (DECL_MODULE_PURVIEW_P (new_result))
+   DECL_MODULE_PURVIEW_P (old_result) = true;
+ if (!DECL_MODULE_IMPORT_P (new_result))
+   DECL_MODULE_IMPORT_P (old_result) = false;
+   }
+
   /* If the new declaration is a definition, update the file and
 line information on the declaration, and also make
 the old declaration the same definition.  */
diff --git a/gcc/testsuite/g++.dg/modules/merge-18_a.H 
b/gcc/testsuite/g++.dg/modules/merge-18_a.H
new file mode 100644
index 000..8d86ad980ba
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/merge-18_a.H
@@ -0,0 +1,8 @@
+// PR c++/116803
+// { dg-additional-options "-fmodule-header" }
+// { dg-module-cmi {} }
+
+namespace ns {
+  template  void foo();
+  template  extern const int bar;
+}
diff --git a/gcc/testsuite/g++.dg/modules/merge-18_b.H 
b/gcc/testsuite/g++.dg/modules/merge-18_b.H
new file mode 100644
index 000..2a762e2ac49
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/merge-18_b.H
@@ -0,0 +1,13 @@
+// PR c++/116803
+// { dg-additional-options "-fmodule-header -fdump-lang-module" }
+// { dg-module-cmi {} }
+
+import "merge-18_a.H";
+
+namespace ns {
+  template  void foo() {}
+  template  const int bar = 123;
+}
+
+// { dg-final { scan-lang-dump {Writing definition '::ns::template foo'} 
module } }
+// { dg-final { scan-lang-dump {Writing definition '::ns::template bar'} 
module } }
diff --git a/gcc/testsuite/g++.dg/modules/merge-18_c.C 
b/gcc/testsuite/g++.dg/modules/merge-18_c.C
new file mode 100644
index 000..b90d85f7502
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/merge-18_c.C
@@ -0,0 +1,10 @@
+// PR c++/116803
+// { dg-module-do link }
+// { dg-additional-options "-fmodules-ts" }
+
+import "merge-18_b.H";
+
+int main() {
+  ns::foo();
+  static_assert(ns::bar == 123);
+}
-- 
2.46.0



Re: [nvptx PATCH] Implement isfinite and isnormal optabs in nvptx.md.

2024-09-27 Thread Thomas Schwinge
Hi Roger!

If you don't mind, I could use your help here (but: low priority!):

On 2024-07-27T19:18:35+0100, "Roger Sayle"  wrote:
> Previously, for isnormal, GCC -O2 would generate: [...]
> and with this patch becomes:
>
> mov.f64 %r23, %ar0;
> setp.neu.f64 %r24, %r23, 0d;
> testp.normal.f64 %r25, %r23;
> and.pred %r26, %r24, %r25;
> selp.u32 %value, 1, 0, %r26;

Looking at this, shouldn't we be able to optimize ("combine") this into
something like (untested):

mov.f64 %r23, %ar0;
testp.normal.f64 %r25, %r23;
setp.neu.and.f64 %r26, %r23, 0d, %r25;
selp.u32 %value, 1, 0, %r26;

(I hope I correctly understood PTX 'setp', 'combine [...] with a
predicate value by applying a Boolean operator'!)

That is, "combine":

CmpOp = { eq, ne, lt, le, gt, ge, lo, ls, hi, hs, equ, neu, ltu, leu, gtu, 
geu, num, nan };

BoolOp = { and, or, xor };

setp.CmpOp.TYPE %3, %2, %1;
BoolOp.pred %5, %3, %4

... into:

setp.CmpOp.BoolOp.TYPE %5, %2, %1, %4;
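
As a sanity check of the intended semantics, here is a C model of both sequences for the isnormal case. It rests on two assumptions taken from my reading of the PTX ISA text rather than from testing: setp.CmpOp.BoolOp computes (a CmpOp b) BoolOp c, and testp.normal reports +-0 as normal, which would be why the extra compare against zero is there. All function names are made up.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Assumed model of testp.normal.f64: true for normal numbers and,
   as a special case, for positive and negative zero.  */
static bool
testp_normal (double x)
{
  return x == 0.0 || isnormal (x);
}

/* Current sequence: setp.neu.f64, testp.normal.f64, and.pred.  */
static bool
isnormal_three_insns (double x)
{
  bool p_neu = x != 0.0;            /* neu: unordered or not equal.  */
  bool p_normal = testp_normal (x);
  return p_neu && p_normal;         /* and.pred  */
}

/* Proposed sequence: testp.normal.f64, then setp.neu.and.f64.  */
static bool
isnormal_two_insns (double x)
{
  bool p_normal = testp_normal (x);
  return (x != 0.0) && p_normal;    /* (a neu b) and c  */
}

/* Both models should agree with each other and with isnormal.  */
static void
check_models (void)
{
  const double inputs[] = { 1.0, -1.0, 0.0, -0.0, DBL_MIN, DBL_MIN / 2,
                            INFINITY, NAN };
  for (unsigned i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
    {
      bool expected = isnormal (inputs[i]) != 0;
      assert (isnormal_three_insns (inputs[i]) == expected);
      assert (isnormal_two_insns (inputs[i]) == expected);
    }
}
```

This only models the predicate values; it says nothing about whether combine can be taught to form the fused insn.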

I tried adding a corresponding 'define_insn' for just the 'and' case at
hand (eventually to be generalized to 'BoolOp'), see the attached
"WIP nvptx: 'setp', 'combine [...] with a predicate value by applying a Boolean 
operator'".
This does do the expected transformation for quite a number of instances
in the GCC/nvptx target libraries (again: completely untested!) -- but it
doesn't for the new 'gcc.target/nvptx/isnormal.c', and I don't know how
to read '-fdump-rtl-combine-all' to understand why.  Any "RTFM" or
other pointers are gladly accepted, as is guidance about how to approach
such an issue.  (Or tell me it's just 'TARGET_RTX_COSTS'...)


Grüße
 Thomas


> --- a/gcc/config/nvptx/nvptx.md
> +++ b/gcc/config/nvptx/nvptx.md

> +(define_insn "setcc_isnormal"
> +  [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
> + (unspec:BI [(match_operand:SDFM 1 "nvptx_register_operand" "R")]
> +UNSPEC_ISNORMAL))]
> +  ""
> +  "%.\\ttestp.normal%t1\\t%0, %1;")
> +
> +(define_expand "isnormal2"
> +  [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
> + (unspec:SI [(match_operand:SDFM 1 "nvptx_register_operand" "R")]
> +UNSPEC_ISNORMAL))]
> +  ""
> +{
> +  rtx pred1 = gen_reg_rtx (BImode);
> +  rtx pred2 = gen_reg_rtx (BImode);
> +  rtx pred3 = gen_reg_rtx (BImode);
> +  rtx zero = CONST0_RTX (mode);
> +  rtx cmp = gen_rtx_fmt_ee (NE, BImode, operands[1], zero);
> +  emit_insn (gen_cmp (pred1, cmp, operands[1], zero));
> +  emit_insn (gen_setcc_isnormal (pred2, operands[1]));
> +  emit_insn (gen_andbi3 (pred3, pred1, pred2));
> +  emit_insn (gen_setccsi_from_bi (operands[0], pred3));
> +  DONE;
> +})

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/nvptx/isnormal.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +int isnormal(double x)
> +{
> +  return __builtin_isnormal(x);
> +}
> +
> +/* { dg-final { scan-assembler-times "testp.normal.f64" 1 } } */


From c4c389a6bd262356023202adab08a48f044e59b2 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 27 Sep 2024 15:14:19 +0200
Subject: [PATCH] WIP nvptx: 'setp', 'combine [...] with a predicate value by
 applying a Boolean operator'

Re "Implement isfinite and isnormal optabs in nvptx.md"

mov.f64 %r23, %ar0;
setp.neu.f64 %r24, %r23, 0d;
testp.normal.f64 %r25, %r23;
and.pred %r26, %r24, %r25;
selp.u32 %value, 1, 0, %r26;

Can we optimize this into something like (untested):

mov.f64 %r23, %ar0;
testp.normal.f64 %r25, %r23;
setp.neu.and.f64 %r26, %r23, 0d, %r25;
selp.u32 %value, 1, 0, %r26;

That is, "combine":

CmpOp = { eq, ne, lt, le, gt, ge, lo, ls, hi, hs, equ, neu, ltu, leu, gtu, geu, num, nan };

BoolOp = { and, or, xor };

setp.CmpOp.TYPE %3, %2, %1;
BoolOp.pred %5, %3, %4

..., into:

setp.CmpOp.BoolOp.TYPE %5, %2, %1, %4;
---
 gcc/config/nvptx/nvptx.cc |  3 +++
 gcc/config/nvptx/nvptx.md | 23 ---
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 96a1134220e..b4c4f9ff021 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -3080,6 +3080,9 @@ nvptx_print_operand (FILE *file, rtx x, int code)
 	default:
 	  gcc_unreachable ();
 	}
+  break;
+case /*TODO*/ 'C':
+  mode = GET_MODE (XEXP (x, 0));
   if (FLOAT_MODE_P (mode)
 	  || x_code == EQ || x_code == NE
 	  || x_code == GEU || x_code == GTU
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index ae711bbd250..ce2603eeccb 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -881,13 +881,22 @@
 
 ;; Comparisons and branches
 
+(define_insn ""
+  [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
+	(and:BI (match_opera

Re: [PATCH v6] Provide new GCC builtin __builtin_counted_by_ref [PR116016]

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 27, 2024 at 02:01:19PM +, Qing Zhao wrote:
> +  /* Currently, only when the array_ref is an indirect_ref to a call to the
> + .ACCESS_WITH_SIZE, return true.
> + More cases can be included later when the counted_by attribute is
> + extended to other situations.  */
> +  if ((TREE_CODE (array_ref) == INDIRECT_REF)

The ()s around the == are useless.

> +  && is_access_with_size_p (TREE_OPERAND (array_ref, 0)))
> +return true;
> +  return false;
> +}
> +
> +/* Get the reference to the counted-by object associated with the ARRAY_REF. 
>  */
> +static tree
> +get_counted_by_ref (tree array_ref)
> +{
> +  /* Currently, only when the array_ref is an indirect_ref to a call to the
> + .ACCESS_WITH_SIZE, get the corresponding counted_by ref.
> + More cases can be included later when the counted_by attribute is
> + extended to other situations.  */
> +  if ((TREE_CODE (array_ref) == INDIRECT_REF)

Again.

> + if (TREE_CODE (TREE_TYPE (ref)) != ARRAY_TYPE)
> +   {
> + error_at (loc, "the argument must be an array"
> +"%<__builtin_counted_by_ref%>");
> + expr.set_error ();
> + break;
> +   }
> +
> + /* if the array ref is inside TYPEOF or ALIGNOF, the call to

Comments should start with capital letter, i.e. If

> +.ACCESS_WITH_SIZE was not genereated by the routine

s/genereated/generated/

> +build_component_ref by default, we should generate it here.  */
> + if ((in_typeof || in_alignof)
> +     && TREE_CODE (ref) == COMPONENT_REF)

The above && ... fits on the same line as the rest of the condition.

> +   ref = handle_counted_by_for_component_ref (loc, ref);
> +
> + if (has_counted_by_object (ref))
> +   expr.value
> + = get_counted_by_ref (ref);

This too.

> + else
> +   expr.value
> + = build_int_cst (build_pointer_type (void_type_node), 0);

else
  expr.value = null_pointer_node;
instead.

> +/*
> + * For the COMPONENT_REF ref, check whether it has a counted_by attribute,
> + * if so, wrap this COMPONENT_REF with the corresponding CALL to the
> + * function .ACCESS_WITH_SIZE.
> + * Otherwise, return the ref itself.
> + */

We don't use this style of comment.  No *s at the start of each line; /*
should be followed, after a single space, by the first line of text, and */
should come right after the final period and two spaces.

Jakub



Re: [PATCH 1/2] JSON Dumping of GENERIC trees

2024-09-27 Thread David Malcolm
On Sat, 2024-09-21 at 22:49 -0500, -thor wrote:
> From: thor 
> 
> This is the second revision of:
> 
>  
> https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662849.html
> 
> I've incorporated the feedback given both by Richard and David - I
> didn't
> find any memory leaks when testing in valgrind :)

Thanks for the updated patch.

[...snip...]

> diff --git a/gcc/tree-emit-json.cc b/gcc/tree-emit-json.cc
> new file mode 100644
> index 000..df97069b922
> --- /dev/null
> +++ b/gcc/tree-emit-json.cc

[...snip...]

Thanks for using std::unique_ptr, but I may have been unclear in my
earlier email - please use it to indicate ownership of a heap-allocated
pointer...

> +/* Adds Identifier information to JSON object */
> +
> +void
> +identifier_node_add_json (tree t, std::unique_ptr<json::object> & json_obj)
> +  {
> +    const char* buff = IDENTIFIER_POINTER (t);
> +    json_obj->set_string("id_to_locale", identifier_to_locale(buff));
> +    buff = IDENTIFIER_POINTER (t);
> +    json_obj->set_string("id_point", buff);
> +  }

...whereas here (and in many other places), the patch has a

   std::unique_ptr<json::object> &json_obj

(expressing a reference to a unique_ptr, i.e. a non-modifiable non-null
pointer to a pointer that manages the lifetime of a modifiable
json::object)

where, if I'm reading the code correctly,

   json::object &json_obj

(expressing a non-modifiable non-null pointer to a modifiable
json::object)

would be much clearer and simpler.
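
As a minimal sketch of the difference between the two signatures (toy_object
and both function names are hypothetical stand-ins made up for illustration,
not the real json::object or anything from the patch):

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for json::object; not GCC's real class.
struct toy_object
{
  std::map<std::string, std::string> props;

  void set_string (const char *key, const char *value)
  {
    props[key] = value;
  }
};

// Form used in the patch: the callee is handed the caller's smart pointer,
// although it never changes who owns the object.
inline void
add_id_via_unique_ptr (const char *id, std::unique_ptr<toy_object> &json_obj)
{
  json_obj->set_string ("id_point", id);
}

// Suggested form: a plain reference, which works for any toy_object no
// matter how (or whether) it is heap-allocated.
inline void
add_id_via_reference (const char *id, toy_object &json_obj)
{
  json_obj.set_string ("id_point", id);
}
```

With the reference form a caller holding a unique_ptr just passes *ptr, and a
caller with a stack object can use the same function unchanged.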

[...snip...]

Hope the above makes sense; sorry if I'm being unclear.
Dave



Re: [Fortran, Patch, PR81265, v1] Fix passing coarrays always w/ descriptor

2024-09-27 Thread Steve Kargl
On Fri, Sep 27, 2024 at 03:20:43PM +0200, Andre Vehreschild wrote:
> 
> attached patch fixes a runtime issue when a coarray was passed as
> parameter to a procedure that was itself a parameter. The issue here
> was that the coarray was passed as array pointer (i.e. w/o descriptor)
> to the function, but the function expected it to be an array
> w/ descriptor.
> 
> Regtests ok on x86_64-pc-linux-gnu / Fedora 39. Ok for mainline?
> 

Yes.

One general question as you're plowing through the coarray
bug reports:  does the testing include -fcoarray=none, single,
and lib; or a subset of the three?

-- 
Steve


Re: [PATCH 1/2] JSON Dumping of GENERIC trees

2024-09-27 Thread David Malcolm
On Fri, 2024-09-27 at 10:21 -0500, Thor Preimesberger wrote:
> That's all correct. I think I got it.
> 
> There are times where the code is emitting a json::object* that is
> contained in another json object. Is it good style to return these
> still as a unique_ptr? 

Probably not.  If you have a json::value "contained" in another json
value, either as an array element, or as an object property, then the
parent "owns" the child: note how the destructor for json::object calls
delete on its property values, and how the destructor for json::array
calls delete on its elements.

So if you have code that is creating a new json value that's about to
be added somewhere in the value tree, that new json value should
probably use std::unique_ptr to indicate ownership (responsibility to
delete), whereas if you're borrowing a value that's already owned by
something in the value tree, just use a regular pointer (or a
reference, if it's guaranteed to be non-null).
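
A rough sketch of that convention, assuming a hypothetical toy_node type
(its set/get members and make_leaf are invented for the example and are not
the signatures of GCC's json classes):

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical value-tree node; GCC's json::object differs.
struct toy_node
{
  std::string text;
  std::map<std::string, std::unique_ptr<toy_node>> children;

  // Takes ownership of CHILD: from here on the parent is responsible for
  // deleting it.
  void set (const std::string &key, std::unique_ptr<toy_node> child)
  {
    children[key] = std::move (child);
  }

  // Borrows: returns a non-owning pointer, or nullptr if KEY is absent.
  toy_node *get (const std::string &key)
  {
    auto it = children.find (key);
    return it == children.end () ? nullptr : it->second.get ();
  }
};

// A freshly created node that is about to be added to the tree is returned
// as a unique_ptr, so the responsibility to delete it is explicit.
inline std::unique_ptr<toy_node>
make_leaf (const std::string &text)
{
  auto leaf = std::make_unique<toy_node> ();
  leaf->text = text;
  return leaf;
}
```

Here root.set ("name", make_leaf ("foo")) hands the new node to the tree,
while root.get ("name") only borrows it and must not delete it.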

Hopefully that makes sense.
Dave


> I'm looking over what I wrote again, and in
> some parts I wrap the new json object in a unique_ptr (as a return in
> some function calls) and in others I use new and delete.
> 
> Thanks,
> Thor Preimesberger
> 
> On Fri, Sep 27, 2024 at 9:18 AM David Malcolm 
> wrote:
> > 
> > On Sat, 2024-09-21 at 22:49 -0500, -thor wrote:
> > > From: thor 
> > > 
> > > This is the second revision of:
> > > 
> > > 
> > > https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662849.html
> > > 
> > > I've incorporated the feedback given both by Richard and David -
> > > I
> > > didn't
> > > find any memory leaks when testing in valgrind :)
> > 
> > Thanks for the updated patch.
> > 
> > [...snip...]
> > 
> > > diff --git a/gcc/tree-emit-json.cc b/gcc/tree-emit-json.cc
> > > new file mode 100644
> > > index 000..df97069b922
> > > --- /dev/null
> > > +++ b/gcc/tree-emit-json.cc
> > 
> > [...snip...]
> > 
> > Thanks for using std::unique_ptr, but I may have been unclear in my
> > earlier email - please use it to indicate ownership of a heap-
> > allocated
> > pointer...
> > 
> > > +/* Adds Identifier information to JSON object */
> > > +
> > > +void
> > > +identifier_node_add_json (tree t, std::unique_ptr<json::object>
> > > & json_obj)
> > > +  {
> > > +    const char* buff = IDENTIFIER_POINTER (t);
> > > +    json_obj->set_string("id_to_locale",
> > > identifier_to_locale(buff));
> > > +    buff = IDENTIFIER_POINTER (t);
> > > +    json_obj->set_string("id_point", buff);
> > > +  }
> > 
> > ...whereas here (and in many other places), the patch has a
> > 
> >    std::unique_ptr<json::object> &json_obj
> > 
> > (expressing a reference to a unique_ptr, i.e. a non-modifiable
> > non-null pointer to a pointer that manages the lifetime of a modifiable
> > json::object)
> > 
> > where, if I'm reading the code correctly,
> > 
> >    json::object &json_obj
> > 
> > (expressing a non-modifiable non-null pointer to a modifiable
> > json::object)
> > 
> > would be much clearer and simpler.
> > 
> > [...snip...]
> > 
> > Hope the above makes sense; sorry if I'm being unclear.
> > Dave
> > 
> 



Re: [Fortran, Patch, PR81265, v1] Fix passing coarrays always w/ descriptor

2024-09-27 Thread Andre Vehreschild

Hi Steve,

the testcase is in the coarray directory, where tests are executed with
-fcoarray=single and lib. I don't know about none, because the code stops
compiling when it encounters a coarray with neither single nor lib. Therefore
I suppose there is no way to run it without coarrays.

Hope that helps,
Andre
Andre Vehreschild


Re: [PATCH] arm: Force flag_pic for FDPIC

2024-09-27 Thread rep . dot . nop
On 27 September 2024 16:05:01 CEST, "Richard Earnshaw (lists)" 
 wrote:
>On 26/09/2024 19:21, Ramana Radhakrishnan wrote:
>> On Mon, Mar 4, 2024 at 1:43 PM Fangrui Song  wrote:
>>>
>>> From: Fangrui Song 
>>>
>>> -fno-pic -mfdpic generated code is like regular -fno-pic, not suitable
>>> for FDPIC (absolute addressing for symbol references and no function
>>> descriptor).  The sh port simply upgrades -fno-pic to -fpie by setting
>>> flag_pic.  Let's follow suit.
>>>
>>> Link: 
>>> https://inbox.sourceware.org/gcc-patches/20150913165303.gc17...@brightrain.aerifal.cx/
>>>
>>> gcc/ChangeLog:
>>>
>>> * config/arm/arm.cc (arm_option_override): Set flag_pic if
>>>   TARGET_FDPIC.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/arm/fdpic-pie.c: New test.
>>> ---
>>>  gcc/config/arm/arm.cc|  6 +
>>>  gcc/testsuite/gcc.target/arm/fdpic-pie.c | 30 
>>>  2 files changed, 36 insertions(+)
>>>  create mode 100644 gcc/testsuite/gcc.target/arm/fdpic-pie.c
>>>
>>> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
>>> index 1cd69268ee9..f2fd3cce48c 100644
>>> --- a/gcc/config/arm/arm.cc
>>> +++ b/gcc/config/arm/arm.cc
>>> @@ -3682,6 +3682,12 @@ arm_option_override (void)
>>>arm_pic_register = FDPIC_REGNUM;
>>>if (TARGET_THUMB1)
>>> sorry ("FDPIC mode is not supported in Thumb-1 mode");
>>> +
>>> +  /* FDPIC code is a special form of PIC, and the vast majority of code
>>> +generation constraints that apply to PIC also apply to FDPIC, so we
>>> + set flag_pic to avoid the need to check TARGET_FDPIC everywhere
>>> + flag_pic is checked. */
>>> +  flag_pic = 2;
>>>  }
>> 
>> Been a while since I looked at this stuff but should this not be
>> flag_pie being set rather than flag_pic here if the expectation is to
>> turn -fno-PIC -mfdpic into fPIE ?
>
>-fPIE implies -fPIC, but has the added implication that any definition of an 
>object we see is the one true definition and cannot be pre-empted during 
>loading (in a shared library, a definition of X may be pre-empted by another 
>definition of X in either the application itself or another shared library 
>that was loaded first).
>
>Part of the confusion comes from the manual, though:
>
>Select the FDPIC ABI, which uses 64-bit function descriptors to
>represent pointers to functions.  When the compiler is configured for
>@code{arm-*-uclinuxfdpiceabi} targets, this option is on by default
>and implies @option{-fPIE} if none of the PIC/PIE-related options is
>provided.  On other targets, it only enables the FDPIC-specific code
>generation features, and the user should explicitly provide the
>PIC/PIE-related options as needed.
>
>Which conflates things relating to the option flag and the compiler 
>configuration.  I think that needs clearing up as well.  Something like
>
>Select the FDPIC ABI, which uses 64-bit function descriptors to
>represent pointers to functions.  @option{-mfdpic} implies @option{-fPIC}.
>
>When the compiler is configured for @code{arm-*-uclinuxfdpiceabi} targets, 
>this option is on by default and the compiler defaults to @option{-fPIE}, 
>unless @option{-fPIC} is explicitly specified.
>
>might cover it, but I'm not sure I've fully untangled the web of option 
>permutations here.  Perhaps someone could tabulate the expected options 
>somewhere for clarity.

yep, I think that's about it. fore please  TIA

>
>The other option would be to error if flag_pic is not set, when -mfdpic is 
>set, which would force the user to be explicit as to which pic options they 
>want (technically the explicit combination -mno-pic -mfdpic has no meaning).

nod

>
>R.
>
>> 
>> 
>>>
>>>if (arm_pic_register_string != NULL)
>>> diff --git a/gcc/testsuite/gcc.target/arm/fdpic-pie.c 
>>> b/gcc/testsuite/gcc.target/arm/fdpic-pie.c
>>> new file mode 100644
>>> index 000..909db8bce74
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/arm/fdpic-pie.c
>>> @@ -0,0 +1,30 @@
>>> +// { dg-do compile }
>>> +// { dg-options "-O2 -fno-pic -mfdpic" }
>>> +// { dg-skip-if "-mpure-code and -fPIC incompatible" { *-*-* } { 
>>> "-mpure-code" } }
>>> +
>>> +__attribute__((visibility("hidden"))) void hidden_fun(void);
>>> +void fun(void);
>>> +__attribute__((visibility("hidden"))) extern int hidden_var;
>>> +extern int var;
>>> +__attribute__((visibility("hidden"))) const int ro_hidden_var = 42;
>>> +
>>> +// { dg-final { scan-assembler "hidden_fun\\(GOTOFFFUNCDESC\\)" } }
>>> +void *addr_hidden_fun(void) { return hidden_fun; }
>>> +
>>> +// { dg-final { scan-assembler "fun\\(GOTFUNCDESC\\)" } }
>>> +void *addr_fun(void) { return fun; }
>>> +
>>> +// { dg-final { scan-assembler "hidden_var\\(GOT\\)" } }
>>> +void *addr_hidden_var(void) { return &hidden_var; }
>>> +
>>> +// { dg-final { scan-assembler "var\\(GOT\\)" } }
>>> +void *addr_var(void) { return &var; }
>>> +
>>> +// { dg-final { scan-assembler ".LANCHOR0\\(GOT\\)" } }
>>> +const int *addr_

Re: [PATCH v2] libgcc, libstdc++: Make TU-local declarations in headers external linkage [PR115126]

2024-09-27 Thread Jason Merrill

On 9/26/24 6:34 AM, Nathaniel Shead wrote:

On Thu, Sep 26, 2024 at 01:46:27PM +1000, Nathaniel Shead wrote:

On Wed, Sep 25, 2024 at 01:30:55PM +0200, Jakub Jelinek wrote:

On Wed, Sep 25, 2024 at 12:18:07PM +0100, Jonathan Wakely wrote:

  And whether similarly we couldn't use
__attribute__((__visibility__ ("hidden"))) on the static block scope
vars for C++ (again, if compiler supports that), so that the changes
don't affect ABI of C++ libraries.


That sounds good too.


Can you use visibility attributes on a local static? I get a warning
that it's ignored.


Indeed :(

And #pragma GCC visibility push(hidden)/#pragma GCC visibility pop around
just the static block scope var definition does nothing.
If it is around the whole inline function though, then it seems to work.
Though, unsure if we want that around the whole header; wonder what it would
do with the weakrefs.

Jakub



Thanks for the thoughts.  WRT visibility, it looks like the main gthr.h
surrounds the whole function in a

   #ifndef HIDE_EXPORTS
   #pragma GCC visibility push(default)
   #endif

block, though I can't quite work out what the purpose of that is here
(since everything is currently internal linkage to start with).

But it sounds like doing something like

   #ifdef __has_attribute
   # if __has_attribute(__always_inline__)
   #  define __GTHREAD_ALWAYS_INLINE __attribute__((__always_inline__))
   # endif
   #endif
   #ifndef __GTHREAD_ALWAYS_INLINE
   # define __GTHREAD_ALWAYS_INLINE
   #endif

   #ifdef __cplusplus
   # define __GTHREAD_INLINE inline __GTHREAD_ALWAYS_INLINE
   #else
   # define __GTHREAD_INLINE static inline
   #endif

and then marking maybe even just the new inline functions with
visibility hidden should be OK?

Nathaniel


Here's a new patch that does this.  Also since v1 it adds another two
internal linkage declarations I'd missed earlier from libstdc++, in
pstl; it turns out that  doesn't include .

Bootstrapped and regtested on x86_64-pc-linux-gnu and
aarch64-unknown-linux-gnu, OK for trunk?

-- >8 --

In C++20, modules streaming checks for exposures of TU-local entities.
In general exposing internal linkage functions in a header is liable to
cause ODR violations in C++, and this is now detected in a module
context.

This patch goes through and removes 'static' from many declarations
exposed through libstdc++ to prevent code like the following from
failing:

   export module M;
   extern "C++" {
 #include 
   }

Since gthreads is used from C as well, we need to choose whether to use
'inline' or 'static inline' depending on whether we're compiling for C
or C++ (since the semantics of 'inline' are different between the
languages).  Additionally we need to remove static global variables, so
we migrate these to function-local statics to avoid the ODR issues.


Why function-local static rather than inline variable?
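
For concreteness, the two alternatives being compared look roughly like this
(a sketch only: the names are illustrative rather than taken from the patch,
and the inline variable form assumes C++17 or later):

```cpp
#include <cassert>
#include <cstddef>

// Option A (C++17): an inline variable.  Despite being const, it has
// external linkage and a single definition shared across translation units.
inline constexpr std::size_t lane_size_inline = 64;

// Option B: a function-local static inside an inline function.  Every
// translation unit that calls this sees the same object.
inline const std::size_t &
lane_size_local ()
{
  static const std::size_t value = 64;
  return value;
}
```

Both give one entity per program; option B also works before C++17, at the
cost of a function call at each use.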


+++ b/libstdc++-v3/include/pstl/algorithm_impl.h
@@ -2890,7 +2890,7 @@ __pattern_includes(__parallel_tag<_IsVector> __tag, 
_ExecutionPolicy&& __exec, _
  });
  }
  
-constexpr auto __set_algo_cut_off = 1000;

+inline constexpr auto __set_algo_cut_off = 1000;
  
+++ b/libstdc++-v3/include/pstl/unseq_backend_simd.h

@@ -22,7 +22,7 @@ namespace __unseq_backend
  {
  
  // Expect vector width up to 64 (or 512 bit)

-const std::size_t __lane_size = 64;
+inline const std::size_t __lane_size = 64;


These changes should not be necessary; the uses of these variables are 
not exposures under https://eel.is/c++draft/basic#link-14.4


Jason



[PATCH 3/3] bpf: set index entry for a VAR_DECL in CO-RE relocs

2024-09-27 Thread Cupertino Miranda
CO-RE accesses to non-pointer struct variables will also generate a
"0" index in the access string of the CO-RE relocation.
The first index within the access string has a somewhat different
meaning than the remaining indexes.
For i0:i1:...:in being an access index for a "struct A a" declaration, its
semantics are represented by:
  (&a + (sizeof(struct A) * i0) + offsetof(i1:...:in))
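
As an illustration of the formula, here is a small sketch in C++.  The fields
of struct A and the helper name are made up for the example; a plain variable
access corresponds to i0 = 0, and FIELD_OFFSET stands in for
offsetof(i1:...:in):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical struct standing in for "struct A".
struct A
{
  int a;
  int b;
  char c;
};

// Address denoted by an access string "i0:<fields>": step over I0 whole
// objects of type A, then add the offset of the selected member.
inline std::uintptr_t
core_access_address (const A *base, std::size_t i0, std::size_t field_offset)
{
  return reinterpret_cast<std::uintptr_t> (base)
         + sizeof (A) * i0 + field_offset;
}
```

For an array "A arr[3]", i0 = 2 with offsetof (A, c) yields the address of
arr[2].c, and i0 = 0 reduces to an ordinary member offset from &arr[0].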
---
 gcc/config/bpf/core-builtins.cc  |  5 -
 gcc/testsuite/gcc.target/bpf/core-builtin-1.c| 16 
 gcc/testsuite/gcc.target/bpf/core-builtin-2.c|  3 ++-
 .../gcc.target/bpf/core-builtin-exprlist-1.c | 16 
 4 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/gcc/config/bpf/core-builtins.cc b/gcc/config/bpf/core-builtins.cc
index cdfb356660e..fc6379cf028 100644
--- a/gcc/config/bpf/core-builtins.cc
+++ b/gcc/config/bpf/core-builtins.cc
@@ -698,10 +698,13 @@ compute_field_expr (tree node, unsigned int *accessors,
  access_node, false, callback);
   return n;
 
+case VAR_DECL:
+  accessors[0] = 0;
+  return 1;
+
 case ADDR_EXPR:
 case CALL_EXPR:
 case SSA_NAME:
-case VAR_DECL:
 case PARM_DECL:
   return 0;
 default:
diff --git a/gcc/testsuite/gcc.target/bpf/core-builtin-1.c 
b/gcc/testsuite/gcc.target/bpf/core-builtin-1.c
index b4f9998afb8..0706005f0e5 100644
--- a/gcc/testsuite/gcc.target/bpf/core-builtin-1.c
+++ b/gcc/testsuite/gcc.target/bpf/core-builtin-1.c
@@ -24,16 +24,16 @@ unsigned long ula[8];
 unsigned long
 func (void)
 {
-  /* 1 */
+  /* 0:1 */
   int b = _(my_s.b);
 
-  /* 2 */
+  /* 0:2 */
   char c = _(my_s.c);
 
-  /* 2:3 */
+  /* 0:2:3 */
   unsigned char uc = _(my_u.uc[3]);
 
-  /* 6 */
+  /* 0:6 */
   unsigned long ul = _(ula[6]);
 
   return b + c + uc + ul;
@@ -55,10 +55,10 @@ u_ptr (union U *pu)
   return x;
 }
 
-/* { dg-final { scan-assembler-times "ascii \"1.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
-/* { dg-final { scan-assembler-times "ascii \"2.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
-/* { dg-final { scan-assembler-times "ascii \"2:3.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
-/* { dg-final { scan-assembler-times "ascii \"6.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
+/* { dg-final { scan-assembler-times "ascii \"0:1.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
+/* { dg-final { scan-assembler-times "ascii \"0:2.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
+/* { dg-final { scan-assembler-times "ascii \"0:2:3.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
+/* { dg-final { scan-assembler-times "ascii \"0:6.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
 /* { dg-final { scan-assembler-times "ascii \"0:2.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
 /* { dg-final { scan-assembler-times "ascii \"0:2:3.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
 
diff --git a/gcc/testsuite/gcc.target/bpf/core-builtin-2.c 
b/gcc/testsuite/gcc.target/bpf/core-builtin-2.c
index b72e2566b71..04b3f6b2652 100644
--- a/gcc/testsuite/gcc.target/bpf/core-builtin-2.c
+++ b/gcc/testsuite/gcc.target/bpf/core-builtin-2.c
@@ -16,11 +16,12 @@ struct S foo;
 
 void func (void)
 {
+  /* 0:1:3:2 */
   char *x = __builtin_preserve_access_index (&foo.u[3].c);
 
   *x = 's';
 }
 
 /* { dg-final { scan-assembler-times "\[\t \]0x402\[\t 
\]+\[^\n\]*btt_info" 1 } } */
-/* { dg-final { scan-assembler-times "ascii \"1:3:2.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
+/* { dg-final { scan-assembler-times "ascii \"0:1:3:2.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
 /* { dg-final { scan-assembler-times "bpfcr_type" 1 } } */
diff --git a/gcc/testsuite/gcc.target/bpf/core-builtin-exprlist-1.c 
b/gcc/testsuite/gcc.target/bpf/core-builtin-exprlist-1.c
index 8ce4a6e70de..c53daf81c5f 100644
--- a/gcc/testsuite/gcc.target/bpf/core-builtin-exprlist-1.c
+++ b/gcc/testsuite/gcc.target/bpf/core-builtin-exprlist-1.c
@@ -31,16 +31,16 @@ func (void)
   int ic;
 
   __builtin_preserve_access_index (({
-/* 1 */
+/* 0:1 */
 b = my_s.b;
 
-/* 2 */
+/* 0:2 */
 ic = my_s.c;
 
-/* 2:3 */
+/* 0:2:3 */
 uc = my_u.uc[3];
 
-/* 6 */
+/* 0:6 */
 ul = ula[6];
   }));
 
@@ -65,10 +65,10 @@ u_ptr (union U *pu)
   return x;
 }
 
-/* { dg-final { scan-assembler-times "ascii \"1.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
-/* { dg-final { scan-assembler-times "ascii \"2.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
-/* { dg-final { scan-assembler-times "ascii \"2:3.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
-/* { dg-final { scan-assembler-times "ascii \"6.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
+/* { dg-final { scan-assembler-times "ascii \"0:1.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
+/* { dg-final { scan-assembler-times "ascii \"0:2.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
+/* { dg-final { scan-assembler-times "ascii \"0:2:3.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
+/* { dg-final { scan-assembler-times "ascii \"0:6.0\"\[\t 
\]+\[^\n\]*btf_aux_string" 1 } } */
 /*

Re: [PATCH v3 3/4] tree-optimization/116024 - simplify C1-X cmp C2 for wrapping signed types

2024-09-27 Thread Artemiy Volkov
On 9/27/2024 1:24 PM, Richard Biener wrote:
> On Mon, 23 Sep 2024, Artemiy Volkov wrote:
> 
>> Implement a match.pd transformation inverting the sign of X in
>> C1 - X cmp C2, where C1 and C2 are integer constants and X is
>> of a wrapping signed type, by observing that:
>>
>> (a) If cmp is == or !=, simply move X and C2 to opposite sides of
>> the comparison to arrive at X cmp C1 - C2.
>>
>> (b) If cmp is <:
>>  - C1 - X < C2 means that C1 - X spans the values of -INF,
>>-INF + 1, ..., C2 - 1;
>>  - Therefore, X is one of C1 - -INF, C1 - (-INF + 1), ...,
>>C1 - C2 + 1;
>>  - Subtracting (C1 + 1), X - (C1 + 1) is one of - (-INF) - 1,
>>- (-INF) - 2, ..., -C2;
>>  - Using the fact that - (-INF) - 1 is +INF, derive that
>>X - (C1 + 1) spans the values +INF, +INF - 1, ..., -C2;
>>  - Thus, the original expression can be simplified to
>>X - (C1 + 1) > -C2 - 1.
>>
>> (c) Similarly, C1 - X <= C2 is equivalent to X - (C1 + 1) >= -C2 - 1.
>>
>> (d) The >= and > cases are negations of (b) and (c), respectively.
>>
>> (e) In all cases, the expression -C2 - 1 can be shortened to
>> bit_not (C2).
>>
>> This transformation allows to occasionally save load-immediate /
>> subtraction instructions, e.g. the following statement:
>>
>> 10 - (int)f() >= 20;
>>
>> now compiles to
>>
>> addi    a0,a0,-11
>> slti    a0,a0,-20
>>
>> instead of
>>
>> li      a5,10
>> sub     a0,a5,a0
>> slti    t0,a0,20
>> xori    a0,t0,1
>>
>> on 32-bit RISC-V when compiled with -fwrapv.
>>
>> Additional examples can be found in the newly added test file.  This
>> patch has been bootstrapped and regtested on aarch64, x86_64, and i386,
>> and additionally regtested on riscv32.
>>
>> gcc/ChangeLog:
>>
>>  PR tree-optimization/116024
>>  * match.pd: New transformation around integer comparison.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.dg/tree-ssa/pr116024-1-fwrapv.c: New test.
>>
>> Signed-off-by: Artemiy Volkov 
>> ---
>>   gcc/match.pd  | 20 -
>>   .../gcc.dg/tree-ssa/pr116024-1-fwrapv.c   | 73 +++
>>   2 files changed, 92 insertions(+), 1 deletion(-)
>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c
>>
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index d0489789527..bf3b4a2e3fe 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -8970,7 +8970,25 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>   (cmp (plus @1 (minus @2 @0)) @2))
>>  (if (cmp == LT_EXPR || cmp == GE_EXPR)
>>   (cmp (plus @1 (minus @2
>> -   (plus @0 { build_one_cst (TREE_TYPE (@1)); }))) @2)))
>> +   (plus @0 { build_one_cst (TREE_TYPE (@1)); }))) @2)))
>> +/* For wrapping signed types (-fwrapv), transform like so (using < as 
>> example):
>> + C1 - X < C2
>> +  ==>  C1 - X = { -INF, -INF + 1, ..., C2 - 1 }
>> +  ==>  X = { C1 - (-INF), C1 - (-INF + 1), ..., C1 - C2 + 1 }
>> +  ==>  X - (C1 + 1) = { - (-INF) - 1, - (-INF) - 2, ..., -C2 }
>> +  ==>  X - (C1 + 1) = { +INF, +INF - 1, ..., -C2 }
>> +  ==>  X - (C1 + 1) > -C2 - 1
>> +  ==>  X - (C1 + 1) > bit_not (C2)
>> +
>> +  Similarly,
>> + C1 - X <= C2 ==> X - (C1 + 1) >= bit_not (C2);
>> + C1 - X >= C2 ==> X - (C1 + 1) <= bit_not (C2);
>> + C1 - X > C2 ==> X - (C1 + 1) < bit_not (C2).  */
>> +   (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
> 
> You need to add && !TYPE_UNSIGNED (TREE_TYPE (@1)) here,
> TYPE_OVERFLOW_WRAPS is also true for unsigned types.

This pattern is written as a series of nested ifs, and this if is 
actually the else arm of the
 if (TYPE_UNSIGNED (TREE_TYPE (@1))
check on line 9078 (introduced in patch #2), so if we got here then 
TYPE_UNSIGNED is already necessarily false (and thus this code already 
only deals with signed wrapping types).  I don't think the extra check 
is necessary, but I can add it if it makes things clearer.  What do you 
think?
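
For what it's worth, the identities in the comment can be sanity-checked
exhaustively on a narrow type.  The sketch below is not part of the patch: it
emulates wrapping 8-bit signed arithmetic through unsigned conversion
(modular on GCC, and guaranteed since C++20), and the helper names are
invented for the example:

```cpp
#include <cassert>
#include <cstdint>

// Wrap an int into the int8_t range, emulating -fwrapv on an 8-bit type.
inline std::int8_t
wrap8 (int v)
{
  return static_cast<std::int8_t> (static_cast<std::uint8_t> (v));
}

// Returns true if, for every X, C1 and C2 representable in int8_t:
//   C1 - X <  C2  <=>  X - (C1 + 1) >  bit_not (C2)
//   C1 - X <= C2  <=>  X - (C1 + 1) >= bit_not (C2)
//   C1 - X == C2  <=>  X == C1 - C2
// The >= and > cases are the negations of the first two lines.
inline bool
transform_holds_for_int8 ()
{
  for (int c1 = -128; c1 <= 127; ++c1)
    for (int c2 = -128; c2 <= 127; ++c2)
      for (int x = -128; x <= 127; ++x)
        {
          const int lhs = wrap8 (c1 - x);
          const int rhs = wrap8 (x - wrap8 (c1 + 1));
          const int not_c2 = ~c2;
          if ((lhs < c2) != (rhs > not_c2))
            return false;
          if ((lhs <= c2) != (rhs >= not_c2))
            return false;
          if ((lhs == c2) != (x == wrap8 (c1 - c2)))
            return false;
        }
  return true;
}
```

The check passes because X - (C1 + 1) is exactly bit_not (C1 - X) in wrapping
arithmetic, and bit_not reverses the signed order.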

Thanks,
Artemiy

> 
> OK with that change.
> 
> Thanks,
> Richard.
> 
>> + (if (cmp == EQ_EXPR || cmp == NE_EXPR)
>> +(cmp @1 (minus @0 @2))
>> + (rcmp (minus @1 (plus @0 { build_one_cst (TREE_TYPE (@1)); }))
>> + (bit_not @2
>>   
>>   /* Canonicalizations of BIT_FIELD_REFs.  */
>>   
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c 
>> b/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c
>> new file mode 100644
>> index 000..c2bf1d17234
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c
>> @@ -0,0 +1,73 @@
>> +/* PR tree-optimization/116024 */
>> +/* { dg-do compile } */
>> +/* { dg-options "-O1 -fdump-tree-forwprop1-details -fwrapv" } */
>> +
>> +#include 
>> +
>> +uint32_t f(void);
>> +
>> +int32_t i2(void)
>> +{
>> +  int32_t l = 2;
>> +  l = 10 - (int32_t)f();
>> +  return l <= 20; // f() - 11 >= -21
>> +}
>> +
>> +int32_t i2a(void)
>> +{
>> +  int32_t l = 2;
>> +  l = 10 - (int32_t)f();
>> +  return l 

[PATCH v2] c++: concept in default argument [PR109859]

2024-09-27 Thread Marek Polacek
On Fri, Sep 27, 2024 at 04:57:58PM -0400, Jason Merrill wrote:
> On 9/18/24 5:06 PM, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > -- >8 --
> > 1) We're hitting the assert in cp_parser_placeholder_type_specifier.
> > It says that if it turns out to be false, we should do error() instead.
> > Do so, then.
> > 
> > 2) lambda-targ8.C should compile fine, though.  The problem was that
> > local_variables_forbidden_p wasn't cleared when we're about to parse
> > the optional template-parameter-list for a lambda in a default argument.
> > 
> > PR c++/109859
> > 
> > gcc/cp/ChangeLog:
> > 
> > * parser.cc (cp_parser_lambda_declarator_opt): Temporarily clear
> > local_variables_forbidden_p.
> > (cp_parser_placeholder_type_specifier): Turn an assert into an error.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp2a/concepts-defarg3.C: New test.
> > * g++.dg/cpp2a/lambda-targ8.C: New test.
> > ---
> >   gcc/cp/parser.cc  |  9 +++--
> >   gcc/testsuite/g++.dg/cpp2a/concepts-defarg3.C |  8 
> >   gcc/testsuite/g++.dg/cpp2a/lambda-targ8.C | 10 ++
> >   3 files changed, 25 insertions(+), 2 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-defarg3.C
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-targ8.C
> > 
> > diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
> > index 4dd9474cf60..bdc4fef243a 100644
> > --- a/gcc/cp/parser.cc
> > +++ b/gcc/cp/parser.cc
> > @@ -11891,6 +11891,11 @@ cp_parser_lambda_declarator_opt (cp_parser* 
> > parser, tree lambda_expr)
> >  "lambda templates are only available with "
> >  "%<-std=c++20%> or %<-std=gnu++20%>");
> > +  /* Even though the whole lambda may be a default argument, its
> > +template-parameter-list is a context where it's OK to create
> > +new parameters.  */
> > +  auto lvf = make_temp_override (parser->local_variables_forbidden_p, 
> > 0u);
> > +
> > cp_lexer_consume_token (parser->lexer);
> > template_param_list = cp_parser_template_parameter_list (parser);
> > @@ -20978,8 +20983,8 @@ cp_parser_placeholder_type_specifier (cp_parser 
> > *parser, location_t loc,
> > /* In a default argument we may not be creating new parameters.  */
> > if (parser->local_variables_forbidden_p & LOCAL_VARS_FORBIDDEN)
> > {
> > - /* If this assert turns out to be false, do error() instead.  */
> > - gcc_assert (tentative);
> > + if (!tentative)
> > +   error_at (loc, "local variables may not appear in this context");
> 
> There's no local variable in the new testcase, the error should talk about a
> concept-name.

Ah sure.  So like this?

Tested dg.exp.

-- >8 --
1) We're hitting the assert in cp_parser_placeholder_type_specifier.
It says that if it turns out to be false, we should do error() instead.
Do so, then.

2) lambda-targ8.C should compile fine, though.  The problem was that
local_variables_forbidden_p wasn't cleared when we're about to parse
the optional template-parameter-list for a lambda in a default argument.

PR c++/109859

gcc/cp/ChangeLog:

* parser.cc (cp_parser_lambda_declarator_opt): Temporarily clear
local_variables_forbidden_p.
(cp_parser_placeholder_type_specifier): Turn an assert into an error.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-defarg3.C: New test.
* g++.dg/cpp2a/lambda-targ8.C: New test.
---
 gcc/cp/parser.cc  |  9 +++--
 gcc/testsuite/g++.dg/cpp2a/concepts-defarg3.C |  8 
 gcc/testsuite/g++.dg/cpp2a/lambda-targ8.C | 10 ++
 3 files changed, 25 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-defarg3.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-targ8.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index f50534f5f39..a92e6a29ba6 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -11891,6 +11891,11 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, 
tree lambda_expr)
 "lambda templates are only available with "
 "%<-std=c++20%> or %<-std=gnu++20%>");
 
+  /* Even though the whole lambda may be a default argument, its
+template-parameter-list is a context where it's OK to create
+new parameters.  */
+  auto lvf = make_temp_override (parser->local_variables_forbidden_p, 0u);
+
   cp_lexer_consume_token (parser->lexer);
 
   template_param_list = cp_parser_template_parameter_list (parser);
@@ -20989,8 +20994,8 @@ cp_parser_placeholder_type_specifier (cp_parser 
*parser, location_t loc,
   /* In a default argument we may not be creating new parameters.  */
   if (parser->local_variables_forbidden_p & LOCAL_VARS_FORBIDDEN)
{
- /* If this assert turns out to be false, do error() instead.  */
- gcc_assert (tentative);
+ if (!tentative)
+   

Re: Fwd: [patch, fortran] Matmul and dot_product for unsigned

2024-09-27 Thread Mikael Morin

On 26/09/2024 at 21:57, Thomas Koenig wrote:


Now for the remaining intrinsics (FINDLOC, MAXLOC,
MINLOC, MAXVAL, MINVAL, CSHIFT and EOSHIFT still missing).

I have one patch series touching (inline) MINLOC and MAXLOC to post in 
the coming days.  Could you please keep away from them for one more week 
or two?


Re: [PATCH] RISC-V/libgcc: Save/Restore routines for E goes with ABI.

2024-09-27 Thread Kito Cheng
LGTM, you just need to write a few more boring ChangeLog entries in the commit log, like below:

libgcc/ChangeLog:
   * config/riscv/save-restore.S: Check with __riscv_abi_rve
rather than __riscv_32e.

Anyway I committed to trunk with that changelog :)

On Fri, Sep 27, 2024 at 5:19 PM Tsung Chun Lin  wrote:
>
> Hi,
>
> This is my first patch for GCC. If there are any problems, please let me know.


Re: [PATCH 2/2] HTML Dumping of trees from gdb

2024-09-27 Thread Richard Biener
On Sun, Sep 22, 2024 at 5:49 AM -thor  wrote:
>
> From: thor 
>
> This patch allows one to dump a tree as HTML from within gdb by invoking,
> i.e,
>   html-tree tree

I have managed to get a browser window launched with the following
incremental patch (xdg-open should be a better default than firefox,
given it uses the desktop-configured default browser).  I used the
temporary file name given that CWD-based relative names likely pose
problems.  I needed (FILE *) and (int) casts as otherwise, without glibc
debuginfo, gdb will complain.

The experience browsing is a bit sub-par, but that could be worked on.
I suppose using a web service that we can feed the local temporary JSON
object, and that visualizes it directly with some javascript, would be
more scalable/customizable than outputting good html from within
gdbhooks.  There are multiple existing services but I'm not sure if we
can feed those a local file by customizing the URL (or if that would be
allowed by browsers).

The current state is useful for evaluating and working on the JSON
dumping itself though, which is good enough.  There's a delicate balance
between having the actual JSON being nice and compact to view vs. having
separate parts of the data available as separate JSON objects - that
shows for example with location data that's like

  "expr_loc": [{"file": "t.c",
"line": 6,
"column": 22}]

where t.c:6:22 is the compact way to visualize but separating out line and
column is of course better.  There's the chance for the JSON-to-X converter
to do such massaging, which of course requires understanding the semantics
of the data.
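
As a rough sketch of that kind of massaging: a consumer of the JSON could
collapse the separate fields back into the compact form for display.  The
helper name compact_loc is hypothetical, and the field names are only the
ones visible in the "expr_loc" example above; nothing here is taken from the
patch itself.

```python
import json


def compact_loc(loc):
    """Collapse one location object into the compact file:line:column form."""
    return f'{loc["file"]}:{loc["line"]}:{loc["column"]}'


# Sample node fragment, shaped like the "expr_loc" example above.
node = json.loads('{"expr_loc": [{"file": "t.c", "line": 6, "column": 22}]}')
print([compact_loc(loc) for loc in node["expr_loc"]])
```

Keeping the fields separate in the dump and compacting only at display time
means the verbose data stays available to other consumers.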

I think the JSON data should err on the verbose side so it can subsume the
-raw dumping we have.

Thanks,
Richard.

> gcc/ChangeLog:
> * gcc/gdbhooks.py: Rudimentary dumping of GENERIC trees as html through
>   one new python function (jsonNodeToHtml) and one new gdb command
>   (html-tree). There is also a parameter to allow html-tree to
>   automatically open a browser to view the HTML, but that needs a fix
>   or workaround that I don't understand.
>
> Signed-off-by: Thor C Preimesberger 
>
> ---
>  gcc/gdbhooks.py | 113 
>  1 file changed, 113 insertions(+)
>
> diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py
> index 904ee28423a..fff85d738b4 100644
> --- a/gcc/gdbhooks.py
> +++ b/gcc/gdbhooks.py
> @@ -143,6 +143,7 @@ import os.path
>  import re
>  import sys
>  import tempfile
> +import json
>
>  import gdb
>  import gdb.printing
> @@ -889,6 +890,118 @@ class DotFn(gdb.Command):
>
>  DotFn()
>
> +# Quick and dirty way to turn a tree as JSON object into HTML.
> +# Used in treeToHtml.
> +
> +def jsonNodeToHtml(node_list, html_doc):
> +for node in node_list:
> +id = node["addr"]
> +html_doc.write("" % id)
> +for key, value in node.items():
> +if (key == "addr"):
> +html_doc.write("addr:")
> +html_doc.write(f"")
> +html_doc.write(f"{value}")
> +html_doc.write(f"")
> +if (type(value) == dict):
> +html_doc.write(f"{key}:")
> +sub = value
> +if "ref_addr" in sub.keys():
> +html_doc.write(f"")
> +subAddress = sub["ref_addr"]
> +subCode = sub["tree_code"]
> +html_doc.write(f"ref_addr:  href=#{subAddress}>{subAddress}")
> +html_doc.write(f"tree_code: {subCode}")
> +html_doc.write("")
> +# Currently, child tree nodes that are referred to by OMP
> +# accsessors are not dumped recursively by
> +# dump_generic_node_json, i.e. they have no corresponding
> +# entry in node_list. So we just read it out key-value pairs.
> +else:
> +html_doc.write(f"")
> +for key, value in sub.items():
> +html_doc.write(f"{key}: {value}")
> +html_doc.write("")
> +elif (type(value) == list):
> +html_doc.write(f"{key}:")
> +html_doc.write(f"")
> +for i in value:
> +for key, value in i.items():
> +if (key == "ref_addr"):
> +html_doc.write("ref_addr:")
> +html_doc.write(f"")
> +html_doc.write(f"{value}")
> +html_doc.write(f"")
> +else:
> +html_doc.write(f"{key}: {value}")
> +html_doc.write("")
> +elif (key != "addr"):
> +html_doc.write(f"{key}: {value}")
> +html_doc.write("")
> +
> +class GCChtml (gdb.Parameter):
> +"""
> +This parameter defines what program is used to view HTML files
> +by the html-tree command. It w

Re: [PATCH] diagnostic: Save/restore diagnostic context history and push/pop state for PCH [PR116847]

2024-09-27 Thread Lewis Hyatt
On Fri, Sep 27, 2024 at 9:41 AM David Malcolm  wrote:
>
> On Thu, 2024-09-26 at 23:28 +0200, Jakub Jelinek wrote:
> > Hi!
> >
> > The following patch on top of the just posted cleanup patch
> > saves/restores the m_classification_history and m_push_list
> > vectors for PCH.  Without that as the testcase shows during parsing
> > of the templates we don't report ignored diagnostics, but after
> > loading
> > PCH header when instantiating those templates those warnings can be
> > emitted.  This doesn't show up on x86_64-linux build because
> > configure
> > injects there -fcf-protection -mshstk flags during library build (and
> > so
> > also during PCH header creation), but make check doesn't use those
> > flags
> > and so the PCH header is ignored.
> >
> > Bootstrapped on i686-linux so far, bootstrap/regtest on x86_64-linux
> > and
> > i686-linux still pending, ok for trunk if it passes it?
>
> Thanks, yes please
>
> Dave
>

A couple comments that may be helpful...

-This is also PR 64117 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64117)

-I submitted a patch last year for that but did not get any response
(https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635648.html).
I guess I never pinged it because I am still trying to ping two other
ones :). My patch did not switch to vec so it was not as nice as this
one. I wonder though, if some of the testcases I added could be
incorporated? In particular the testcase from my patch
pragma-diagnostic-3.C I believe will still be failing after this one.
There is an issue with C++ because it processes the pragmas twice,
once in early mode and once in normal mode, that makes it do the wrong
thing for this case:

t.h:

 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored...
 //no pop at end of the file

t.c:

 #include "t.h"
 #pragma GCC diagnostic pop
 //expect to be at the initial state here, but won't be if t.h is a PCH

In my patch I had separated the PCH restore from a more general "state
restore" logic so that the C++ frontend can restore the state after
the first pass through the data.

-Lewis


Re: [PATCH 2/2] HTML Dumping of trees from gdb

2024-09-27 Thread David Malcolm
On Sat, 2024-09-21 at 22:49 -0500, -thor wrote:
> From: thor 
> 
> This patch allows one to dump a tree as HTML from within gdb by
> invoking, e.g.,
>   html-tree tree
> 
> gcc/ChangeLog:
>     * gcc/gdbhooks.py: Rudimentary dumping of GENERIC trees as html
> through
>   one new python function (jsonNodeToHtml) and one new gdb
> command 
>   (html-tree). There is also a parameter to allow html-tree to 
>   automatically open a browser to view the HTML, but that needs a
> fix
>   or workaround that I don't understand.
> 
> Signed-off-by: Thor C Preimesberger 


Hi Thor, thanks for the patch.

I didn't try running it, but I notice that the patch is building the
HTML directly by writing strings to the output file, using python f-
strings, and there's no escaping of values.  Hence if a value contains
characters like '"', '<', or '>' the resulting HTML will be ill-formed
(similar to a SQL injection attack).

You probably need to use html.escape when writing string values from
the JSON into the HTML; see:
  https://docs.python.org/3/library/html.html#html.escape
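
A minimal sketch of what I mean, assuming a small helper in place of the
bare f-string writes.  The name write_pair and the "<br>" separator are
hypothetical, since I'm not reproducing the markup from the patch here:

```python
import html
import io


def write_pair(html_doc, key, value):
    """Write one key/value pair, escaping both sides for use in HTML."""
    # html.escape replaces &, < and >, and by default also the quote
    # characters, so the value cannot break out of the surrounding markup.
    html_doc.write(f"{html.escape(str(key))}: {html.escape(str(value))}<br>")


# Usage with an in-memory file standing in for the real output file.
buf = io.StringIO()
write_pair(buf, "name", '<anon> "x"')
print(buf.getvalue())
```

Routing every write of JSON-derived text through one helper like this keeps
the escaping in a single place.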

Another approach would be to build the HTML as a DOM tree in the python
script, and then serialize that; see e.g.:
  https://docs.python.org/3/library/xml.etree.elementtree.html
for a relatively simple API that's readily available in the Python
standard library - that would be a rewrite of jsonNodeToHtml, but would
probably be more robust in the long term.

Hope this is helpful
Dave

> 
> ---
>  gcc/gdbhooks.py | 113
> 
>  1 file changed, 113 insertions(+)
> 
> diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py
> index 904ee28423a..fff85d738b4 100644
> --- a/gcc/gdbhooks.py
> +++ b/gcc/gdbhooks.py
> @@ -143,6 +143,7 @@ import os.path
>  import re
>  import sys
>  import tempfile
> +import json
>  
>  import gdb
>  import gdb.printing
> @@ -889,6 +890,118 @@ class DotFn(gdb.Command):
>  
>  DotFn()
>  
> +# Quick and dirty way to turn a tree as JSON object into HTML.
> +# Used in treeToHtml.
> +
> +def jsonNodeToHtml(node_list, html_doc):
> +    for node in node_list:
> +    id = node["addr"]
> +    html_doc.write("" % id)
> +    for key, value in node.items():
> +    if (key == "addr"):
> +    html_doc.write("addr:")
> +    html_doc.write(f"")
> +    html_doc.write(f"{value}")
> +    html_doc.write(f"")
> +    if (type(value) == dict):
> +    html_doc.write(f"{key}:")
> +    sub = value
> +    if "ref_addr" in sub.keys():
> +    html_doc.write(f" 10px\">") 
> +    subAddress = sub["ref_addr"]
> +    subCode = sub["tree_code"]
> +    html_doc.write(f"ref_addr:  href=#{subAddress}>{subAddress}")
> +    html_doc.write(f"tree_code: {subCode}")
> +    html_doc.write("")
> +    # Currently, child tree nodes that are referred to
> by OMP
> +    # accsessors are not dumped recursively by
> +    # dump_generic_node_json, i.e. they have no
> corresponding
> +    # entry in node_list. So we just read it out key-
> value pairs.
> +    else:
> +    html_doc.write(f" px\">") 
> +    for key, value in sub.items():
> +    html_doc.write(f"{key}: {value}")
> +    html_doc.write("")
> +    elif (type(value) == list):
> +    html_doc.write(f"{key}:")
> +    html_doc.write(f"") 
> +    for i in value:
> +    for key, value in i.items():
> +    if (key == "ref_addr"):
> +    html_doc.write("ref_addr:")
> +    html_doc.write(f"")
> +    html_doc.write(f"{value}")
> +    html_doc.write(f"")
> +    else:
> +    html_doc.write(f"{key}: {value}")
> +    html_doc.write("")
> +    elif (key != "addr"):
> +    html_doc.write(f"{key}: {value}")
> +    html_doc.write("")
> +
> +class GCChtml (gdb.Parameter):
> +    """
> +    This parameter defines what program is used to view HTML files
> +    by the html-tree command. It will be invoked as gcc-html  file>.
> +    """
> +    def __init__(self):
> +    super(GCChtml, self).__init__('gcc-html',
> +    gdb.COMMAND_NONE, gdb.PARAM_STRING)
> +    self.value = "firefox"
> +
> +gcc_html_cmd = GCChtml()
> +
> +class treeToHtml (gdb.Command):
> +    """
> +    A custom command that converts a tree to html after it is
> +    first parsed to JSON. The html is saved in cwd as  +
> ".html".
> +    
> +    TODO : It'd be nice if we then open the html with the program
> specified
> +    by the GCChtml parameter, but there's an error thrown whenever I
> try
> +    to do this while attached 

Re: [PATCH] c++: compile time evaluation of prvalues [PR116416]

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 27, 2024 at 12:14:47PM +0200, Richard Biener wrote:
> I can investigate a bit when there's a testcase showing the issue.

The testcase is pr78687.C with Marek's cp-gimplify.cc patch.

Or the
struct S { union U { struct T { void *a, *b, *c, *d, *e; } t; struct V {} v; } 
u; unsigned long w; };
void bar (struct S *);

void
foo ()
{
  struct S s = { .u = { .v = {} }, .w = 2 };
  bar (&s);
}
I've mentioned shows the same behavior except for SRA (which
is there of course prevented through the object escaping).
Though, not really sure right now if this reduced testcase
in C or C++ actually isn't supposed to clear the whole object rather than
just the initialized fields and what exactly is CONSTRUCTOR_NO_CLEARING
vs. !CONSTRUCTOR_NO_CLEARING supposed to mean.

On pr78687.C with Marek's patch, CONSTRUCTOR_NO_CLEARING is cleared with
  /* The result of a constexpr function must be completely initialized.

 However, in C++20, a constexpr constructor doesn't necessarily have
 to initialize all the fields, so we don't clear CONSTRUCTOR_NO_CLEARING
 in order to detect reading an unitialized object in constexpr instead
 of value-initializing it.  (reduced_constant_expression_p is expected to
 take care of clearing the flag.)  */
  if (TREE_CODE (result) == CONSTRUCTOR
  && (cxx_dialect < cxx20
  || !DECL_CONSTRUCTOR_P (fun)))
clear_no_implicit_zero (result);

generic.texi says:
"Unrepresented fields will be cleared (zeroed), unless the
CONSTRUCTOR_NO_CLEARING flag is set, in which case their value becomes
undefined."
Now, for RECORD_TYPE, I think the !CONSTRUCTOR_NO_CLEARING meaning is clear,
if the flag isn't set, then if there is no constructor_elt for certain
FIELD_DECL, that FIELD_DECL is implicitly zeroed.  The state of padding bits
is fuzzy, we don't really have a flag whether the CONSTRUCTOR clears also
padding bits (aka C++ zero initialization) or not.
The problem above is with UNION_TYPE.  If the CONSTRUCTOR for it is empty,
that should IMHO still imply zero initialization of the whole thing, we
don't really know which union member is active.  But if the CONSTRUCTOR
has one elt (it should never have more than one), shouldn't that mean
(unless there is a new flag which says that padding bits are cleared too)
that CONSTRUCTOR_NO_CLEARING and !CONSTRUCTOR_NO_CLEARING behave the same,
in particular that the selected FIELD_DECL is initialized to whatever
the initializer is but nothing else is?

The reason why the gimplifier clears the whole struct is because
on (with Marek's patch on the pr78687.C testcase)
D.10137 = 
{._storage={.D.9542={.D.9123={._tail={.D.9181={._tail={.D.9240={._head={}}},
 ._which=2}};
or that
s = {.u={.v={}}, .w=2};
in the above testcase, categorize_ctor_elements yields
valid_const_initializer = true
num_nonzero_elements = 1
num_unique_nonzero_elements = 1
num_ctor_elements = 1
complete_p = false
- there is a single non-empty initializer in both CONSTRUCTORs,
they aren't CONSTRUCTOR_NO_CLEARING and the reason complete_p is false
is that categorize_ctor_elements_1 does:
  if (*p_complete && !complete_ctor_at_level_p (TREE_TYPE (ctor),
num_fields, elt_type))
*p_complete = 0;
  else if (*p_complete > 0
   && type_has_padding_at_level_p (TREE_TYPE (ctor)))
*p_complete = -1;
and for UNION_TYPE/QUAL_UNION_TYPE complete_ctor_at_level_p does:
  if (num_elts == 0)
return false;
...
  /* ??? We could look at each element of the union, and find the
 largest element.  Which would avoid comparing the size of the
 initialized element against any tail padding in the union.
 Doesn't seem worth the effort...  */
  return simple_cst_equal (TYPE_SIZE (type), TYPE_SIZE (last_type)) == 1;
Now, given the documentation of categorize_ctor_elements:
   * whether the constructor is complete -- in the sense that every
 meaningful byte is explicitly given a value --
 and place it in *P_COMPLETE:
 -  0 if any field is missing
 -  1 if all fields are initialized, and there's no padding
 - -1 if all fields are initialized, but there's padding
I'd argue this handling of UNION_TYPE/QUAL_UNION_TYPE is wrong
(though note that type_has_padding_at_level_p returns true if any of the
union members is smaller than the whole, rather than checking whether
the actually initialized one has the same size as whole), it will
set *p_complete to 0 as if any field is missing, even when no field
is missing, just the union has padding.

So, I think we should go with (but so far completely untested except
for pr78687.C which is optimized with Marek's patch and the above testcase
which doesn't have the clearing anymore) the following patch.

2024-09-27  Jakub Jelinek  

PR c++/116416
* expr.cc (categorize_ctor_elements_1): Fix up union handling of
*p_complete.  Clear it only if num_fields is 0 and the union has
at least one FIELD_DECL, se

[committed] i386: Modernize AMD processor types

2024-09-27 Thread Uros Bizjak
Use iterative PTA definitions for members of the same AMD processor family.

Also, fix a couple of related M_CPU_TYPE/M_CPU_SUBTYPE inconsistencies.

No functional changes intended.

gcc/ChangeLog:

* config/i386/i386.h: Add PTA_BDVER1, PTA_BDVER2, PTA_BDVER3,
PTA_BDVER4, PTA_BTVER1 and PTA_BTVER2.
* common/config/i386/i386-common.cc (processor_alias_table)
<"bdver1">: Use PTA_BDVER1.
<"bdver2">: Use PTA_BDVER2.
<"bdver3">: Use PTA_BDVER3.
<"bdver4">: Use PTA_BDVER4.
<"btver1">: Use PTA_BTVER1.  Use M_CPU_TYPE (AMD_BTVER1).
<"btver2">: Use PTA_BTVER2.
<"shanghai">: Use M_CPU_SUBTYPE (AMDFAM10H_SHANGHAI).
<"istanbul">: Use M_CPU_SUBTYPE (AMDFAM10H_ISTANBUL).

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc
index fb744319b05..3f2fc599009 100644
--- a/gcc/common/config/i386/i386-common.cc
+++ b/gcc/common/config/i386/i386-common.cc
@@ -2348,34 +2348,16 @@ const pta processor_alias_table[] =
   | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_PRFCHW | PTA_FXSR,
 M_CPU_SUBTYPE (AMDFAM10H_BARCELONA), P_PROC_DYNAMIC},
   {"bdver1", PROCESSOR_BDVER1, CPU_BDVER1,
-PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
-  | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1
-  | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_FMA4
-  | PTA_XOP | PTA_LWP | PTA_PRFCHW | PTA_FXSR | PTA_XSAVE,
-M_CPU_TYPE (AMDFAM15H_BDVER1), P_PROC_XOP},
+PTA_BDVER1,
+M_CPU_SUBTYPE (AMDFAM15H_BDVER1), P_PROC_XOP},
   {"bdver2", PROCESSOR_BDVER2, CPU_BDVER2,
-PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
-  | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1
-  | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_FMA4
-  | PTA_XOP | PTA_LWP | PTA_BMI | PTA_TBM | PTA_F16C
-  | PTA_FMA | PTA_PRFCHW | PTA_FXSR | PTA_XSAVE,
-M_CPU_TYPE (AMDFAM15H_BDVER2), P_PROC_FMA},
+PTA_BDVER2,
+M_CPU_SUBTYPE (AMDFAM15H_BDVER2), P_PROC_FMA},
   {"bdver3", PROCESSOR_BDVER3, CPU_BDVER3,
-PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
-  | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1
-  | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_FMA4
-  | PTA_XOP | PTA_LWP | PTA_BMI | PTA_TBM | PTA_F16C
-  | PTA_FMA | PTA_PRFCHW | PTA_FXSR | PTA_XSAVE
-  | PTA_XSAVEOPT | PTA_FSGSBASE,
+PTA_BDVER3,
 M_CPU_SUBTYPE (AMDFAM15H_BDVER3), P_PROC_FMA},
   {"bdver4", PROCESSOR_BDVER4, CPU_BDVER4,
-PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
-  | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1
-  | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_AVX2
-  | PTA_FMA4 | PTA_XOP | PTA_LWP | PTA_BMI | PTA_BMI2
-  | PTA_TBM | PTA_F16C | PTA_FMA | PTA_PRFCHW | PTA_FXSR
-  | PTA_XSAVE | PTA_XSAVEOPT | PTA_FSGSBASE | PTA_RDRND
-  | PTA_MOVBE | PTA_MWAITX,
+PTA_BDVER4,
 M_CPU_SUBTYPE (AMDFAM15H_BDVER4), P_PROC_AVX2},
   {"znver1", PROCESSOR_ZNVER1, CPU_ZNVER1,
 PTA_ZNVER1,
@@ -2393,16 +2375,10 @@ const pta processor_alias_table[] =
 PTA_ZNVER5,
 M_CPU_SUBTYPE (AMDFAM1AH_ZNVER5), P_PROC_AVX512F},
   {"btver1", PROCESSOR_BTVER1, CPU_GENERIC,
-PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
-  | PTA_SSSE3 | PTA_SSE4A | PTA_ABM | PTA_CX16 | PTA_PRFCHW
-  | PTA_FXSR | PTA_XSAVE,
-   M_CPU_SUBTYPE (AMDFAM15H_BDVER1), P_PROC_SSE4_A},
+PTA_BTVER1,
+M_CPU_TYPE (AMD_BTVER1), P_PROC_SSE4_A},
   {"btver2", PROCESSOR_BTVER2, CPU_BTVER2,
-PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
-  | PTA_SSSE3 | PTA_SSE4A | PTA_ABM | PTA_CX16 | PTA_SSE4_1
-  | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX
-  | PTA_BMI | PTA_F16C | PTA_MOVBE | PTA_PRFCHW
-  | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT,
+PTA_BTVER2,
 M_CPU_TYPE (AMD_BTVER2), P_PROC_BMI},
 
   {"generic", PROCESSOR_GENERIC, CPU_GENERIC,
@@ -2421,9 +2397,9 @@ const pta processor_alias_table[] =
   {"amdfam19h", PROCESSOR_GENERIC, CPU_GENERIC, 0,
 M_CPU_TYPE (AMDFAM19H), P_NONE},
   {"shanghai", PROCESSOR_GENERIC, CPU_GENERIC, 0,
-M_CPU_TYPE (AMDFAM10H_SHANGHAI), P_NONE},
+M_CPU_SUBTYPE (AMDFAM10H_SHANGHAI), P_NONE},
   {"istanbul", PROCESSOR_GENERIC, CPU_GENERIC, 0,
-M_CPU_TYPE (AMDFAM10H_ISTANBUL), P_NONE},
+M_CPU_SUBTYPE (AMDFAM10H_ISTANBUL), P_NONE},
 };
 
 /* NB: processor_alias_table stops at the "generic" entry.  */
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 58e6f2826bf..82177b9d383 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2429,6 +2429,18 @@ constexpr wide_int_bitmask PTA_CLEARWATERFOREST = PTA_SIERRAFOREST
   | PTA_AVXVNNIINT16 | PTA_SHA512 | PTA_SM3 | PTA_SM4 | PTA_USER_MSR
   | PTA_PREFETCHI;
 constexpr wide_int_bitmask PTA_PANTHERLAKE = PTA_ARROWLAKE_S | PTA_PREFETCHI;
+
+constexpr wide_int_bitmask PTA_BDVER1 = PTA_64BIT | PTA_MMX | PTA_SSE
+  | PTA_SSE2 | PTA_SSE3 | PTA_

Re: [PATCH] arm: Force flag_pic for FDPIC

2024-09-27 Thread Richard Earnshaw (lists)
On 26/09/2024 19:21, Ramana Radhakrishnan wrote:
> On Mon, Mar 4, 2024 at 1:43 PM Fangrui Song  wrote:
>>
>> From: Fangrui Song 
>>
>> -fno-pic -mfdpic generated code is like regular -fno-pic, not suitable
>> for FDPIC (absolute addressing for symbol references and no function
>> descriptor).  The sh port simply upgrades -fno-pic to -fpie by setting
>> flag_pic.  Let's follow suit.
>>
>> Link: 
>> https://inbox.sourceware.org/gcc-patches/20150913165303.gc17...@brightrain.aerifal.cx/
>>
>> gcc/ChangeLog:
>>
>> * config/arm/arm.cc (arm_option_override): Set flag_pic if
>>   TARGET_FDPIC.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/arm/fdpic-pie.c: New test.
>> ---
>>  gcc/config/arm/arm.cc|  6 +
>>  gcc/testsuite/gcc.target/arm/fdpic-pie.c | 30 
>>  2 files changed, 36 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.target/arm/fdpic-pie.c
>>
>> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
>> index 1cd69268ee9..f2fd3cce48c 100644
>> --- a/gcc/config/arm/arm.cc
>> +++ b/gcc/config/arm/arm.cc
>> @@ -3682,6 +3682,12 @@ arm_option_override (void)
>>arm_pic_register = FDPIC_REGNUM;
>>if (TARGET_THUMB1)
>> sorry ("FDPIC mode is not supported in Thumb-1 mode");
>> +
>> +  /* FDPIC code is a special form of PIC, and the vast majority of code
>> +generation constraints that apply to PIC also apply to FDPIC, so we
>> + set flag_pic to avoid the need to check TARGET_FDPIC everywhere
>> + flag_pic is checked. */
>> +  flag_pic = 2;
>>  }
> 
> Been a while since I looked at this stuff but should this not be
> flag_pie being set rather than flag_pic here if the expectation is to
> turn -fno-PIC -mfdpic into fPIE ?

-fPIE implies -fPIC, but has the added implication that any definition of an 
object we see is the one true definition and cannot be pre-empted during 
loading (in a shared library, a definition of X may be pre-empted by another 
definition of X in either the application itself or another shared library that 
was loaded first).

Part of the confusion comes from the manual, though:

Select the FDPIC ABI, which uses 64-bit function descriptors to
represent pointers to functions.  When the compiler is configured for
@code{arm-*-uclinuxfdpiceabi} targets, this option is on by default
and implies @option{-fPIE} if none of the PIC/PIE-related options is
provided.  On other targets, it only enables the FDPIC-specific code
generation features, and the user should explicitly provide the
PIC/PIE-related options as needed.

Which conflates things relating to the option flag and the compiler 
configuration.  I think that needs clearing up as well.  Something like

Select the FDPIC ABI, which uses 64-bit function descriptors to
represent pointers to functions.  @option{-mfdpic} implies @option{-fPIC}.

When the compiler is configured for @code{arm-*-uclinuxfdpiceabi} targets, 
this option is on by default and the compiler defaults to @option{-fPIE}, 
unless @option{-fPIC} is explicitly specified.

might cover it, but I'm not sure I've fully untangled the web of option 
permutations here.  Perhaps someone could tabulate the expected options 
somewhere for clarity.

The other option would be to error if flag_pic is not set when -mfdpic is set, 
which would force the user to be explicit as to which pic options they want 
(technically the explicit combination -fno-pic -mfdpic has no meaning).

R.

> 
> 
>>
>>if (arm_pic_register_string != NULL)
>> diff --git a/gcc/testsuite/gcc.target/arm/fdpic-pie.c b/gcc/testsuite/gcc.target/arm/fdpic-pie.c
>> new file mode 100644
>> index 000..909db8bce74
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/fdpic-pie.c
>> @@ -0,0 +1,30 @@
>> +// { dg-do compile }
>> +// { dg-options "-O2 -fno-pic -mfdpic" }
>> +// { dg-skip-if "-mpure-code and -fPIC incompatible" { *-*-* } { "-mpure-code" } }
>> +
>> +__attribute__((visibility("hidden"))) void hidden_fun(void);
>> +void fun(void);
>> +__attribute__((visibility("hidden"))) extern int hidden_var;
>> +extern int var;
>> +__attribute__((visibility("hidden"))) const int ro_hidden_var = 42;
>> +
>> +// { dg-final { scan-assembler "hidden_fun\\(GOTOFFFUNCDESC\\)" } }
>> +void *addr_hidden_fun(void) { return hidden_fun; }
>> +
>> +// { dg-final { scan-assembler "fun\\(GOTFUNCDESC\\)" } }
>> +void *addr_fun(void) { return fun; }
>> +
>> +// { dg-final { scan-assembler "hidden_var\\(GOT\\)" } }
>> +void *addr_hidden_var(void) { return &hidden_var; }
>> +
>> +// { dg-final { scan-assembler "var\\(GOT\\)" } }
>> +void *addr_var(void) { return &var; }
>> +
>> +// { dg-final { scan-assembler ".LANCHOR0\\(GOT\\)" } }
>> +const int *addr_ro_hidden_var(void) { return &ro_hidden_var; }
>> +
>> +// { dg-final { scan-assembler "hidden_var\\(GOT\\)" } }
>> +int read_hidden_var(void) { return hidden_var; }
>> +
>> +// { dg-final { scan-assembler "var\\(GOT\\)" } }
>> +int read_var(void

[PATCH] RISC-V/libgcc: Save/Restore routines for E goes with ABI.

2024-09-27 Thread Tsung Chun Lin
Hi,

This is my first patch for GCC. If there are any problems, please let me know.


0001-RISC-V-libgcc-Save-Restore-routines-for-E-goes-with-.patch
Description: Binary data


[PATCH 1/3] bpf: make sure CO-RE relocs are never typed with a BTF_KIND_CONST

2024-09-27 Thread Cupertino Miranda
Based on observation within bpf-next selftests and comparison of GCC
and clang compiled code, the BPF loader expects all CO-RE relocations to
point to BTF non-const type nodes.
---
 gcc/btfout.cc |  2 +-
 gcc/config/bpf/btfext-out.cc  |  6 
 gcc/ctfc.h|  2 ++
 .../gcc.target/bpf/core-attr-const.c  | 32 +++
 4 files changed, 41 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/bpf/core-attr-const.c

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 8b91bde8798..24f62ec1a52 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -167,7 +167,7 @@ get_btf_kind (uint32_t ctf_kind)
 
 /* Convenience wrapper around get_btf_kind for the common case.  */
 
-static uint32_t
+uint32_t
 btf_dtd_kind (ctf_dtdef_ref dtd)
 {
   if (!dtd)
diff --git a/gcc/config/bpf/btfext-out.cc b/gcc/config/bpf/btfext-out.cc
index 095c35b894b..655da23066d 100644
--- a/gcc/config/bpf/btfext-out.cc
+++ b/gcc/config/bpf/btfext-out.cc
@@ -320,6 +320,12 @@ bpf_core_reloc_add (const tree type, const char * section_name,
   ctf_container_ref ctfc = ctf_get_tu_ctfc ();
   ctf_dtdef_ref dtd = ctf_lookup_tree_type (ctfc, type);
 
+  /* Make sure CO-RE type is never the const version.  */
+  if (btf_dtd_kind (dtd) == BTF_KIND_CONST
+  && kind >= BPF_RELO_FIELD_BYTE_OFFSET
+  && kind <= BPF_RELO_FIELD_RSHIFT_U64)
+dtd = dtd->ref_type;
+
   /* Buffer the access string in the auxiliary strtab.  */
   bpfcr->bpfcr_astr_off = 0;
   gcc_assert (accessor != NULL);
diff --git a/gcc/ctfc.h b/gcc/ctfc.h
index 41e1169f271..e5967f590f9 100644
--- a/gcc/ctfc.h
+++ b/gcc/ctfc.h
@@ -465,4 +465,6 @@ extern void btf_mark_type_used (tree);
 extern int ctfc_get_dtd_srcloc (ctf_dtdef_ref, ctf_srcloc_ref);
 extern int ctfc_get_dvd_srcloc (ctf_dvdef_ref, ctf_srcloc_ref);
 
+extern uint32_t btf_dtd_kind (ctf_dtdef_ref dtd);
+
 #endif /* GCC_CTFC_H */
diff --git a/gcc/testsuite/gcc.target/bpf/core-attr-const.c b/gcc/testsuite/gcc.target/bpf/core-attr-const.c
new file mode 100644
index 000..34a4a9cc5e8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/core-attr-const.c
@@ -0,0 +1,32 @@
+/* Test to make sure CO-RE access relocs point to non const versions of the
+   type.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O0 -dA -gbtf -mco-re -masm=normal" } */
+
+struct S {
+  int a;
+  int b;
+  int c;
+} __attribute__((preserve_access_index));
+
+void
+func (struct S * s)
+{
+  int *x;
+  int *y;
+  const struct S *cs = s;
+
+  /* 0:2 */
+  x = &(s->c);
+
+  /* 0:2 */
+  y = (int *) &(cs->c);
+
+  *x = 4;
+  *y = 4;
+}
+
+/* Both const and non const struct type should have the same bpfcr_type. */
+/* { dg-final { scan-assembler-times "0x1\t# bpfcr_type \\(struct S\\)" 1 } } */
+/* { dg-final { scan-assembler-times "0x1\t# bpfcr_type \\(const struct S\\)" 1 } } */
-- 
2.39.5



[PATCH 2/3] bpf: calls do not promote attr access_index on lhs

2024-09-27 Thread Cupertino Miranda
When traversing gimple to introduce CO-RE relocation entries to
expressions that are accesses to attributed preserve_access_index types,
the access is likely to be split in multiple gimple statements.
In order to keep doing the proper CO-RE conversion we will need to mark
the LHS tree nodes of gimple expressions as explicit CO-RE accesses,
such that the gimple traverser will further convert the sub-expressions.

This patch makes sure that this LHS marking will not happen in case the
gimple statement is a function call, in which case it is no longer
expected to keep generating CO-RE accesses for the remainder of the
expression.
---
 gcc/config/bpf/core-builtins.cc   |  1 +
 .../gcc.target/bpf/core-attr-calls.c  | 49 +++
 2 files changed, 50 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/bpf/core-attr-calls.c

diff --git a/gcc/config/bpf/core-builtins.cc b/gcc/config/bpf/core-builtins.cc
index 86e2e9d6e39..cdfb356660e 100644
--- a/gcc/config/bpf/core-builtins.cc
+++ b/gcc/config/bpf/core-builtins.cc
@@ -1822,6 +1822,7 @@ make_gimple_core_safe_access_index (tree *tp,
 
   tree lhs;
   if (!wi->is_lhs
+ && gimple_code (wi->stmt) != GIMPLE_CALL
  && (lhs = gimple_get_lhs (wi->stmt)) != NULL_TREE)
core_mark_as_access_index (lhs);
 }
diff --git a/gcc/testsuite/gcc.target/bpf/core-attr-calls.c b/gcc/testsuite/gcc.target/bpf/core-attr-calls.c
new file mode 100644
index 000..87290c5c211
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/core-attr-calls.c
@@ -0,0 +1,49 @@
+/* Test for BPF CO-RE __attribute__((preserve_access_index)) with accesses on
+   LHS and both LHS and RHS of assignment with calls involved.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -dA -gbtf -mco-re -masm=normal" } */
+
+struct U {
+  int c;
+  struct V {
+int d;
+int e[4];
+int f;
+int *g;
+  } v;
+};
+
+struct T {
+  int a;
+  int b;
+  struct U u;
+  struct U *ptr_u;
+  struct U *array_u;
+} __attribute__((preserve_access_index));
+
+extern struct U *get_other_u(struct U *);
+extern struct V *get_other_v(struct V *);
+
+void
+func (struct T *t, int i)
+{
+  /* Since we are using the builtin all accesses are converted to CO-RE.  */
+  /* 0:30:0   */
+  __builtin_preserve_access_index(({ get_other_u(t->ptr_u)->c = 42; }));
+
+  /* This should not pass-through CO-RE accesses beyond the call since struct U
+ is not explicitly marked with preserve_access_index. */
+  /* 0:3  */
+  get_other_u(t->ptr_u)->c = 43;
+
+  /* 0:2:1  */
+  get_other_v(&t->u.v)->d = 44;
+}
+
+/* { dg-final { scan-assembler-times "bpfcr_astr_off \\(\"0:3\"\\)" 2 } } */
+/* { dg-final { scan-assembler-times "bpfcr_astr_off \\(\"0:0\"\\)" 1 } } */
+/* { dg-final { scan-assembler-times "bpfcr_astr_off \\(\"0:2:1\"\\)" 1 } } */
+/* { dg-final { scan-assembler-times "bpfcr_type \\(struct T\\)" 3 } } */
+/* { dg-final { scan-assembler-times "bpfcr_type \\(struct U\\)" 1 } } */
+
-- 
2.39.5



[PATCH 0/3] bpf: CO-RE fixes

2024-09-27 Thread Cupertino Miranda


Hi everyone,

This patch series includes fixes for bugs uncovered when executing
bpf-next selftests.
Looking forward to your comments.

Regards,
Cupertino

Cupertino Miranda (3):
  bpf: make sure CO-RE relocs are never typed with a BTF_KIND_CONST
  bpf: calls do not promote attr access_index on lhs
  bpf: set index entry for a VAR_DECL in CO-RE relocs

 gcc/btfout.cc |  2 +-
 gcc/config/bpf/btfext-out.cc  |  6 +++
 gcc/config/bpf/core-builtins.cc   |  6 ++-
 gcc/ctfc.h|  2 +
 .../gcc.target/bpf/core-attr-calls.c  | 49 +++
 .../gcc.target/bpf/core-attr-const.c  | 32 
 gcc/testsuite/gcc.target/bpf/core-builtin-1.c | 16 +++---
 gcc/testsuite/gcc.target/bpf/core-builtin-2.c |  3 +-
 .../gcc.target/bpf/core-builtin-exprlist-1.c  | 16 +++---
 9 files changed, 113 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/bpf/core-attr-calls.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/core-attr-const.c

-- 
2.39.5



Ping: [PATCH] PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

2024-09-27 Thread Michael Meissner
This patch seems to have been overlooked.

https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663101.html

I ran a set of spec 2017 benchmarks with this patch applied and compared it to
a run without the patch applied.  There were no regressions, but 3 benchmarks
had slight improvement in runtime with this patch applied on a power10 system:

505.mcf_r 101.67%
520.omnetpp_r 103.35%
523.xalancbmk_r   101.15%

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [Fortran, Patch, PR81265, v1] Fix passing coarrays always w/ descriptor

2024-09-27 Thread Steve Kargl
On Fri, Sep 27, 2024 at 08:12:01PM +0200, Andre Vehreschild wrote:
> 
> the testcase is in the coarray directory, where tests are executed with
> -fcoarray=single and lib. I don't know about none, because the code stops
> compiling when it encounters a coarray with no single or lib. Therefore I
> suppose there is no way to run it without coarrays.
> 

Thanks for the clarification. And, yes, -fcoarray=none should fail. 
I forgot about that!

-- 
Steve


Re: [PATCH 2/2] c++: Implement resolution for DR 36 [PR116160]

2024-09-27 Thread Jason Merrill

On 9/19/24 7:56 PM, Nathaniel Shead wrote:

Noticed how to fix this while working on the other patch.
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?


OK.


-- >8 --

This implements part of P1787 to no longer complain about redeclaring an
entity via using-decl other than in a class scope.

PR c++/116160

gcc/cp/ChangeLog:

* name-lookup.cc (supplement_binding): Allow redeclaration via
USING_DECL if not in class scope.
(do_nonmember_using_decl): Remove function-scope exemption.
(push_using_decl_bindings): Remove outdated comment.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/using-enum-3.C: No longer expect an error.
* g++.dg/lookup/using53.C: Remove XFAIL.
* g++.dg/cpp2a/using-enum-11.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/name-lookup.cc  | 12 +++-
  gcc/testsuite/g++.dg/cpp0x/using-enum-3.C  |  2 +-
  gcc/testsuite/g++.dg/cpp2a/using-enum-11.C |  9 +
  gcc/testsuite/g++.dg/lookup/using53.C  |  2 +-
  4 files changed, 18 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/using-enum-11.C

diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
index 94b031e6be2..22a1c6aac8c 100644
--- a/gcc/cp/name-lookup.cc
+++ b/gcc/cp/name-lookup.cc
@@ -2874,6 +2874,12 @@ supplement_binding (cxx_binding *binding, tree decl)
 "%<-std=c++2c%> or %<-std=gnu++2c%>");
binding->value = name_lookup::ambiguous (decl, binding->value);
  }
+  else if (binding->scope->kind != sk_class
+  && TREE_CODE (decl) == USING_DECL
+  && decls_match (target_bval, target_decl))
+/* Since P1787 (DR 36) it is OK to redeclare entities via using-decl,
+   except in class scopes.  */
+ok = false;
else
  {
if (!error_operand_p (bval))
@@ -5375,8 +5381,7 @@ do_nonmember_using_decl (name_lookup &lookup, bool fn_scope_p,
else if (value
   /* Ignore anticipated builtins.  */
   && !anticipated_builtin_p (value)
-  && (fn_scope_p
-  || !decls_match (lookup.value, strip_using_decl (value))))
+  && !decls_match (lookup.value, strip_using_decl (value)))
  {
diagnose_name_conflict (lookup.value, value);
failed = true;
@@ -6648,9 +6653,6 @@ push_using_decl_bindings (name_lookup *lookup, tree name, tree value)
type = binding->type;
  }
  
-  /* DR 36 questions why using-decls at function scope may not be
- duplicates.  Disallow it, as C++11 claimed and PR 20420
- implemented.  */
if (lookup)
  do_nonmember_using_decl (*lookup, true, true, &value, &type);
  
diff --git a/gcc/testsuite/g++.dg/cpp0x/using-enum-3.C b/gcc/testsuite/g++.dg/cpp0x/using-enum-3.C
index 34f8bf4fa0b..4638181c63c 100644
--- a/gcc/testsuite/g++.dg/cpp0x/using-enum-3.C
+++ b/gcc/testsuite/g++.dg/cpp0x/using-enum-3.C
@@ -9,7 +9,7 @@
  void f ()
  {
enum e { a };
-  using e::a;  // { dg-error "redeclaration" }
+  using e::a;  // { dg-bogus "redeclaration" "P1787" }
// { dg-error "enum" "" { target { ! c++2a } } .-1 }
  }
  
diff --git a/gcc/testsuite/g++.dg/cpp2a/using-enum-11.C b/gcc/testsuite/g++.dg/cpp2a/using-enum-11.C
new file mode 100644
index 000..ff99ed422d5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/using-enum-11.C
@@ -0,0 +1,9 @@
+// PR c++/116160
+// { dg-do compile { target c++20 } }
+
+enum class Blah { b };
+void foo() {
+  using Blah::b;
+  using Blah::b;
+  using enum Blah;
+}
diff --git a/gcc/testsuite/g++.dg/lookup/using53.C b/gcc/testsuite/g++.dg/lookup/using53.C
index e91829e939a..8279c73bfc4 100644
--- a/gcc/testsuite/g++.dg/lookup/using53.C
+++ b/gcc/testsuite/g++.dg/lookup/using53.C
@@ -52,5 +52,5 @@ void
  f ()
  {
using N::i;
-  using N::i;   // { dg-bogus "conflicts" "See P1787 (CWG36)" { xfail *-*-* } }
+  using N::i;   // { dg-bogus "conflicts" "See P1787 (CWG36)" }
  }




Re: [PATCH] c++: ICE with structured bindings and m-d array [PR102594]

2024-09-27 Thread Jason Merrill

On 9/5/24 6:32 PM, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14?


OK.


-- >8 --
We ICE in decay_conversion with this test:

   struct S {
 S() {}
   };
   S arr3[1][1];
   auto [m](arr3);

But not when the last line is:

   auto [n] = arr3;

Therefore the difference is between copy- and direct-init.  In
particular, in build_vec_init we have:

   if (direct_init)
 from = build_tree_list (NULL_TREE, from);

and then we call build_vec_init again with init==from.  Then
decay_conversion gets the TREE_LIST and it crashes.

build_aggr_init has:

   /* Wrap the initializer in a CONSTRUCTOR so that build_vec_init
  recognizes it as direct-initialization.  */
   init = build_constructor_single (init_list_type_node,
NULL_TREE, init);
   CONSTRUCTOR_IS_DIRECT_INIT (init) = true;

so I propose to do the same in build_vec_init.
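
As a side illustration, the two initialization forms discussed above can
be seen on a plain int array, which was never affected by the ICE.  This
is only a sketch: sum_bindings is a hypothetical helper name that does
not appear in the patch or the testsuite, and a C++17 compiler is assumed.

```cpp
#include <cassert>

// Hypothetical helper, not from the patch: binds the elements of a local
// array twice, once with direct-initialization syntax and once with
// copy-initialization syntax, and returns the sum of everything bound.
inline int sum_bindings()
{
  int arr[2] = {1, 2};
  auto [a, b](arr);   // direct-init form: binds to an element-wise copy
  auto [c, d] = arr;  // copy-init form: also binds to an element-wise copy
  arr[0] = 100;       // later writes to arr do not affect either copy
  return a + b + c + d;
}
```

Both forms copy the array element by element, so the write to arr[0]
after the declarations does not change the bound values.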

PR c++/102594

gcc/cp/ChangeLog:

* init.cc (build_vec_init): Build up a CONSTRUCTOR to signal
direct-initialization rather than a TREE_LIST.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/decomp61.C: New test.
---
  gcc/cp/init.cc|  8 +++-
  gcc/testsuite/g++.dg/cpp1z/decomp61.C | 28 +++
  2 files changed, 35 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/decomp61.C

diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index be7fdb40dd6..f785015e477 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -4958,7 +4958,13 @@ build_vec_init (tree base, tree maxindex, tree init,
  if (xvalue)
from = move (from);
  if (direct_init)
-   from = build_tree_list (NULL_TREE, from);
+   {
+ /* Wrap the initializer in a CONSTRUCTOR so that
+build_vec_init recognizes it as direct-initialization.  */
+ from = build_constructor_single (init_list_type_node,
+  NULL_TREE, from);
+ CONSTRUCTOR_IS_DIRECT_INIT (from) = true;
+   }
}
  else
from = NULL_TREE;
diff --git a/gcc/testsuite/g++.dg/cpp1z/decomp61.C b/gcc/testsuite/g++.dg/cpp1z/decomp61.C
new file mode 100644
index 000..ad0a20c1add
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/decomp61.C
@@ -0,0 +1,28 @@
+// PR c++/102594
+// { dg-do compile { target c++17 } }
+
+struct S {
+  S() {}
+};
+S arr1[2];
+S arr2[2][1];
+S arr3[1][1];
+auto [m](arr3);
+auto [n] = arr3;
+
+struct X {
+  int i;
+};
+
+void
+g (X x)
+{
+  auto [a, b](arr2);
+  auto [c, d] = arr2;
+  auto [e, f] = (arr2);
+  auto [i, j](arr1);
+  auto [k, l] = arr1;
+  auto [m, n] = (arr1);
+  auto [z] = x;
+  auto [y](x);
+}

base-commit: b567e5ead5d54f022c57b48f31653f6ae6ece007




Re: [PATCH] c++: concept in default argument [PR109859]

2024-09-27 Thread Jason Merrill

On 9/18/24 5:06 PM, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
1) We're hitting the assert in cp_parser_placeholder_type_specifier.
It says that if it turns out to be false, we should do error() instead.
Do so, then.

2) lambda-targ8.C should compile fine, though.  The problem was that
local_variables_forbidden_p wasn't cleared when we're about to parse
the optional template-parameter-list for a lambda in a default argument.

PR c++/109859

gcc/cp/ChangeLog:

* parser.cc (cp_parser_lambda_declarator_opt): Temporarily clear
local_variables_forbidden_p.
(cp_parser_placeholder_type_specifier): Turn an assert into an error.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-defarg3.C: New test.
* g++.dg/cpp2a/lambda-targ8.C: New test.
---
  gcc/cp/parser.cc  |  9 +++--
  gcc/testsuite/g++.dg/cpp2a/concepts-defarg3.C |  8 
  gcc/testsuite/g++.dg/cpp2a/lambda-targ8.C | 10 ++
  3 files changed, 25 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-defarg3.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-targ8.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 4dd9474cf60..bdc4fef243a 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -11891,6 +11891,11 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, tree lambda_expr)
 "lambda templates are only available with "
 "%<-std=c++20%> or %<-std=gnu++20%>");
  
+  /* Even though the whole lambda may be a default argument, its
+template-parameter-list is a context where it's OK to create
+new parameters.  */
+  auto lvf = make_temp_override (parser->local_variables_forbidden_p, 0u);
+
cp_lexer_consume_token (parser->lexer);
  
template_param_list = cp_parser_template_parameter_list (parser);

@@ -20978,8 +20983,8 @@ cp_parser_placeholder_type_specifier (cp_parser *parser, location_t loc,
/* In a default argument we may not be creating new parameters.  */
if (parser->local_variables_forbidden_p & LOCAL_VARS_FORBIDDEN)
{
- /* If this assert turns out to be false, do error() instead.  */
- gcc_assert (tentative);
+ if (!tentative)
+   error_at (loc, "local variables may not appear in this context");


There's no local variable in the new testcase, the error should talk 
about a concept-name.



  return error_mark_node;
}
return build_constrained_parameter (con, proto, args);
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-defarg3.C b/gcc/testsuite/g++.dg/cpp2a/concepts-defarg3.C
new file mode 100644
index 000..1d689dc8f30
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-defarg3.C
@@ -0,0 +1,8 @@
+// PR c++/109859
+// { dg-do compile { target c++20 } }
+
+template
+concept C = true;
+
+template  // { dg-error "may not appear in this context" }
+int f();
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-targ8.C b/gcc/testsuite/g++.dg/cpp2a/lambda-targ8.C
new file mode 100644
index 000..3685b0ef880
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/lambda-targ8.C
@@ -0,0 +1,10 @@
+// PR c++/109859
+// { dg-do compile { target c++20 } }
+
+template
+concept A = true;
+
+template {}>
+int x;
+
+void g() { (void) x<>; }

base-commit: cc62b2c3da118f08f71d2ae9c08bafb55b35767a




[committed] libstdc++: Fix -Wsign-compare warning in std::string::resize_and_overwrite

2024-09-27 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* include/bits/basic_string.tcc (resize_and_overwrite): Fix
-Wsign-compare warning.
* include/bits/cow_string.h (resize_and_overwrite): Likewise.
---
 libstdc++-v3/include/bits/basic_string.tcc | 2 +-
 libstdc++-v3/include/bits/cow_string.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/basic_string.tcc b/libstdc++-v3/include/bits/basic_string.tcc
index 2c17d258bfe..caeddaf2f5b 100644
--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -611,7 +611,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   static_assert(__gnu_cxx::__is_integer_nonstrict::__value,
"resize_and_overwrite operation must return an integer");
 #endif
-  _GLIBCXX_DEBUG_ASSERT(__r >= 0 && __r <= __n);
+  _GLIBCXX_DEBUG_ASSERT(__r >= 0 && size_type(__r) <= __n);
   __term._M_r = size_type(__r);
   if (__term._M_r > __n)
__builtin_unreachable();
diff --git a/libstdc++-v3/include/bits/cow_string.h b/libstdc++-v3/include/bits/cow_string.h
index b78aa74fbfa..087ddf81dd8 100644
--- a/libstdc++-v3/include/bits/cow_string.h
+++ b/libstdc++-v3/include/bits/cow_string.h
@@ -3800,7 +3800,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   static_assert(__gnu_cxx::__is_integer_nonstrict::__value,
"resize_and_overwrite operation must return an integer");
 #endif
-  _GLIBCXX_DEBUG_ASSERT(__r >= 0 && __r <= __n);
+  _GLIBCXX_DEBUG_ASSERT(__r >= 0 && size_type(__r) <= __n);
   __term._M_r = size_type(__r);
   if (__term._M_r > __n)
__builtin_unreachable();
-- 
2.46.1
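
For readers unfamiliar with the warning, the pattern in the debug
assertion above can be sketched in isolation.  The name in_range is
hypothetical and the sketch assumes the returned integer type is no wider
than std::size_t; it is not the library's code.  The sign check already
guards the comparison, so the cast should only remove the mixed-sign
operands that -Wsign-compare reports.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the debug check in the patch: R is whatever
// integer type the caller's operation returned, n is the unsigned limit.
template<typename R>
constexpr bool in_range(R r, std::size_t n)
{
  // Check the sign first, then convert.  Once r >= 0 is known, the cast
  // cannot change the value (assuming R is no wider than std::size_t),
  // and both operands of <= have the same unsigned type.
  return r >= 0 && static_cast<std::size_t>(r) <= n;
}
```

Since && short-circuits, a negative r is rejected before the cast is
ever evaluated.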



Re: [PATCH v3 4/4] tree-optimization/116024 - simplify some cases of X +- C1 cmp C2

2024-09-27 Thread Artemiy Volkov
On 9/27/2024 1:29 PM, Richard Biener wrote:
> On Mon, 23 Sep 2024, Artemiy Volkov wrote:
> 
>> Whenever C1 and C2 are integer constants, X is of a wrapping type, and
>> cmp is a relational operator, the expression X +- C1 cmp C2 can be
>> simplified in the following cases:
>>
>> (a) If cmp is <= and C2 -+ C1 == +INF(1), we can transform the initial
>> comparison in the following way:
>> X +- C1 <= C2
>> -INF <= X +- C1 <= C2 (add left hand side which holds for any X, C1)
>> -INF -+ C1 <= X <= C2 -+ C1 (add -+C1 to all 3 expressions)
>> -INF -+ C1 <= X <= +INF (due to (1))
>> -INF -+ C1 <= X (eliminate the right hand side since it holds for any X)
>>
>> (b) By analogy, if cmp is >= and C2 -+ C1 == -INF(1), use the following
>> sequence of transformations:
>>
>> X +- C1 >= C2
>> +INF >= X +- C1 >= C2 (add left hand side which holds for any X, C1)
>> +INF -+ C1 >= X >= C2 -+ C1 (add -+C1 to all 3 expressions)
>> +INF -+ C1 >= X >= -INF (due to (1))
>> +INF -+ C1 >= X (eliminate the right hand side since it holds for any X)
>>
>> (c) The > and < cases are negations of (a) and (b), respectively.
>>
>> This transformation occasionally saves add / sub instructions;
>> for instance, the expression
>>
>> 3 + (uint32_t)f() < 2
>>
>> compiles to
>>
>> cmn w0, #4
>> cset w0, ls
>>
>> instead of
>>
>> add w0, w0, 3
>> cmp w0, 2
>> cset w0, ls
>>
>> on aarch64.
>>
>> Testcases that go together with this patch have been split into two
>> separate files, one containing testcases for unsigned variables and the
>> other for wrapping signed ones (and thus compiled with -fwrapv).
>> Additionally, one aarch64 test has been adjusted since the patch has
>> caused the generated code to change from
>>
>> cmn w0, #2
>> csinc   w0, w1, wzr, cc   (x < -2)
>>
>> to
>>
>> cmn w0, #3
>> csinc   w0, w1, wzr, cs   (x <= -3)
>>
>> This patch has been bootstrapped and regtested on aarch64, x86_64, and
>> i386, and additionally regtested on riscv32.
>>
>> gcc/ChangeLog:
>>
>>  PR tree-optimization/116024
>>  * match.pd: New transformation around integer comparison.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.dg/tree-ssa/pr116024-2.c: New test.
>>  * gcc.dg/tree-ssa/pr116024-2-fwrapv.c: Ditto.
>>  * gcc.target/aarch64/gtu_to_ltu_cmp_1.c: Adjust.
>>
>> Signed-off-by: Artemiy Volkov 
>> ---
>>   gcc/match.pd  | 44 ++-
>>   .../gcc.dg/tree-ssa/pr116024-2-fwrapv.c   | 38 
>>   gcc/testsuite/gcc.dg/tree-ssa/pr116024-2.c| 38 
>>   .../gcc.target/aarch64/gtu_to_ltu_cmp_1.c |  2 +-
>>   4 files changed, 120 insertions(+), 2 deletions(-)
>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr116024-2-fwrapv.c
>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr116024-2.c
>>
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index bf3b4a2e3fe..3275a69252f 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -8896,6 +8896,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>  (cmp @0 { TREE_OVERFLOW (res)
>>   ? drop_tree_overflow (res) : res; }
>>   (for cmp (lt le gt ge)
>> + rcmp (gt ge lt le)
>>(for op (plus minus)
>> rop (minus plus)
>> (simplify
>> @@ -8923,7 +8924,48 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>"X cmp C2 -+ C1"),
>>   WARN_STRICT_OVERFLOW_COMPARISON);
>>  }
>> -(cmp @0 { res; })
>> +(cmp @0 { res; })
>> +/* For wrapping types, simplify the following cases of X +- C1 CMP C2:
>> +
>> +   (a) If CMP is <= and C2 -+ C1 == +INF (1), simplify to X >= -INF -+ C1
>> +   by observing the following:
>> +
>> +X +- C1 <= C2
>> +  ==>  -INF <= X +- C1 <= C2 (add left hand side which holds for any X, C1)
>> +  ==>  -INF -+ C1 <= X <= C2 -+ C1 (add -+C1 to all 3 expressions)
>> +  ==>  -INF -+ C1 <= X <= +INF (due to (1))
>> +  ==>  -INF -+ C1 <= X (eliminate the right hand side since it holds for any X)
>> +
>> +(b) Similarly, if CMP is >= and C2 -+ C1 == -INF (1):
>> +
>> +X +- C1 >= C2
>> +  ==>  +INF >= X +- C1 >= C2 (add left hand side which holds for any X, C1)
>> +  ==>  +INF -+ C1 >= X >= C2 -+ C1 (add -+C1 to all 3 expressions)
>> +  ==>  +INF -+ C1 >= X >= -INF (due to (1))
>> +  ==>  +INF -+ C1 >= X (eliminate the right hand side since it holds for any X)
>> +
>> +(c) The > and < cases are negations of (a) and (b), respectively.  */
>> +   (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
> 
> This only works for signed types, right?  So add a !TYPE_UNSIGNED check.

Nope, unlike the one in patch #3, this transformation is intended to 
work for both unsigned and signed wrapping types, so in this case 
there's no need for the check.
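
To make the unsigned case concrete, cases (a) and (b) for the plus
operator can be checked exhaustively on an 8-bit unsigned type, where
-INF is 0 and +INF is 255.  This is only a sanity-check sketch under
those assumptions; check_plus_cases is a hypothetical name and none of
this is part of the patch or its testcases.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively checks case (a) and case (b) for '+' on an 8-bit unsigned
// type, where -INF is 0 and +INF is 255.  Returns true if, for every C1
// and X, the original comparison and the simplified one agree.
inline bool check_plus_cases()
{
  for (unsigned c1 = 0; c1 < 256; ++c1)
    {
      // (a): choose C2 so that C2 - C1 == +INF (mod 256).
      std::uint8_t c2a = static_cast<std::uint8_t>(c1 + 255u);
      // Simplified bound for (a): -INF - C1 (mod 256).
      std::uint8_t lo = static_cast<std::uint8_t>(0u - c1);
      // (b): choose C2 so that C2 - C1 == -INF, i.e. C2 == C1.
      std::uint8_t c2b = static_cast<std::uint8_t>(c1);
      // Simplified bound for (b): +INF - C1.
      std::uint8_t hi = static_cast<std::uint8_t>(255u - c1);

      for (unsigned x = 0; x < 256; ++x)
        {
          // X + C1 with 8-bit wraparound.
          std::uint8_t sum = static_cast<std::uint8_t>(x + c1);
          if ((sum <= c2a) != (x >= lo))  // (a): X + C1 <= C2  vs  X >= -INF - C1
            return false;
          if ((sum >= c2b) != (x <= hi))  // (b): X + C1 >= C2  vs  X <= +INF - C1
            return false;
        }
    }
  return true;
}
```

The minus variants and the wrapping signed types would need their own
loops; they are left out to keep the sketch short.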

> 
>> + (with
>> +   {
>> +wide_int max = wi::max_value (TREE_TYPE (@0));
>> +wide_int min = wi::min_value (TREE_TYPE (@0));
>> +
>> +wide_

[PATCH v7] Provide new GCC builtin __builtin_counted_by_ref [PR116016]

2024-09-27 Thread Qing Zhao
Hi, this is the 7th version of the patch.

Compared to the 6th version, the major changes address several style issues
raised by Jakub on the 6th version of the patch.

The 6th version is at:
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663992.html

Bootstrapped and regression tested on both x86 and aarch64, with no issues.

Okay for the trunk?

thanks.

Qing.


With the addition of the 'counted_by' attribute and its wide roll-out
within the Linux kernel, a use case has been found that would be very
nice to have for object allocators: being able to set the counted_by
counter variable without knowing its name.

For example, given:

  struct foo {
...
int counter;
...
struct bar array[] __attribute__((counted_by (counter)));
  } *p;

The existing Linux object allocators are roughly:

  #define MAX(A, B) (((A) > (B)) ? (A) : (B))
  #define alloc(P, FAM, COUNT) ({ \
    __auto_type __p = &(P); \
    size_t __size = MAX (sizeof(*P), \
                         __builtin_offsetof (__typeof(*P), FAM) \
                         + sizeof (*(P->FAM)) * (COUNT)); \
    *__p = kmalloc(__size); \
  })

Right now, any addition of a counted_by annotation must also
include an open-coded assignment of the counter variable after
the allocation:

  p = alloc(p, array, how_many);
  p->counter = how_many;

In order to avoid the tedious and error-prone work of manually adding
the open-coded counted-by initializations everywhere in the Linux
kernel, a new GCC builtin, __builtin_counted_by_ref, would be very
useful in helping the adoption of the counted-by attribute.

 -- Built-in Function: TYPE __builtin_counted_by_ref (PTR)
 The built-in function '__builtin_counted_by_ref' checks whether the
 array object pointed to by the pointer PTR has another object
 associated with it that represents the number of elements in the
 array object through the 'counted_by' attribute (i.e.  the
 counted-by object).  If so, returns a pointer to the corresponding
 counted-by object.  If such counted-by object does not exist,
 returns a NULL pointer.

 This built-in function is only available in C for now.

 The argument PTR must be a pointer to an array.  The TYPE of the
 returned value must be a pointer type pointing to the corresponding
 type of the counted-by object or VOID pointer type in case of a
 NULL pointer being returned.

With this new builtin, the central allocator could be updated to:

  #define MAX(A, B) (((A) > (B)) ? (A) : (B))
  #define alloc(P, FAM, COUNT) ({ \
    __auto_type __p = &(P); \
    __auto_type __c = (COUNT); \
    size_t __size = MAX (sizeof (*(*__p)), \
                         __builtin_offsetof (__typeof(*(*__p)), FAM) \
                         + sizeof (*((*__p)->FAM)) * __c); \
    if ((*__p = kmalloc(__size))) { \
      __auto_type ret = __builtin_counted_by_ref((*__p)->FAM); \
      *_Generic(ret, void *: &(size_t){0}, default: ret) = __c; \
    } \
  })

And then structs can gain the counted_by attribute without needing
additional open-coded counter assignments for each struct, and
unannotated structs could still use the same allocator.

PR c/116016

gcc/c-family/ChangeLog:

* c-common.cc: Add new __builtin_counted_by_ref.
* c-common.h (enum rid): Add RID_BUILTIN_COUNTED_BY_REF.

gcc/c/ChangeLog:

* c-decl.cc (names_builtin_p): Add RID_BUILTIN_COUNTED_BY_REF.
* c-parser.cc (has_counted_by_object): New routine.
(get_counted_by_ref): New routine.
(c_parser_postfix_expression): Handle New RID_BUILTIN_COUNTED_BY_REF.
* c-tree.h: New routine handle_counted_by_for_component_ref.
* c-typeck.cc (handle_counted_by_for_component_ref): New routine.
(build_component_ref): Call the new routine.

gcc/ChangeLog:

* doc/extend.texi: Add documentation for __builtin_counted_by_ref.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-counted-by-ref-1.c: New test.
* gcc.dg/builtin-counted-by-ref.c: New test.
---
 gcc/c-family/c-common.cc  |   1 +
 gcc/c-family/c-common.h   |   1 +
 gcc/c/c-decl.cc   |   1 +
 gcc/c/c-parser.cc |  79 ++
 gcc/c/c-tree.h|   1 +
 gcc/c/c-typeck.cc |  33 +++--
 gcc/doc/extend.texi   |  55 +++
 .../gcc.dg/builtin-counted-by-ref-1.c | 135 ++
 gcc/testsuite/gcc.dg/builtin-counted-by-ref.c |  61 
 9 files changed, 358 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/builtin-counted-by-ref-1.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-counted-by-ref.c

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index ec6a5da892d..8ad9b998e7b 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -430,6 +430,7 @@ const struct c_common_resword c_common_reswords[] =
   { "__builtin_choose_expr", RID_

Re: [PATCH] c++/modules: Propagate purview/import for templates in duplicate_decls [PR116803]

2024-09-27 Thread Jason Merrill

On 9/27/24 5:14 AM, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?


OK.


-- >8 --

We need to ensure that, for a declaration in the module purview, the
resulting declaration has PURVIEW_P set and IMPORT_P cleared so that we
understand it might be something requiring exporting.  This is normally
handled for a declaration by set_instantiating_module, but when this
declaration is a redeclaration, duplicate_decls needs to propagate this
to olddecl.

This patch only changes the logic for template declarations, because in
the non-template case the whole contents of olddecl's DECL_LANG_SPECIFIC
is replaced with newdecl's (which includes these flags), so there's
nothing to do.

PR c++/116803

gcc/cp/ChangeLog:

* decl.cc (duplicate_decls): Propagate DECL_MODULE_PURVIEW_P and
DECL_MODULE_IMPORT_P for template redeclarations.

gcc/testsuite/ChangeLog:

* g++.dg/modules/merge-18_a.H: New test.
* g++.dg/modules/merge-18_b.H: New test.
* g++.dg/modules/merge-18_c.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/decl.cc| 10 ++
  gcc/testsuite/g++.dg/modules/merge-18_a.H |  8 
  gcc/testsuite/g++.dg/modules/merge-18_b.H | 13 +
  gcc/testsuite/g++.dg/modules/merge-18_c.C | 10 ++
  4 files changed, 41 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/modules/merge-18_a.H
  create mode 100644 gcc/testsuite/g++.dg/modules/merge-18_b.H
  create mode 100644 gcc/testsuite/g++.dg/modules/merge-18_c.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 5ddb7eafa50..a81a7dd2e9e 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -2528,6 +2528,16 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
}
}
  
+  /* Propagate purviewness and importingness as with
+set_instantiating_module.  */
+  if (modules_p ())
+   {
+ if (DECL_MODULE_PURVIEW_P (new_result))
+   DECL_MODULE_PURVIEW_P (old_result) = true;
+ if (!DECL_MODULE_IMPORT_P (new_result))
+   DECL_MODULE_IMPORT_P (old_result) = false;
+   }
+
/* If the new declaration is a definition, update the file and
 line information on the declaration, and also make
 the old declaration the same definition.  */
diff --git a/gcc/testsuite/g++.dg/modules/merge-18_a.H b/gcc/testsuite/g++.dg/modules/merge-18_a.H
new file mode 100644
index 000..8d86ad980ba
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/merge-18_a.H
@@ -0,0 +1,8 @@
+// PR c++/116803
+// { dg-additional-options "-fmodule-header" }
+// { dg-module-cmi {} }
+
+namespace ns {
+  template  void foo();
+  template  extern const int bar;
+}
diff --git a/gcc/testsuite/g++.dg/modules/merge-18_b.H b/gcc/testsuite/g++.dg/modules/merge-18_b.H
new file mode 100644
index 000..2a762e2ac49
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/merge-18_b.H
@@ -0,0 +1,13 @@
+// PR c++/116803
+// { dg-additional-options "-fmodule-header -fdump-lang-module" }
+// { dg-module-cmi {} }
+
+import "merge-18_a.H";
+
+namespace ns {
+  template  void foo() {}
+  template  const int bar = 123;
+}
+
+// { dg-final { scan-lang-dump {Writing definition '::ns::template foo'} module } }
+// { dg-final { scan-lang-dump {Writing definition '::ns::template bar'} module } }
diff --git a/gcc/testsuite/g++.dg/modules/merge-18_c.C b/gcc/testsuite/g++.dg/modules/merge-18_c.C
new file mode 100644
index 000..b90d85f7502
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/merge-18_c.C
@@ -0,0 +1,10 @@
+// PR c++/116803
+// { dg-module-do link }
+// { dg-additional-options "-fmodules-ts" }
+
+import "merge-18_b.H";
+
+int main() {
+  ns::foo();
+  static_assert(ns::bar == 123);
+}




Re: Fwd: [patch, fortran] Matmul and dot_product for unsigned

2024-09-27 Thread Mikael Morin

Le 27/09/2024 à 17:08, Thomas Koenig a écrit :

Hi Mikael,


Now for the remaining intrinsics (FINDLOC, MAXLOC,
MINLOC, MAXVAL, MINVAL, CSHIFT and EOSHIFT still missing).

I have one patch series touching (inline) MINLOC and MAXLOC to post in 
the coming days.  Could you please keep away from them for one more 
week or two?


Looking at the previous patches, this will touch only check.cc,
iresolve.cc and simplify.cc (plus the library files).

Will your patches touch those areas?  If not, I think there should
be no conflict.


Perfect, they don't touch them.


Re: [PATCH v2] libgcc, libstdc++: Make TU-local declarations in headers external linkage [PR115126]

2024-09-27 Thread Jonathan Wakely
On Fri, 27 Sept 2024 at 19:46, Jason Merrill  wrote:
>
> On 9/26/24 6:34 AM, Nathaniel Shead wrote:
> > On Thu, Sep 26, 2024 at 01:46:27PM +1000, Nathaniel Shead wrote:
> >> On Wed, Sep 25, 2024 at 01:30:55PM +0200, Jakub Jelinek wrote:
> >>> On Wed, Sep 25, 2024 at 12:18:07PM +0100, Jonathan Wakely wrote:
> >>   And whether similarly we couldn't use
> >> __attribute__((__visibility__ ("hidden"))) on the static block scope
> >> vars for C++ (again, if compiler supports that), so that the changes
> >> don't affect ABI of C++ libraries.
> >
> > That sounds good too.
> 
>  Can you use visibility attributes on a local static? I get a warning
>  that it's ignored.
> >>>
> >>> Indeed :(
> >>>
> >>> And #pragma GCC visibility push(hidden)/#pragma GCC visibility pop around
> >>> just the static block scope var definition does nothing.
> >>> If it is around the whole inline function though, then it seems to work.
> >>> Though, unsure if we want that around the whole header; wonder what it would
> >>> do with the weakrefs.
> >>>
> >>> Jakub
> >>>
> >>
> >> Thanks for the thoughts.  WRT visibility, it looks like the main gthr.h
> >> surrounds the whole function in a
> >>
> >>#ifndef HIDE_EXPORTS
> >>#pragma GCC visibility push(default)
> >>#endif
> >>
> >> block, though I can't quite work out what the purpose of that is here
> >> (since everything is currently internal linkage to start with).
> >>
> >> But it sounds like doing something like
> >>
> >>#ifdef __has_attribute
> >># if __has_attribute(__always_inline__)
> >>#  define __GTHREAD_ALWAYS_INLINE __attribute__((__always_inline__))
> >># endif
> >>#endif
> >>#ifndef __GTHREAD_ALWAYS_INLINE
> >># define __GTHREAD_ALWAYS_INLINE
> >>#endif
> >>
> >>#ifdef __cplusplus
> >># define __GTHREAD_INLINE inline __GTHREAD_ALWAYS_INLINE
> >>#else
> >># define __GTHREAD_INLINE static inline
> >>#endif
> >>
> >> and then marking maybe even just the new inline functions with
> >> visibility hidden should be OK?
> >>
> >> Nathaniel
> >
> > Here's a new patch that does this.  Also since v1 it adds another two
> > internal linkage declarations I'd missed earlier from libstdc++, in
> > pstl; it turns out that  doesn't include .
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu and
> > aarch64-unknown-linux-gnu, OK for trunk?
> >
> > -- >8 --
> >
> > In C++20, module streaming checks for exposures of TU-local entities.
> > In general exposing internal linkage functions in a header is liable to
> > cause ODR violations in C++, and this is now detected in a module
> > context.
> >
> > This patch goes through and removes 'static' from many declarations
> > exposed through libstdc++ to prevent code like the following from
> > failing:
> >
> >export module M;
> >extern "C++" {
> >  #include 
> >}
> >
> > Since gthreads is used from C as well, we need to choose whether to use
> > 'inline' or 'static inline' depending on whether we're compiling for C
> > or C++ (since the semantics of 'inline' are different between the
> > languages).  Additionally we need to remove static global variables, so
> > we migrate these to function-local statics to avoid the ODR issues.
>
> Why function-local static rather than inline variable?

We can make that conditional on __cplusplus but can we do that for
C++98? With Clang too?
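
For context, the two options being weighed can be sketched side by side.
The names shared_flag and shared_flag_var are hypothetical and are not
taken from gthr.h or the patch; the sketch only shows that the
function-local static form needs nothing newer than C++98, while the
inline variable form is guarded here on the assumption that it requires
C++17.

```cpp
#include <cassert>

// Hypothetical example, not taken from gthr.h: a flag that might
// otherwise be a 'static' namespace-scope variable is wrapped in an
// inline function.  The function has external linkage, so every
// translation unit that calls it refers to the same object.
inline int& shared_flag()
{
  static int flag = 0;
  return flag;
}

#if __cplusplus >= 201703L
// The alternative raised in the thread: an inline variable, guarded here
// on the assumption that only C++17 and later accept it.
inline int shared_flag_var = 0;
#endif
```

Every call to shared_flag returns a reference to the same object, which
is the property the migration away from static globals relies on.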


>
> > +++ b/libstdc++-v3/include/pstl/algorithm_impl.h
> > @@ -2890,7 +2890,7 @@ __pattern_includes(__parallel_tag<_IsVector> __tag, _ExecutionPolicy&& __exec, _
> >   });
> >   }
> >
> > -constexpr auto __set_algo_cut_off = 1000;
> > +inline constexpr auto __set_algo_cut_off = 1000;
> >
> > +++ b/libstdc++-v3/include/pstl/unseq_backend_simd.h
> > @@ -22,7 +22,7 @@ namespace __unseq_backend
> >   {
> >
> >   // Expect vector width up to 64 (or 512 bit)
> > -const std::size_t __lane_size = 64;
> > +inline const std::size_t __lane_size = 64;
>
> These changes should not be necessary; the uses of these variables are
> not exposures under https://eel.is/c++draft/basic#link-14.4
>
> Jason
>



Re: [PATCH v2 3/6] c++/modules: Support anonymous namespaces in header units

2024-09-27 Thread Jason Merrill

On 9/27/24 1:58 AM, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu and aarch64-unknown-linux-gnu,
OK for trunk?

-- >8 --

A header unit may contain anonymous namespaces, and those declarations
are exported (as with any declaration in a header unit).  This patch
ensures that such declarations are correctly handled.

The change to 'make_namespace_finish' is required so that if an
anonymous namespace is first seen by an import it is correctly handled
within 'add_imported_namespace'.  I don't see any particular reason why
anonymous namespaces had to be handled separately outside that function,
since these are the only two callers.


Please add a rationale comment to that hunk; OK with that change.


gcc/cp/ChangeLog:

* module.cc (depset::hash::add_binding_entity): Also walk
anonymous namespaces.
(module_state::write_namespaces): Adjust assertion.
* name-lookup.cc (push_namespace): Move anon using-directive
handling to...
(make_namespace_finish): ...here.

gcc/testsuite/ChangeLog:

* g++.dg/modules/internal-8_a.H: New test.
* g++.dg/modules/internal-8_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/module.cc|  7 +++--
  gcc/cp/name-lookup.cc   |  8 +++---
  gcc/testsuite/g++.dg/modules/internal-8_a.H | 28 
  gcc/testsuite/g++.dg/modules/internal-8_b.C | 29 +
  4 files changed, 63 insertions(+), 9 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/modules/internal-8_a.H
  create mode 100644 gcc/testsuite/g++.dg/modules/internal-8_b.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 5d7f4941b2a..df407c1fd55 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -13723,15 +13723,15 @@ depset::hash::add_binding_entity (tree decl, WMB_Flags flags, void *data_)
return (flags & WMB_Using
  ? flags & WMB_Export : DECL_MODULE_EXPORT_P (decl));
  }
-  else if (DECL_NAME (decl) && !data->met_namespace)
+  else if (!data->met_namespace)
  {
/* Namespace, walk exactly once.  */
-  gcc_checking_assert (TREE_PUBLIC (decl));
data->met_namespace = true;
if (data->hash->add_namespace_entities (decl, data->partitions))
{
  /* It contains an exported thing, so it is exported.  */
  gcc_checking_assert (DECL_MODULE_PURVIEW_P (decl));
+ gcc_checking_assert (TREE_PUBLIC (decl) || header_module_p ());
  DECL_MODULE_EXPORT_P (decl) = true;
}
  
@@ -16126,8 +16126,7 @@ module_state::write_namespaces (elf_out *to, vec spaces,
tree ns = b->get_entity ();
  
gcc_checking_assert (TREE_CODE (ns) == NAMESPACE_DECL);

-  /* P1815 may have something to say about this.  */
-  gcc_checking_assert (TREE_PUBLIC (ns));
+  gcc_checking_assert (TREE_PUBLIC (ns) || header_module_p ());
  
unsigned flags = 0;

if (TREE_PUBLIC (ns))
diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
index eb365b259d9..fe2eb2e0917 100644
--- a/gcc/cp/name-lookup.cc
+++ b/gcc/cp/name-lookup.cc
@@ -9098,6 +9098,9 @@ make_namespace_finish (tree ns, tree *slot, bool 
from_import = false)
  
if (DECL_NAMESPACE_INLINE_P (ns) || !DECL_NAME (ns))

  emit_debug_info_using_namespace (ctx, ns, true);
+
+  if (!DECL_NAMESPACE_INLINE_P (ns) && !DECL_NAME (ns))
+add_using_namespace (NAMESPACE_LEVEL (ctx)->using_directives, ns);
  }
  
  /* Push into the scope of the NAME namespace.  If NAME is NULL_TREE,

@@ -9234,11 +9237,6 @@ push_namespace (tree name, bool make_inline)
  gcc_checking_assert (slot);
}
  make_namespace_finish (ns, slot);
-
- /* Add the anon using-directive here, we don't do it in
-make_namespace_finish.  */
- if (!DECL_NAMESPACE_INLINE_P (ns) && !name)
-   add_using_namespace (current_binding_level->using_directives, ns);
}
  }
  
diff --git a/gcc/testsuite/g++.dg/modules/internal-8_a.H b/gcc/testsuite/g++.dg/modules/internal-8_a.H

new file mode 100644
index 000..57fe60bb3c0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/internal-8_a.H
@@ -0,0 +1,28 @@
+// { dg-additional-options "-fmodule-header" }
+// { dg-module-cmi {} }
+
+static int x = 123;
+static void f() {}
+template  static void t() {}
+
+namespace {
+  int y = 456;
+  void g() {};
+  template  void u() {}
+
+  namespace ns { int in_ns = 456; }
+
+  struct A {};
+  template  struct B {};
+
+  enum E { X };
+  enum class F { Y };
+
+  template  using U = int;
+
+#if __cplusplus >= 202002L
+  template  concept C = true;
+#endif
+}
+
+namespace ns2 = ns;
diff --git a/gcc/testsuite/g++.dg/modules/internal-8_b.C 
b/gcc/testsuite/g++.dg/modules/internal-8_b.C
new file mode 100644
index 000..a2d74a87473
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/internal-8_b.C
@@ -0,0 +1,29 @@
+// { dg-additional-options "-fmodules-ts" }
+
+im

Re: [PATCH v2] libgcc, libstdc++: Make TU-local declarations in headers external linkage [PR115126]

2024-09-27 Thread Jason Merrill

On 9/27/24 3:38 PM, Jonathan Wakely wrote:

On Fri, 27 Sept 2024 at 19:46, Jason Merrill  wrote:


On 9/26/24 6:34 AM, Nathaniel Shead wrote:

On Thu, Sep 26, 2024 at 01:46:27PM +1000, Nathaniel Shead wrote:

On Wed, Sep 25, 2024 at 01:30:55PM +0200, Jakub Jelinek wrote:

On Wed, Sep 25, 2024 at 12:18:07PM +0100, Jonathan Wakely wrote:

   And whether similarly we couldn't use
__attribute__((__visibility__ ("hidden"))) on the static block scope
vars for C++ (again, if compiler supports that), so that the changes
don't affect ABI of C++ libraries.


That sounds good too.


Can you use visibility attributes on a local static? I get a warning
that it's ignored.


Indeed :(

And #pragma GCC visibility push(hidden)/#pragma GCC visibility pop around
just the static block scope var definition does nothing.
If it is around the whole inline function though, then it seems to work.
Though, unsure if we want that around the whole header; wonder what it would
do with the weakrefs.

 Jakub
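For readers following along, the working variant described above (pragma around the whole inline function rather than just the block-scope variable) would look roughly like this. This is a minimal sketch: `example_once_flag` is a hypothetical name, not anything from gthr.h, and it assumes a compiler that accepts the GCC visibility pragmas.

```cpp
// Sketch only: hypothetical function, GCC-style visibility pragma assumed.
#pragma GCC visibility push(hidden)
inline int& example_once_flag()
{
  // Per the observation above, the pragma has to enclose the whole inline
  // function for the block-scope static to be affected; wrapping only the
  // variable definition reportedly does nothing.
  static int flag = 0;
  return flag;
}
#pragma GCC visibility pop
```

Whether the resulting symbols really end up hidden is best checked with `nm` or `readelf` on the object file; the sketch only shows where the pragmas go.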



Thanks for the thoughts.  WRT visibility, it looks like the main gthr.h
surrounds the whole function in a

#ifndef HIDE_EXPORTS
#pragma GCC visibility push(default)
#endif

block, though I can't quite work out what the purpose of that is here
(since everything is currently internal linkage to start with).

But it sounds like doing something like

#ifdef __has_attribute
# if __has_attribute(__always_inline__)
#  define __GTHREAD_ALWAYS_INLINE __attribute__((__always_inline__))
# endif
#endif
#ifndef __GTHREAD_ALWAYS_INLINE
# define __GTHREAD_ALWAYS_INLINE
#endif

#ifdef __cplusplus
# define __GTHREAD_INLINE inline __GTHREAD_ALWAYS_INLINE
#else
# define __GTHREAD_INLINE static inline
#endif

and then marking maybe even just the new inline functions with
visibility hidden should be OK?

Nathaniel


Here's a new patch that does this.  Also since v1 it adds another two
internal linkage declarations I'd missed earlier from libstdc++, in
pstl; it turns out that  doesn't include .

Bootstrapped and regtested on x86_64-pc-linux-gnu and
aarch64-unknown-linux-gnu, OK for trunk?

-- >8 --

In C++20, modules streaming checks for exposures of TU-local entities.
In general exposing internal linkage functions in a header is liable to
cause ODR violations in C++, and this is now detected in a module
context.

This patch goes through and removes 'static' from many declarations
exposed through libstdc++ to prevent code like the following from
failing:

export module M;
extern "C++" {
  #include 
}

Since gthreads is used from C as well, we need to choose whether to use
'inline' or 'static inline' depending on whether we're compiling for C
or C++ (since the semantics of 'inline' are different between the
languages).  Additionally we need to remove static global variables, so
we migrate these to function-local statics to avoid the ODR issues.


Why function-local static rather than inline variable?


We can make that conditional on __cplusplus but can we do that for
C++98? With Clang too?


Yes for both compilers, disabling -Wc++17-extensions.
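To make the two alternatives in this exchange concrete, here is a small sketch of each. The names are hypothetical and are not the actual gthr.h declarations; the comment about pre-C++17 behaviour just restates the reply above.

```cpp
// Sketch only: hypothetical names, not the real gthr.h declarations.

// Option 1: function-local static inside an inline function.
// Valid in every C++ standard; all TUs share the one object.
inline int& example_flag_local()
{
  static int flag = 0;
  return flag;
}

// Option 2: C++17 inline variable.  Per the reply above, both GCC and
// Clang also accept this in older modes once -Wc++17-extensions is
// disabled.
inline int example_flag_inline = 0;

// Neither form has internal linkage, so naming them from an inline
// function does not involve a TU-local entity.
inline int bump_both()
{
  return ++example_flag_local() + ++example_flag_inline;
}
```

The inline variable avoids the accessor function and the guard-variable check on each access, which is presumably the motivation for the question.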


+++ b/libstdc++-v3/include/pstl/algorithm_impl.h
@@ -2890,7 +2890,7 @@ __pattern_includes(__parallel_tag<_IsVector> __tag, 
_ExecutionPolicy&& __exec, _
   });
   }

-constexpr auto __set_algo_cut_off = 1000;
+inline constexpr auto __set_algo_cut_off = 1000;

+++ b/libstdc++-v3/include/pstl/unseq_backend_simd.h
@@ -22,7 +22,7 @@ namespace __unseq_backend
   {

   // Expect vector width up to 64 (or 512 bit)
-const std::size_t __lane_size = 64;
+inline const std::size_t __lane_size = 64;


These changes should not be necessary; the uses of these variables are
not exposures under https://eel.is/c++draft/basic#link-14.4

Jason







Re: [PATCH 1/2] c++: Don't strip USING_DECLs when updating local bindings [PR116748]

2024-09-27 Thread Jason Merrill

On 9/19/24 7:53 PM, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?


OK.


Alternatively I could solve this the other way around (have a new
'old_target = strip_using_decl (old)' and replace all usages of 'old'
except the usages in this patch); this is more churn but probably better
matches how other functions are structured.

-- >8 --

Currently update_binding strips USING_DECLs too eagerly, leading to ICEs
in pop_local_decl as it can't find the decl it's popping in the binding
list.  Let's rather try to keep the original USING_DECL around.

This also means that using59.C can point to the location of the
using-decl rather than the underlying object directly; this is in the
direction required to fix PR c++/106851 (though more work is needed to
emit properly helpful diagnostics here).

PR c++/116748

gcc/cp/ChangeLog:

* name-lookup.cc (update_binding): Maintain USING_DECLs in the
binding slots.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/using59.C: Update location.
* g++.dg/lookup/using69.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/name-lookup.cc | 12 +++-
  gcc/testsuite/g++.dg/lookup/using59.C |  4 ++--
  gcc/testsuite/g++.dg/lookup/using69.C | 10 ++
  3 files changed, 19 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/lookup/using69.C

diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
index c7a693e02d5..94b031e6be2 100644
--- a/gcc/cp/name-lookup.cc
+++ b/gcc/cp/name-lookup.cc
@@ -3005,6 +3005,8 @@ update_binding (cp_binding_level *level, cxx_binding 
*binding, tree *slot,
  
if (old == error_mark_node)

  old = NULL_TREE;
+
+  tree old_bval = old;
old = strip_using_decl (old);
  
if (DECL_IMPLICIT_TYPEDEF_P (decl))

@@ -3021,7 +3023,7 @@ update_binding (cp_binding_level *level, cxx_binding 
*binding, tree *slot,
  gcc_checking_assert (!to_type);
  hide_type = hiding;
  to_type = decl;
- to_val = old;
+ to_val = old_bval;
}
else
hide_value = hiding;
@@ -3034,7 +3036,7 @@ update_binding (cp_binding_level *level, cxx_binding 
*binding, tree *slot,
/* OLD is an implicit typedef.  Move it to to_type.  */
gcc_checking_assert (!to_type);
  
-  to_type = old;

+  to_type = old_bval;
hide_type = hide_value;
old = NULL_TREE;
hide_value = false;
@@ -3093,7 +3095,7 @@ update_binding (cp_binding_level *level, cxx_binding 
*binding, tree *slot,
{
  if (same_type_p (TREE_TYPE (old), TREE_TYPE (decl)))
/* Two type decls to the same type.  Do nothing.  */
-   return old;
+   return old_bval;
  else
goto conflict;
}
@@ -3106,7 +3108,7 @@ update_binding (cp_binding_level *level, cxx_binding 
*binding, tree *slot,
  
  	  /* The new one must be an alias at this point.  */

  gcc_assert (DECL_NAMESPACE_ALIAS (decl));
- return old;
+ return old_bval;
}
else if (TREE_CODE (old) == VAR_DECL)
{
@@ -3121,7 +3123,7 @@ update_binding (cp_binding_level *level, cxx_binding 
*binding, tree *slot,
else
{
conflict:
- diagnose_name_conflict (decl, old);
+ diagnose_name_conflict (decl, old_bval);
  to_val = NULL_TREE;
}
  }
diff --git a/gcc/testsuite/g++.dg/lookup/using59.C 
b/gcc/testsuite/g++.dg/lookup/using59.C
index 3c3a73c28d5..b7ec325d234 100644
--- a/gcc/testsuite/g++.dg/lookup/using59.C
+++ b/gcc/testsuite/g++.dg/lookup/using59.C
@@ -1,10 +1,10 @@
  
  namespace Y

  {
-  extern int I; //  { dg-message "previous declaration" }
+  extern int I;
  }
  
-using Y::I;

+using Y::I; // { dg-message "previous declaration" }
  extern int I; // { dg-error "conflicts with a previous" }
  
  extern int J;

diff --git a/gcc/testsuite/g++.dg/lookup/using69.C 
b/gcc/testsuite/g++.dg/lookup/using69.C
new file mode 100644
index 000..7d52b73b9ce
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/using69.C
@@ -0,0 +1,10 @@
+// PR c++/116748
+
+namespace ns {
+  struct empty;
+}
+
+void foo() {
+  using ns::empty;
+  int empty;
+}




[PATCH] Fix sorting in Contributors.html

2024-09-27 Thread Richard Biener
The following moves my entry to where it belongs alphabetically
(it wasn't moved when s/Guenther/Biener/).

Pushed as obvious.

* doc/contrib.texi (Richard Biener): Move entry.
---
 gcc/doc/contrib.texi | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/doc/contrib.texi b/gcc/doc/contrib.texi
index 7714d52823b..0e2823f278a 100644
--- a/gcc/doc/contrib.texi
+++ b/gcc/doc/contrib.texi
@@ -63,6 +63,10 @@ improved alias analysis, plus migrating GCC to Bugzilla.
 @item
 Geoff Berry for his Java object serialization work and various patches.
 
+@item
+Richard Biener for his ongoing middle-end contributions and bug fixes
+and for release management.
+
 @item
 David Binderman for testing GCC trunk against Fedora Rawhide
 and csmith.
@@ -364,10 +368,6 @@ Stu Grossman for gdb hacking, allowing GCJ developers to 
debug Java code.
 @item
 Michael K. Gschwind contributed the port to the PDP-11.
 
-@item
-Richard Biener for his ongoing middle-end contributions and bug fixes
-and for release management.
-
 @item
 Ron Guilmette implemented the @command{protoize} and @command{unprotoize}
 tools, the support for DWARF 1 symbolic debugging information, and much of
-- 
2.43.0


[Fortran, Patch, PR81265, v1] Fix passing coarrays always w/ descriptor

2024-09-27 Thread Andre Vehreschild
Hi all,

attached patch fixes a runtime issue when a coarray was passed as
parameter to a procedure that was itself a parameter. The issue here
was that the coarray was passed as array pointer (i.e. w/o descriptor)
to the function, but the function expected it to be an array
w/ descriptor.

Regtests ok on x86_64-pc-linux-gnu / Fedora 39. Ok for mainline?

Regards,
Andre
--
Andre Vehreschild * Email: vehre ad gcc dot gnu dot org
From 7438255c4988958a03401a24b495637142853e7d Mon Sep 17 00:00:00 2001
From: Andre Vehreschild 
Date: Fri, 27 Sep 2024 14:18:42 +0200
Subject: [PATCH] [Fortran] Ensure coarrays in calls use a descriptor [PR81265]

gcc/fortran/ChangeLog:

	PR fortran/81265

	* trans-expr.cc (gfc_conv_procedure_call): Ensure coarrays use a
	descriptor when passed.

gcc/testsuite/ChangeLog:

	* gfortran.dg/coarray/pr81265.f90: New test.
---
 gcc/fortran/trans-expr.cc |  8 +-
 gcc/testsuite/gfortran.dg/coarray/pr81265.f90 | 74 +++
 2 files changed, 81 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/coarray/pr81265.f90

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 18ef5e246ce..dbd6547f0fe 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6450,11 +6450,15 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 {
   bool finalized = false;
   tree derived_array = NULL_TREE;
+  symbol_attribute *attr;

   e = arg->expr;
   fsym = formal ? formal->sym : NULL;
   parm_kind = MISSING;

+  attr = fsym ? &(fsym->ts.type == BT_CLASS ? CLASS_DATA (fsym)->attr
+		: fsym->attr)
+		  : nullptr;
   /* If the procedure requires an explicit interface, the actual
 	 argument is passed according to the corresponding formal
 	 argument.  If the corresponding formal argument is a POINTER,
@@ -6470,7 +6474,9 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
   if (comp)
 	nodesc_arg = nodesc_arg || !comp->attr.always_explicit;
   else
-	nodesc_arg = nodesc_arg || !sym->attr.always_explicit;
+	nodesc_arg
+	  = nodesc_arg
+	|| !(sym->attr.always_explicit || (attr && attr->codimension));

   /* Class array expressions are sometimes coming completely unadorned
 	 with either arrayspec or _data component.  Correct that here.
diff --git a/gcc/testsuite/gfortran.dg/coarray/pr81265.f90 b/gcc/testsuite/gfortran.dg/coarray/pr81265.f90
new file mode 100644
index 000..378733bfa7c
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray/pr81265.f90
@@ -0,0 +1,74 @@
+!{ dg-do run }
+
+! Contributed by Anton Shterenlikht  
+! Check PR81265 is fixed.
+
+module m
+implicit none
+private
+public :: s
+
+abstract interface
+  subroutine halo_exchange( array )
+integer, allocatable, intent( inout ) :: array(:,:,:,:)[:,:,:]
+  end subroutine halo_exchange
+end interface
+
+interface
+  module subroutine s( coarray, hx )
+integer, allocatable, intent( inout ) :: coarray(:,:,:,:)[:,:,:]
+procedure( halo_exchange ) :: hx
+  end subroutine s
+end interface
+
+end module m
+submodule( m ) sm
+contains
+module procedure s
+
+if ( .not. allocated(coarray) ) then
+  write (*,*) "ERROR: s: coarray is not allocated"
+  error stop
+end if
+
+sync all
+
+call hx( coarray )
+
+end procedure s
+
+end submodule sm
+module m2
+  implicit none
+  private
+  public :: s2
+  contains
+subroutine s2( coarray )
+  integer, allocatable, intent( inout ) :: coarray(:,:,:,:)[:,:,:]
+  if ( .not. allocated( coarray ) ) then
+write (*,'(a)') "ERROR: s2: coarray is not allocated"
+error stop
+  end if
+end subroutine s2
+end module m2
+program p
+use m
+use m2
+implicit none
+integer, allocatable :: space(:,:,:,:)[:,:,:]
+integer :: errstat
+
+allocate( space(10,10,10,2) [2,2,*], source=0, stat=errstat )
+if ( errstat .ne. 0 ) then
+  write (*,*) "ERROR: p: allocate( space ) )"
+  error stop
+end if
+
+if ( .not. allocated (space) ) then
+  write (*,*) "ERROR: p: space is not allocated"
+  error stop
+end if
+
+call s( space, s2 )
+
+end program p
--
2.46.1



Re: [PATCH] aarch64: fix build failure on aarch64-none-elf

2024-09-27 Thread Richard Earnshaw (lists)
On 26/09/2024 18:14, Matthieu Longo wrote:
> A previous patch ([1]) introduced a build regression on aarch64-none-elf
> target. The changes were primarily tested on aarch64-unknown-linux-gnu,
> so the issue was missed during development.
> The includes are slightly different between the two targets, and due to some
> include rules ([2]), "aarch64-unwind-def.h" was not found.
> 
> [1]: 
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=bdf41d627c13bc5f0dc676991f4513daa9d9ae36
> 
> [2]: https://gcc.gnu.org/onlinedocs/cpp/Include-Syntax.html
>> include "file"
>> ...  It searches for a file named file first in the directory
>> containing the current file, ...
> 
> libgcc/ChangeLog:
> 
> * config/aarch64/aarch64-unwind.h: fix header path.
> ---
>  libgcc/config/aarch64/aarch64-unwind.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libgcc/config/aarch64/aarch64-unwind.h 
> b/libgcc/config/aarch64/aarch64-unwind.h
> index 2b774eb263c..4d36f0b26f7 100644
> --- a/libgcc/config/aarch64/aarch64-unwind.h
> +++ b/libgcc/config/aarch64/aarch64-unwind.h
> @@ -25,7 +25,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>  #if !defined (AARCH64_UNWIND_H) && !defined (__ILP32__)
>  #define AARCH64_UNWIND_H
>  
> -#include "aarch64-unwind-def.h"
> +#include "config/aarch64/aarch64-unwind-def.h"
>  
>  #include "ansidecl.h"
>  #include 

OK.

R.


[committed] libgomp.texi: Remove now duplicate TR13 item (was: [committed] libgomp.texi: fix formatting; add post-TR13 OpenMP impl. status items)

2024-09-27 Thread Tobias Burnus
Continuing reading 
https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Technical-Report-13.html 
showed that I missed one old item, which could be now removed:


With the new 'storage' map type it was also no longer fully applicable – 
and the newly added text already covered it.


Committed as Rev. r15-3919-gcfdc0a384aff5e as follow up to 
r15-3917-g6b7eaec20b046e.


* * *

While useful, those tables are unfortunately not very readable. (And I 
wonder how many more non-Appendix B items should be added; it probably 
requires a full go through the changes and will still likely miss 
several important but more hidden changes.)


Tobias
commit cfdc0a384aff5e06f80d3f55f4615abf350b193b
Author: Tobias Burnus 
Date:   Fri Sep 27 12:06:17 2024 +0200

libgomp.texi: Remove now duplicate TR13 item

Remove an item under "Other new TR 13 features" that since the last commit
(r15-3917-g6b7eaec20b046e) to this file is covered by the added
  "New @code{storage} map-type modifier; context-dependent @code{alloc} and
   @code{release} are aliases"
  "Update of the map-type decay for mapping and @code{declare_mapper}"

libgomp/
* libgomp.texi (TR13 status): Update semi-duplicated, semi-obsoleted
item; remove left-over half-sentence.

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index b561cb5f3f4..c6464ece32e 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -511,7 +511,7 @@ Technical Report (TR) 13 is the third preview for OpenMP 6.0.
   @tab N @tab
 @item @code{ref} modifier to the @code{map} clause @tab N @tab
 @item New @code{storage} map-type modifier; context-dependent @code{alloc} and
-  @code{release} are aliases. Update to map decay @tab N @tab
+  @code{release} are aliases @tab N @tab
 @item Update of the map-type decay for mapping and @code{declare_mapper}
   @tab N @tab
 @item Change of the @emph{map-type} property from @emph{ultimate} to
@@ -633,8 +633,6 @@ Technical Report (TR) 13 is the third preview for OpenMP 6.0.
 @item Multi-word directive names are now permitted with underscore @tab N @tab
 @item In Fortran (fixed + free), space between directive names is mandatory
   @tab N @tab
-@item @code{map(release: ...)} on @code{target} and @code{target_data} (map-type
-  decay changes) @tab N @tab post-TR13 item
 @end multitable
 
 


[PATCH] [MAINTAINERS]: Add myself as MVE Reviewer for the AArch32 (arm) port

2024-09-27 Thread Christophe Lyon
Pushed.
-- >8 --

ChangeLog:
* MAINTAINERS: Add myself as MVE Reviewer for the AArch32 (arm)
port.
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 47b5915e9f8..ded5b3d4f64 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -272,6 +272,7 @@ check in changes outside of the parts of the compiler they 
maintain.
 
 Reviewers
 
+arm port (MVE)  Christophe Lyon 
 callgraph   Martin Jambor   
 C front end Marek Polacek   
 CTF, BTFIndu Bhagat 
-- 
2.34.1



Re: [PATCH 06/10] c++/modules: Detect exposures of TU-local entities

2024-09-27 Thread Jason Merrill

On 9/23/24 7:45 PM, Nathaniel Shead wrote:

I feel like there should be a way to make use of LAMBDA_TYPE_EXTRA_SCOPE to
avoid the need for the new TYPE_DEFINED_IN_INITIALIZER_P flag, perhaps once
something like my patch here[1] is accepted (but with further embellishments
for concepts, probably), but I wasn't able to work it out. Since currently as
far as I'm aware only lambdas can satisfy being a type with no name defined in
an 'initializer' this does seem a little overkill but I've applied it to all
class types just in case.



[1]: https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662393.html


It's definitely overkill, since a defining-type-specifier can't appear 
in an expression.  I'm reluctant to allocate a bit on all types to such 
an obscure use.


I agree that LAMBDA_TYPE_EXTRA_SCOPE should do the trick; a lambda is 
TU-local if it has no mangling scope.  I'd prefer to pursue fixing holes 
in LAMBDA_TYPE_EXTRA_SCOPE as needed to make it work, starting with the 
patch above.


Maybe be permissive for lambdas for this patch, and fix them up later?


-  /* Determine Strongy Connected Components.  */
+  /* Determine Strongy Connected Components.  This will also strip any


While we're touching this, "Strongly".

Jason



Re: [PATCH 07/10] c++/modules: Implement ignored TU-local exposures

2024-09-27 Thread Jason Merrill

On 9/23/24 7:46 PM, Nathaniel Shead wrote:

Currently I just stream DECL_NAME in TU_LOCAL_ENTITYs for use in diagnostics,
but this feels perhaps insufficient.  Are there any better approaches here?
Otherwise I don't think it matters too much, as which entity it is will also
be hopefully clear from the 'declared here' notes.

I've put the new warning in Wextra, but maybe it would be better to just
leave it out of any of the normal warning groups since there's currently
no good way to work around the warnings it produces?

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

[basic.link] p14 lists a number of circumstances where a declaration
naming a TU-local entity is not an exposure, notably the bodies of
non-inline templates and friend declarations in classes.  This patch
ensures that these references do not error when exporting the module.

We do need to still error on instantiation from a different module,
however, in case this refers to a TU-local entity.  As such this patch
adds a new tree TU_LOCAL_ENTITY which is used purely as a placeholder to
poison any attempted template instantiations that refer to it.

This is also streamed for friend decls so that merging (based on the
index of an entity into the friend decl list) doesn't break and to
prevent complicating the logic; I imagine this shouldn't ever come up
though.

We also add a new warning, '-Wignored-exposures', to handle the case
where someone accidentally refers to a TU-local value from within a
non-inline function template.  This will compile without errors as-is,
but any attempt to instantiate the decl will fail; this warning can be
used to ensure that this doesn't happen.  Unfortunately the warning has
quite some false positives; for instance, a user could deliberately only
call explicit instantiations of the decl, or use 'if constexpr' to avoid
instantiating the TU-local entity from other TUs, neither of which are
currently detected.


I disagree with the term "ignored exposure", in the warning name and in 
the rest of the patch; these references are not exposures.  It's the 
naming of a TU-local entity that is ignored in basic.link/14.


I like the warning, just would change the name. 
"-Wtemplate-names-tu-local"?  "-Wtu-local-in-template"?


I'm not too concerned about false positives, as long as it can be 
effectively suppressed with #pragma GCC diagnostic ignored.  If you only 
use explicit instantiations you don't need to have the template body in 
the interface anyway.



+ /* Ideally we would only warn in cases where there are no explicit
+instantiations of the template, but we don't currently track this
+in an easy-to-find way.  */


You can't walk DECL_TEMPLATE_INSTANTIATIONS checking 
DECL_EXPLICIT_INSTANTIATION?


What happens with a non-templated, non-inline function body that names a 
TU-local entity?  I don't see any specific handling or testcase.  Should 
we just omit its definition, like you do in the previous patch for 
TU-local variable initializers?


Jason



Re: [PATCH] c++: compile time evaluation of prvalues [PR116416]

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 27, 2024 at 08:16:43AM +0200, Richard Biener wrote:
> > __attribute__((noinline))
> > struct ref_proxy f ()
> > {
> >struct ref_proxy ptr;
> >struct ref_proxy D.10036;
> >struct ref_proxy type;
> >struct ref_proxy type;
> >struct qual_option D.10031;
> >struct ref_proxy D.10030;
> >struct qual_option inner;
> >struct variant t;
> >struct variant D.10026;
> >struct variant D.10024;
> >struct inplace_ref D.10023;
> >struct inplace_ref ptr;
> >struct ref_proxy D.9898;
> > 
> > [local count: 1073741824]:
> >MEM  [(struct variant *)&D.10024] = {};
> 
> Without actually checking it might be that SRA chokes on the above.
> The IL is basically a huge chain of aggregate copies interspersed
> with clobbers and occasional scalar inits and I fear that we really
> only have SRA dealing with this.

True.

> Is there any reason to use the char[40] init instead of an aggregate
> {} init of type variant?

It is dse1 which introduces that:
-  D.10137 = {};
+  MEM  [(struct variant *)&D.10137] = {};
in particular maybe_trim_constructor_store.

So, if SRA chokes on it, it better be fixed to deal with that,
DSE can create that any time.
Though, not sure how to differentiate that from the actual C++ zero
initialization where it is supposed to clear also padding bits if any.
I think a CONSTRUCTOR flag for that would be best, though e.g. in GIMPLE
representing such clears including padding bits with
MEM  [(struct whatever *)&D.whatever] = {};
might be an option too.  But then how to represent the DSE constructor
trimming such that it is clear that it still doesn't initialize the padding
bits?
Anyway, even if padding bits are zero initialized, if SRA could see that
nothing really inspects those padding bits, it would be nice to still optimize
it.

That said, it is
a union of a struct with 5 pointers (i.e. 40 bytes) and an empty struct
(1 byte, with padding) followed by size_t which_ field (the = 2 store).

And, I believe when not constexpr evaluating this, there actually is no
clearing before the = 2; store,
void 
eggs::variants::detail::_storage, true, true>::_storage<2, option_2> (struct _storage * 
const thi
s, struct index which, struct option_2 & args#0)
{
  struct index D.10676;

  *this = {CLOBBER(bob)};
  {
_1 = &this->D.9542;

eggs::variants::detail::_union, true>::_union<2, option_2> (_1, D.10676, args#0);
this->_which = 2;
  }
}
and the call in there is just 3 nested inline calls which do some clobbers
too, take address of something and call another inline and in the end
it is just a clobber and nothing else.

So, another thing to look at is whether the CONSTRUCTOR is
CONSTRUCTOR_NO_CLEARING or not and if that is ok, and/or whether
the gimplification is correct in that case (say if it would be
struct S { union U { struct T { void *a, *b, *c, *d, *e; } t; struct V {} v; } 
u; unsigned long w; };
void bar (S *);

void
foo ()
{
  S s = { .u = { .v = {} }, .w = 2 };
  bar (&s);
}
why do we expand it as
  s = {};
  s.w = 2;
when just
  s.w = 2;
or maybe
  s.u.v = {};
  s.w = 2;
would be enough.  Because when the large union has just a small member
(in this case empty struct) active, clearing the whole union is really a
waste of time.

> I would suggest to open a bugreport.

Yes.

Jakub



Re: [nvptx PATCH] Implement isfinite and isnormal optabs in nvptx.md.

2024-09-27 Thread Thomas Schwinge
Hi Roger!

On 2024-07-27T19:18:35+0100, "Roger Sayle"  wrote:
> Firstly, thanks to Haochen Gui for recently adding optab support for
> isfinite and isnormal to the middle-end.

Do we, by the way, have documentation (I suppose that should be in
"GNU Compiler Collection (GCC) Internals"?) about the rationale and
subsequent optimization opportunities for having vs. not having
representations of "codes" (like, 'isfinite') in the various GCC IRs
etc., like builtins, internal functions, GIMPLE, optabs, RTL (..., and
I've probably missed some more)?

Of course, a lot of it can be inferred from the context or otherwise,
like having builtins corresponding to C library functions and then be
able to optimize according to their defined semantics, but others are not
always clear to me: like, why do we have 'copysign' RTL but not
'isnormal'?

> This patch adds define_expand
> for both these functions to the nvptx backend, which conveniently has
> special instructions to simplify their implementation.

ACK.

> As this patch
> adds UNSPEC_ISFINITE and UNSPEC_ISNORMAL, I've also taken the opportunity
> to include/repost my tweak to clean-up/eliminate UNSPEC_COPYSIGN.

I'd seen your 2023 "Add RTX codes for [...] COPYSIGN", but not yet seen a
patch to use it for nvptx -- but indeed have stumbled over nvptx
'UNSPEC_COPYSIGN' a while ago; ACK.

> Previously, for isfinite, GCC on nvptx-none with -O2 would generate:
>
> mov.f64 %r26, %ar0;
> abs.f64 %r28, %r26;
> setp.gtu.f64%r31, %r28, 0d7fef;
> selp.u32%value, 0, 1, %r31;
>
> and with this patch, we now generate:
>
> mov.f64 %r23, %ar0;
> testp.finite.f64%r24, %r23;
> selp.u32%value, 1, 0, %r24;

Nice!

> Previously, for isnormal, GCC -O2 would generate:
>
> mov.f64 %r28, %ar0;
> abs.f64 %r22, %r28;
> setp.gtu.f64%r32, %r22, 0d7fef;
> setp.ltu.f64%r35, %r22, 0d0010;
> or.pred %r43, %r35, %r32;
> selp.u32%value, 0, 1, %r43;
>
> and with this patch becomes:
>
> mov.f64 %r23, %ar0;
> setp.neu.f64%r24, %r23, 0d;
> testp.normal.f64%r25, %r23;
> and.pred%r26, %r24, %r25;
> selp.u32%value, 1, 0, %r26;
>
> Notice that although nvptx provides a testp.normal.f{32,64} instruction,
> the semantics don't quite match those required of libm [+0.0 and -0.0
> are considered normal by this instruction, but need to return false
> for __builtin_isnormal, hence the additional logic

Ugh.  ;-)

> which is still
> better than the original].

ACK.
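The extra predicate logic can be modelled in plain C++. This is only a sketch of the semantics described above: `testp_normal_model` is a hypothetical stand-in for the PTX instruction, written on the assumption (taken from the quoted text) that it reports +0.0 and -0.0 as normal.

```cpp
#include <cmath>

// Hypothetical model of PTX testp.normal as described above: true for
// normal numbers and, unlike isnormal, also for +0.0 and -0.0.
static bool testp_normal_model(double x)
{
  int c = std::fpclassify(x);
  return c == FP_NORMAL || c == FP_ZERO;
}

// Mirrors the generated sequence: setp.neu (x != 0, true when unordered)
// combined by and.pred with testp.normal.
static bool isnormal_via_testp(double x)
{
  bool nonzero = (x != 0.0);
  return nonzero && testp_normal_model(x);
}
```

For zeros the `x != 0.0` test supplies the missing rejection; for NaN it is true, but the testp model is false, so the conjunction still matches `isnormal`.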

> This patch has been tested on nvptx-none hosted by x86_64-pc-linux-gnu
> using make and make -k check, with only one new failure in the testsuite.
> The test case g++.dg/opt/pr107569.C exposes a latent bug in the middle-end
> (actually a missed optimization) as evrp fails to bound the results of
> isfinite.  This issue is independent of the back-end, as the tree-ssa
> evrp pass is run long before __builtin_finite is expanded by the backend,
> and the existence of an (any) isfinite optab is sufficient to expose it.
> Fortunately, Haochen Gui has already posted/proposed a fix at
> https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657881.html
> [which I'm sad to see is taking a while to review/get approved].

Well, now this nvptx one here took me even longer to look into, so the
'g++.dg/opt/pr107569.C' regression is resolved by now.  ;-\

> Ok for mainline?

Just minor items: generally, I do like seeing logically separate changes
as separate commits (like, the 'copysign' cleanup is not conceptually
related to the 'isfinite', 'isnormal' enhancements).  However, that's my
own ambition; I do acknowledge that others do things differently, like
mixing in small cleanups with other changes.  Also, I personally strive
to go one step further with enhancing test suite coverage (for example,
move towards using 'check-function-bodies' instead of 'scan-assembler',
and first push the current/"bad" test case as its own commit, possibly
partly XFAILed, and as part of the code-changes commit then "fix up" the
test case, so that the latter changes are visible in the commit history).
But again, that's my own ambition; I do acknowledge that others do things
differently.

All that said, the patch is OK as is, with just one small enhancement,
see below.  Thank you!

> Thanks in advance (p.s. don't forget the nvptx_rtx_costs patch),

Aye-aye!

> --- a/gcc/config/nvptx/nvptx.md
> +++ b/gcc/config/nvptx/nvptx.md

> +(define_insn "setcc_isnormal"
> +  [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
> + (unspec:BI [(match_operand:SDFM 1 "nvptx_register_operand" "R")]
> +UNSPEC_ISNORMAL))]
> +  ""
> +  "%.\\ttestp.normal%t1\\t%0, %1;")
> +
> +(define

Re: [PATCH v1] Widening-Mul: Fix one ICE when iterate on phi node

2024-09-27 Thread Richard Biener
On Fri, Sep 27, 2024 at 9:52 AM  wrote:
>
> From: Pan Li 
>
> We iterate over all phi nodes of a bb to try to match the SAT_* pattern
> for scalar integers.  We also remove the phi node when the relevant
> pattern is matched.
>
> Unfortunately the iterator may have no idea that the phi node has been
> removed, so it keeps using the freed data and then ICEs as shown below.
>
> [0] psi ptr 0x75216340c000
> [0] psi ptr 0x75216340c400
> [1] psi ptr 0xa5a5a5a5a5a5a5a5 <=== GC freed pointer.
>
> during GIMPLE pass: widening_mul
> tmp.c: In function ‘f’:
> tmp.c:45:6: internal compiler error: Segmentation fault
>45 | void f(int rows, int cols) {
>   |  ^
> 0x36e2788 internal_error(char const*, ...)
> ../../gcc/diagnostic-global-context.cc:517
> 0x18005f0 crash_signal
> ../../gcc/toplev.cc:321
> 0x752163c4531f ???
> ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
> 0x103ae0e bool is_a_helper::test(gimple*)
> ../../gcc/gimple.h:1256
> 0x103f9a5 bool is_a(gimple*)
> ../../gcc/is-a.h:232
> 0x103dc78 gphi* as_a(gimple*)
> ../../gcc/is-a.h:255
> 0x104f12e gphi_iterator::phi() const
> ../../gcc/gimple-iterator.h:47
> 0x1a57bef after_dom_children
> ../../gcc/tree-ssa-math-opts.cc:6140
> 0x3344482 dom_walker::walk(basic_block_def*)
> ../../gcc/domwalk.cc:354
> 0x1a58601 execute
> ../../gcc/tree-ssa-math-opts.cc:6312
>
> This patch fixes the problem of iterating over a modified collection
> by saving the next phi iterator in advance.
>
> The following test suites pass with this patch.
> * The rv64gcv full regression test.
> * The x86 bootstrap test.
> * The x86 full regression test.

OK.

Thanks,
Richard.
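
For readers following the description above, the "save the next element
before removing the current one" pattern can be sketched in plain C.  This
is only an illustration on a hypothetical singly linked list (struct node,
push, remove_odd and length are made-up names); it is not the gphi_iterator
code from tree-ssa-math-opts.cc.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node standing in for a phi node of a basic block.  */
struct node {
  int value;
  struct node *next;
};

/* Prepend a node holding VALUE and return the new head.  */
static struct node *push(struct node *head, int value) {
  struct node *n = malloc(sizeof *n);
  assert(n != NULL);
  n->value = value;
  n->next = head;
  return n;
}

/* Remove every node with an odd value and return how many were removed.
   The successor is saved before the current node may be freed, so the
   walk never reads freed memory.  */
static int remove_odd(struct node **head) {
  int removed = 0;
  struct node **link = head;
  struct node *cur = *head;
  while (cur != NULL) {
    struct node *next = cur->next; /* back up the successor first */
    if (cur->value % 2 != 0) {
      *link = next; /* unlink, then free */
      free(cur);
      removed++;
    } else {
      link = &cur->next;
    }
    cur = next; /* advance using the saved pointer, not cur->next */
  }
  return removed;
}

/* Count the nodes in the list.  */
static int length(const struct node *head) {
  int n = 0;
  for (; head != NULL; head = head->next)
    n++;
  return n;
}
```

Advancing with cur->next after free(cur) would be the analogue of the
0xa5a5a5a5a5a5a5a5 pointer in the trace above; using the saved next pointer
avoids it.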

> PR middle-end/116861
>
> gcc/ChangeLog:
>
> * tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children):
> Back up the next psi iterator before removing the phi node.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/torture/pr116861-1.c: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/testsuite/gcc.dg/torture/pr116861-1.c | 76 +++
>  gcc/tree-ssa-math-opts.cc |  9 ++-
>  2 files changed, 83 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr116861-1.c
>
> diff --git a/gcc/testsuite/gcc.dg/torture/pr116861-1.c 
> b/gcc/testsuite/gcc.dg/torture/pr116861-1.c
> new file mode 100644
> index 000..7dcfe664d89
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr116861-1.c
> @@ -0,0 +1,76 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +void pm_message(void);
> +struct CmdlineInfo {
> +  _Bool wantCrop[4];
> +  unsigned int margin;
> +};
> +typedef struct {
> +  unsigned int removeSize;
> +} CropOp;
> +typedef struct {
> +  CropOp op[4];
> +} CropSet;
> +static void divideAllBackgroundIntoBorders(unsigned int const totalSz,
> +   _Bool const wantCropSideA,
> +   _Bool const wantCropSideB,
> +   unsigned int const wantMargin,
> +   unsigned int *const sideASzP,
> +   unsigned int *const sideBSzP) {
> +  unsigned int sideASz, sideBSz;
> +  if (wantCropSideA && wantCropSideB)
> +  {
> +sideASz = totalSz / 2;
> +if (wantMargin)
> +  sideBSz = totalSz - sideASz;
> +  }
> +  else if (wantCropSideB)
> +  {
> +sideBSz = 0;
> +  }
> +  *sideASzP = sideASz;
> +  *sideBSzP = sideBSz;
> +}
> +static CropOp oneSideCrop(_Bool const wantCrop, unsigned int const borderSz,
> +  unsigned int const margin) {
> +  CropOp retval;
> +  if (wantCrop)
> +  {
> +if (borderSz >= margin)
> +  retval.removeSize = borderSz - margin;
> +else
> +  retval.removeSize = 0;
> +  }
> +  return retval;
> +}
> +struct CmdlineInfo cmdline1;
> +void f(int rows, int cols) {
> +  struct CmdlineInfo cmdline0 = cmdline1;
> +  CropSet crop;
> +  struct CmdlineInfo cmdline = cmdline0;
> +  CropSet retval;
> +  unsigned int leftBorderSz, rghtBorderSz;
> +  unsigned int topBorderSz, botBorderSz;
> +  divideAllBackgroundIntoBorders(cols, cmdline.wantCrop[0],
> + cmdline.wantCrop[1], cmdline.margin > 0,
> + &leftBorderSz, &rghtBorderSz);
> +  divideAllBackgroundIntoBorders(rows, cmdline.wantCrop[2],
> + cmdline.wantCrop[3], cmdline.margin > 0,
> + &topBorderSz, &botBorderSz);
> +  retval.op[0] =
> +  oneSideCrop(cmdline.wantCrop[0], leftBorderSz, cmdline.margin);
> +  retval.op[1] =
> +  oneSideCrop(cmdline.wantCrop[1], rghtBorderSz, cmdline.margin);
> +  retval.op[2] =
> +  oneSideCrop(cmdline.wantCrop[2], topBorderSz, cmdline.margin);
> +  retval.op[3] =
> +  oneSideCrop(cmdline.wantCrop[3], botBorderSz, cmdline.margin);
> +  crop = retval;
> +  unsigned int i = 0;
> +  for (i = 0; i < 4;

[PATCH v2] arm: Prevent ICE when doloop dec_set is not PLUS_EXPR

2024-09-27 Thread Andre Vieira (lists)

Resending as v2 so CI picks it up.

This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter
was making an assumption about the form of a candidate dec_insn.
This dec_insn is the instruction that decreases the loop counter inside a
decrementing loop and we expect it to have the following form:
(set (reg CONDCOUNT)
     (plus (reg CONDCOUNT)
           (const_int)))

Where CONDCOUNT is the loop counter, and const_int is the negative constant
used to decrement it.

This patch also improves our search for a valid dec_insn.  Before this patch
we'd only look for a dec_insn inside the loop header if the loop latch was
empty.  We now also search the loop header if the loop latch is not empty but
the last instruction is not a valid dec_insn.  This could potentially be
improved to search all instructions inside the loop latch.
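
As a rough illustration of the form check described above, here is a small
standalone C sketch on a toy expression tree.  The toy_rtx type, the toy_code
enum and toy_is_dec_set are hypothetical stand-ins invented for this example;
they are not GCC's rtx, GET_CODE or single_set.  The point it shows is that
the code of the source expression is checked before its operands are
inspected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes, loosely mirroring the RTL shapes named above.  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_MINUS, TOY_SET };

struct toy_rtx {
  enum toy_code code;
  int value;                 /* register number or constant value */
  const struct toy_rtx *op0; /* first operand, or NULL */
  const struct toy_rtx *op1; /* second operand, or NULL */
};

/* True if X is a register with number REGNO.  */
static bool is_reg(const struct toy_rtx *x, int regno) {
  return x != NULL && x->code == TOY_REG && x->value == regno;
}

/* Return true only for (set (reg COUNT) (plus (reg COUNT) (const_int))).
   Anything else, including a source that is not a PLUS, is rejected
   before its operands are looked at.  */
static bool toy_is_dec_set(const struct toy_rtx *set, int count_regno) {
  if (set == NULL || set->code != TOY_SET)
    return false;
  const struct toy_rtx *dest = set->op0;
  const struct toy_rtx *src = set->op1;
  if (!is_reg(dest, count_regno))
    return false;
  if (src == NULL || src->code != TOY_PLUS)
    return false;
  return is_reg(src->op0, count_regno)
         && src->op1 != NULL
         && src->op1->code == TOY_CONST_INT;
}
```

A caller would try the last instruction of the latch first and fall back to
the definition found in the header only when this check fails, which is the
search order the paragraph above describes.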

gcc/ChangeLog:

 * config/arm/arm.cc (check_dec_insn): New helper function containing
 code hoisted from...
 (arm_mve_dlstp_check_dec_counter): ... here. Use check_dec_insn to
 check the validity of the candidate dec_insn.

gcc/testsuite/ChangeLog:

 * gcc.target/arm/mve/dlstp-loop-form.c: New test.

diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 92cd168e65937ef7350477464e8b0becf85bceed..363a972170b37275372bb8bf30d510876021c8c0 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -35214,6 +35214,32 @@ arm_mve_dlstp_check_inc_counter (loop *loop, rtx_insn* 
vctp_insn,
   return vctp_insn;
 }
 
+/* Helper function to 'arm_mve_dlstp_check_dec_counter' to make sure DEC_INSN
+   is of the expected form:
+   (set (reg a) (plus (reg a) (const_int)))
+   where (reg a) is the same as CONDCOUNT.
+   Return a rtx with the set if it is in the right format or NULL_RTX
+   otherwise.  */
+
+static rtx
+check_dec_insn (rtx_insn *dec_insn, rtx condcount)
+{
+  if (!NONDEBUG_INSN_P (dec_insn))
+return NULL_RTX;
+  rtx dec_set = single_set (dec_insn);
+  if (!dec_set
+  || !REG_P (SET_DEST (dec_set))
+  || GET_CODE (SET_SRC (dec_set)) != PLUS
+  || !REG_P (XEXP (SET_SRC (dec_set), 0))
+  || !CONST_INT_P (XEXP (SET_SRC (dec_set), 1))
+  || REGNO (SET_DEST (dec_set))
+ != REGNO (XEXP (SET_SRC (dec_set), 0))
+  || REGNO (SET_DEST (dec_set)) != REGNO (condcount))
+return NULL_RTX;
+
+  return dec_set;
+}
+
 /* Helper function to `arm_mve_loop_valid_for_dlstp`.  In the case of a
counter that is decrementing, ensure that it is decrementing by the
right amount in each iteration and that the target condition is what
@@ -35230,30 +35256,19 @@ arm_mve_dlstp_check_dec_counter (loop *loop, 
rtx_insn* vctp_insn,
  loop latch.  Here we simply need to verify that this counter is the same
  reg that is also used in the vctp_insn and that it is not otherwise
  modified.  */
-  rtx_insn *dec_insn = BB_END (loop->latch);
+  rtx dec_set = check_dec_insn (BB_END (loop->latch), condcount);
   /* If not in the loop latch, try to find the decrement in the loop header.  
*/
-  if (!NONDEBUG_INSN_P (dec_insn))
+  if (dec_set == NULL_RTX)
   {
 df_ref temp = df_bb_regno_only_def_find (loop->header, REGNO (condcount));
 /* If we haven't been able to find the decrement, bail out.  */
 if (!temp)
   return NULL;
-dec_insn = DF_REF_INSN (temp);
-  }
-
-  rtx dec_set = single_set (dec_insn);
+dec_set = check_dec_insn (DF_REF_INSN (temp), condcount);
 
-  /* Next, ensure that it is a PLUS of the form:
- (set (reg a) (plus (reg a) (const_int)))
- where (reg a) is the same as condcount.  */
-  if (!dec_set
-  || !REG_P (SET_DEST (dec_set))
-  || !REG_P (XEXP (SET_SRC (dec_set), 0))
-  || !CONST_INT_P (XEXP (SET_SRC (dec_set), 1))
-  || REGNO (SET_DEST (dec_set))
- != REGNO (XEXP (SET_SRC (dec_set), 0))
-  || REGNO (SET_DEST (dec_set)) != REGNO (condcount))
-return NULL;
+if (dec_set == NULL_RTX)
+  return NULL;
+  }
 
   decrementnum = INTVAL (XEXP (SET_SRC (dec_set), 1));
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/dlstp-loop-form.c 
b/gcc/testsuite/gcc.target/arm/mve/dlstp-loop-form.c
new file mode 100644
index 
..a1b26873d7908035c726e3724c91b186c697bc60
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/dlstp-loop-form.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-options "-Ofast" } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+#pragma GCC arm "arm_mve_types.h"
+#pragma GCC arm "arm_mve.h" false
+typedef __attribute__((aligned(2))) float16x8_t e;
+mve_pred16_t c(long d) { return __builtin_mve_vctp16qv8bi(d); }
+int f();
+void n() {
+  int g, h, *i, j;
+  mve_pred16_t k;
+  e acc;
+  e l;
+  e m;
+  for (;;) {
+j = g;
+acc[g];
+for (; h < g; h += 8) {
+  k = c(j);
+  acc = vfmsq_m(acc, l, m, k);
+  j -= 8;
+}
+i[g] = f(acc);
+  }
+}
+


Re: [RFC PATCH] Enable vectorization for unknown tripcount in very cheap cost model but disable epilog vectorization.

2024-09-27 Thread Kito Cheng
> > So should we adjust very-cheap to allow niter peeling as proposed or
> > should we switch
> > the default at -O2 to cheap?
>
> Any thoughts from other backend maintainers?

No preference from RISC-V, since it is the variable-length vector flavor,
so there is no epilogue for those cases; I mean it's already vectorizable
on RISC-V with -O2 :P

https://godbolt.org/z/v5z8WxdjT


Re: [PATCH] diagnostic: Use vec instead of custom array reallocations for m_classification_history/m_push_list [PR116847]

2024-09-27 Thread David Malcolm
On Thu, 2024-09-26 at 23:24 +0200, Jakub Jelinek wrote:
> Hi!
> 
> diagnostic.h already relies on vec.h, it uses auto_vec in one spot.
> 
> The following patch converts m_classification_history and m_push_list
> hand-managed arrays to vec templates.
> The main advantage is exponential rather than linear reallocation,
> e.g. with current libstdc++ headers if one includes all the standard
> headers there could be ~ 300 reallocations of the
> m_classification_history
> array (sure, not all of them will result in actually copying the
> data, but
> still).
> In addition to that it fixes some formatting issues in the code.
> 
> Bootstrapped on i686-linux so far, bootstrap/regtest on x86_64-linux
> and
> i686-linux still pending, ok for trunk if it passes it?

Thanks, yes please.

Dave
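
To put a number on the "exponential rather than linear reallocation" point
quoted above, here is a small counting sketch in C.  It does not reproduce
vec.h's real growth policy; growing by one slot per push on the linear side
and doubling from an initial capacity of 4 on the exponential side are both
assumptions made for illustration, and linear_regrows and doubling_regrows
are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Number of regrow operations needed to append N elements one at a time
   when the capacity grows by a fixed STEP each time it runs out.  */
static size_t linear_regrows(size_t n, size_t step) {
  size_t cap = 0, regrows = 0;
  assert(step > 0);
  for (size_t len = 1; len <= n; len++)
    if (len > cap) {
      cap += step;
      regrows++;
    }
  return regrows;
}

/* Same count when the capacity doubles instead (assumed to start at 4).  */
static size_t doubling_regrows(size_t n) {
  size_t cap = 0, regrows = 0;
  for (size_t len = 1; len <= n; len++)
    if (len > cap) {
      cap = cap ? cap * 2 : 4;
      regrows++;
    }
  return regrows;
}
```

With these assumptions, 300 appends cost 300 regrows when growing one slot
at a time and 8 when doubling.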



Re: [PATCH] diagnostic: Save/restore diagnostic context history and push/pop state for PCH [PR116847]

2024-09-27 Thread David Malcolm
On Thu, 2024-09-26 at 23:28 +0200, Jakub Jelinek wrote:
> Hi!
> 
> The following patch on top of the just posted cleanup patch
> saves/restores the m_classification_history and m_push_list
> vectors for PCH.  Without that as the testcase shows during parsing
> of the templates we don't report ignored diagnostics, but after
> loading
> PCH header when instantiating those templates those warnings can be
> emitted.  This doesn't show up on x86_64-linux build because
> configure
> injects there -fcf-protection -mshstk flags during library build (and
> so
> also during PCH header creation), but make check doesn't use those
> flags
> and so the PCH header is ignored.
> 
> Bootstrapped on i686-linux so far, bootstrap/regtest on x86_64-linux
> and
> i686-linux still pending, ok for trunk if it passes it?

Thanks, yes please

Dave



Re: [PATCH] diagnostic: Save/restore diagnostic context history and push/pop state for PCH [PR116847]

2024-09-27 Thread David Malcolm
On Fri, 2024-09-27 at 10:23 -0400, David Malcolm wrote:
> On Fri, 2024-09-27 at 09:54 -0400, Lewis Hyatt wrote:
> > On Fri, Sep 27, 2024 at 9:41 AM David Malcolm 
> > wrote:
> > > 
> > > On Thu, 2024-09-26 at 23:28 +0200, Jakub Jelinek wrote:
> > > > Hi!
> > > > 
> > > > The following patch on top of the just posted cleanup patch
> > > > saves/restores the m_classification_history and m_push_list
> > > > vectors for PCH.  Without that as the testcase shows during
> > > > parsing
> > > > of the templates we don't report ignored diagnostics, but after
> > > > loading
> > > > PCH header when instantiating those templates those warnings
> > > > can
> > > > be
> > > > emitted.  This doesn't show up on x86_64-linux build because
> > > > configure
> > > > injects there -fcf-protection -mshstk flags during library
> > > > build
> > > > (and
> > > > so
> > > > also during PCH header creation), but make check doesn't use
> > > > those
> > > > flags
> > > > and so the PCH header is ignored.
> > > > 
> > > > Bootstrapped on i686-linux so far, bootstrap/regtest on x86_64-
> > > > linux
> > > > and
> > > > i686-linux still pending, ok for trunk if it passes it?
> > > 
> > > Thanks, yes please
> > > 
> > > Dave
> > > 
> > 
> > A couple comments that may be helpful...
> > 
> > -This is also PR 64117
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64117)
> > 
> > -I submitted a patch last year for that but did not get any
> > response
> > (
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635648.html
> > ).
> > I guess I never pinged it because I am still trying to ping two
> > other
> > ones :). 
> 
> Gahhh, I'm sorry about this.
> 
> What are the other two patches?
> 
> > My patch did not switch to vec so it was not as nice as this
> > one. I wonder though, if some of the testcases I added could be
> > incorporated? In particular the testcase from my patch
> > pragma-diagnostic-3.C I believe will still be failing after this
> > one.
> > There is an issue with C++ because it processes the pragmas twice,
> > once in early mode and once in normal mode, that makes it do the
> > wrong
> > thing for this case:
> > 
> > t.h:
> > 
> >  #pragma GCC diagnostic push
> >  #pragma GCC diagnostic ignored...
> >  //no pop at end of the file
> > 
> > t.c
> > 
> >  #include "t.h"
> >  #pragma GCC diagnostic pop
> >  //expect to be at the initial state here, but won't be if t.h is a
> > PCH
> > 
> > In my patch I had separated the PCH restore from a more general
> > "state
> > restore" logic so that the C++ frontend can restore the state after
> > the first pass through the data.
> 
> It sounds like the ideal here would be to incorporate the test cases
> from Lewis's patch into Jakub's, if the latter can be tweaked to fix 
> pragma-diagnostic-3.C

...and I see that Jakub has pushed his patch (as
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=64072e60b1599ae7d347c2cdee46c3b0e37fc338),
so is there a way to do Lewis's patch on top of that?

Sorry about this
Dave



[PATCH v1 2/4] aarch64: add minimal support for GCS build attributes

2024-09-27 Thread Matthieu Longo
From: Srinath Parvathaneni 

GCS (Guarded Control Stack, an Armv9.4-a extension) requires some
caution at runtime. The runtime linker needs to reason about the
compatibility of a set of relocatable object files that might not have
been compiled with the same compiler.
Up until now, GNU properties have been stored in an ELF section
(.note.gnu.property) and used for the previously mentioned runtime
checks performed by the linker. However, GNU properties are limited in
their expressibility, and a long-term commitment was made in the
ABI for the Arm architecture [1] to provide build attributes.

This patch adds initial support for AArch64 GCS build attributes.
This support includes generating two new assembler directives:
.aeabi_subsection and .aeabi_attribute. These directives are
generated as per the syntax mentioned in spec "Build Attributes for
the Arm® 64-bit Architecture (AArch64)" available at [1].

gcc/configure.ac now includes a new check to test whether the
assembler being used to build the toolchain supports these new
directives.
Two behaviors can be observed when -mbranch-protection=[gcs|standard]
is passed:
- If the assembler supports them, the assembly directives are emitted
alongside the .note.gnu.property section for backward compatibility.
- If the assembler does not support them, only the .note.gnu.property
section will carry the relevant information.

This patch needs to be applied on top of GCC gcs patch series [2].

Bootstrapped on aarch64-none-linux-gnu, and no regression found.

[1]: https://github.com/ARM-software/abi-aa/pull/230
[2]: https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/vendors/ARM/heads/gcs

gcc/ChangeLog:

* config.in: Regenerated
* config/aarch64/aarch64.cc (aarch64_emit_aeabi_attribute): New
function declaration.
(aarch64_emit_aeabi_subsection): Likewise.
(aarch64_start_file): Emit gcs build attributes.
(aarch64_file_end_indicate_exec_stack): Update gcs bit in
note.gnu.property section.
* configure: Regenerated.
* configure.ac: Add gcc configure check.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/build-attributes/build-attribute-gcs.c: New test.
* gcc.target/aarch64/build-attributes/build-attribute-standard.c: New test.
* gcc.target/aarch64/build-attributes/build-attributes.exp: New DejaGNU 
file.
* gcc.target/aarch64/build-attributes/no-build-attribute-bti.c: New test.
* gcc.target/aarch64/build-attributes/no-build-attribute-gcs.c: New test.
* gcc.target/aarch64/build-attributes/no-build-attribute-pac.c: New test.
* gcc.target/aarch64/build-attributes/no-build-attribute-standard.c: New 
test.
---
 gcc/config.in |  6 +++
 gcc/config/aarch64/aarch64.cc | 41 +
 gcc/configure | 38 +++
 gcc/configure.ac  | 10 
 .../build-attributes/build-attribute-gcs.c| 12 +
 .../build-attribute-standard.c| 12 +
 .../build-attributes/build-attributes.exp | 46 +++
 .../build-attributes/no-build-attribute-bti.c | 12 +
 .../build-attributes/no-build-attribute-gcs.c | 12 +
 .../build-attributes/no-build-attribute-pac.c | 12 +
 .../no-build-attribute-standard.c | 12 +
 11 files changed, 213 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/build-attribute-gcs.c
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/build-attribute-standard.c
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/build-attributes.exp
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/no-build-attribute-bti.c
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/no-build-attribute-gcs.c
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/no-build-attribute-pac.c
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/no-build-attribute-standard.c

diff --git a/gcc/config.in b/gcc/config.in
index 7fcabbe5061..1309ba2b133 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -379,6 +379,12 @@
 #endif
 
 
+/* Define if your assembler supports GCS build attributes. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_AS_BUILD_ATTRIBUTES_GCS
+#endif
+
+
 /* Define to the level of your assembler's compressed debug section support.
*/
 #ifndef USED_FOR_TARGET
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 6d9075011ec..61e0248817f 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -24677,6 +24677,33 @@ aarch64_post_cfi_startproc (FILE *f, tree ignored 
ATTRIBUTE_UNUSED)
asm_fprintf (f, "\t.cfi_b_key_frame\n");
 }
 
+/* This function is used to emit an AEABI attribute with tag and its associated
+   value.  We emit the numerical value of the tag and the textual tags as
+   comment so that anyone reading the assembler output will know which tag is
+   b

[PATCH v1 0/4][RFC] aarch64: add minimal support for GCS build attributes

2024-09-27 Thread Matthieu Longo
The primary focus of this patch series is to add support for build attributes 
in the context of GCS (Guarded Control Stack, an Armv9.4-a extension) to the 
AArch64 backend.
It addresses comments from a previous review [1], and proposes a different 
approach compared to the previous implementation of the build attributes.

The series is for now at the RFC stage, and is composed of the following 4 
patches:
1. Patch adding assembly debug comments (-dA) to the existing GNU properties, 
to improve testing and check the correctness of values.
2. The minimal patch adding support for build attributes in the context of GCS.
3. A refactoring of (2) to make things less error-prone and more modular, add 
support for asciz attributes and more debug information.
4. A refactoring of (1) relying partly on (3).
The targeted final state of this series would consist of squashing (2) + (3),
and (1) + (4).

**Special note regarding (2):** If Gas has support for build attributes, both 
build attributes and GNU properties will be emitted. This behavior is still 
open for discussion. Please, let me know your thoughts regarding this behavior.

Diff with previous revision [1]:
- update the description of (2)
- address the comments related to the tests in (2)
- add new commits (1), (3) and (4)

This patch series needs to be applied on top of the patch series for GCS [2].

Bootstrapped on aarch64-none-linux-gnu, and no regression found.

[1]: https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662825.html
[2]: https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/vendors/ARM/heads/gcs

Regards,
Matthieu


Matthieu Longo (3):
  aarch64: add debug comments to feature properties in .note.gnu.property
  aarch64: improve assembly debug comments for build attributes
  aarch64: encapsulate note.gnu.property emission into a class

Srinath Parvathaneni (1):
  aarch64: add minimal support for GCS build attributes

 gcc/config.gcc|   2 +-
 gcc/config.in |   6 +
 gcc/config/aarch64/aarch64-dwarf-metadata.cc  | 120 
 gcc/config/aarch64/aarch64-dwarf-metadata.h   | 266 ++
 gcc/config/aarch64/aarch64.cc |  68 ++---
 gcc/config/aarch64/t-aarch64  |   7 +
 gcc/configure |  38 +++
 gcc/configure.ac  |  10 +
 gcc/testsuite/gcc.target/aarch64/bti-1.c  |   5 +-
 .../build-attributes/build-attribute-gcs.c|  12 +
 .../build-attribute-standard.c|  12 +
 .../build-attributes/build-attributes.exp |  46 +++
 .../build-attributes/no-build-attribute-bti.c |  12 +
 .../build-attributes/no-build-attribute-gcs.c |  12 +
 .../build-attributes/no-build-attribute-pac.c |  12 +
 .../no-build-attribute-standard.c |  12 +
 16 files changed, 594 insertions(+), 46 deletions(-)
 create mode 100644 gcc/config/aarch64/aarch64-dwarf-metadata.cc
 create mode 100644 gcc/config/aarch64/aarch64-dwarf-metadata.h
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/build-attribute-gcs.c
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/build-attribute-standard.c
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/build-attributes.exp
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/no-build-attribute-bti.c
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/no-build-attribute-gcs.c
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/no-build-attribute-pac.c
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/build-attributes/no-build-attribute-standard.c

-- 
2.46.1



[PATCH v1 4/4] aarch64: encapsulate note.gnu.property emission into a class

2024-09-27 Thread Matthieu Longo
gcc/ChangeLog:

* config.gcc: Add aarch64-dwarf-metadata.o to extra_objs.
* config/aarch64/aarch64-dwarf-metadata.h (class section_note_gnu_property):
Encapsulate GNU properties code into a class.
* config/aarch64/aarch64.cc
(GNU_PROPERTY_AARCH64_FEATURE_1_AND): Define.
(GNU_PROPERTY_AARCH64_FEATURE_1_BTI): Likewise.
(GNU_PROPERTY_AARCH64_FEATURE_1_PAC): Likewise.
(GNU_PROPERTY_AARCH64_FEATURE_1_GCS): Likewise.
(aarch64_file_end_indicate_exec_stack): Move GNU properties code to
aarch64-dwarf-metadata.cc
* config/aarch64/t-aarch64: Declare target aarch64-dwarf-metadata.o
* config/aarch64/aarch64-dwarf-metadata.cc: New file.
---
 gcc/config.gcc   |   2 +-
 gcc/config/aarch64/aarch64-dwarf-metadata.cc | 120 +++
 gcc/config/aarch64/aarch64-dwarf-metadata.h  |  19 +++
 gcc/config/aarch64/aarch64.cc|  87 +-
 gcc/config/aarch64/t-aarch64 |   7 ++
 5 files changed, 153 insertions(+), 82 deletions(-)
 create mode 100644 gcc/config/aarch64/aarch64-dwarf-metadata.cc

diff --git a/gcc/config.gcc b/gcc/config.gcc
index f09ce9f63a0..b448c2a91d1 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -351,7 +351,7 @@ aarch64*-*-*)
c_target_objs="aarch64-c.o"
cxx_target_objs="aarch64-c.o"
d_target_objs="aarch64-d.o"
-   extra_objs="aarch64-builtins.o aarch-common.o aarch64-sve-builtins.o 
aarch64-sve-builtins-shapes.o aarch64-sve-builtins-base.o 
aarch64-sve-builtins-sve2.o aarch64-sve-builtins-sme.o 
cortex-a57-fma-steering.o aarch64-speculation.o 
falkor-tag-collision-avoidance.o aarch-bti-insert.o aarch64-cc-fusion.o 
aarch64-early-ra.o aarch64-ldp-fusion.o"
+   extra_objs="aarch64-builtins.o aarch-common.o aarch64-dwarf-metadata.o 
aarch64-sve-builtins.o aarch64-sve-builtins-shapes.o 
aarch64-sve-builtins-base.o aarch64-sve-builtins-sve2.o 
aarch64-sve-builtins-sme.o cortex-a57-fma-steering.o aarch64-speculation.o 
falkor-tag-collision-avoidance.o aarch-bti-insert.o aarch64-cc-fusion.o 
aarch64-early-ra.o aarch64-ldp-fusion.o"
target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.h 
\$(srcdir)/config/aarch64/aarch64-builtins.cc 
\$(srcdir)/config/aarch64/aarch64-sve-builtins.h 
\$(srcdir)/config/aarch64/aarch64-sve-builtins.cc"
target_has_targetm_common=yes
;;
diff --git a/gcc/config/aarch64/aarch64-dwarf-metadata.cc 
b/gcc/config/aarch64/aarch64-dwarf-metadata.cc
new file mode 100644
index 000..36659862b59
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-dwarf-metadata.cc
@@ -0,0 +1,120 @@
+#define INCLUDE_STRING
+#define INCLUDE_ALGORITHM
+#define INCLUDE_MEMORY
+#define INCLUDE_VECTOR
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "target.h"
+#include "rtl.h"
+#include "output.h"
+
+#include "aarch64-dwarf-metadata.h"
+
+#include <numeric>
+
+/* Defined for convenience.  */
+#define POINTER_BYTES (POINTER_SIZE / BITS_PER_UNIT)
+
+namespace aarch64 {
+
+constexpr unsigned GNU_PROPERTY_AARCH64_FEATURE_1_AND = 0xc0000000;
+constexpr unsigned GNU_PROPERTY_AARCH64_FEATURE_1_BTI = (1U << 0);
+constexpr unsigned GNU_PROPERTY_AARCH64_FEATURE_1_PAC = (1U << 1);
+constexpr unsigned GNU_PROPERTY_AARCH64_FEATURE_1_GCS = (1U << 2);
+
+namespace {
+
+std::string join_s (std::string s1, const std::string &s2)
+{
+  constexpr const char* separator = ", ";
+  return std::move (s1)
+   .append (separator)
+   .append (s2);
+};
+
+std::string gnu_property_features_to_string (unsigned feature_1_and)
+{
+  std::vector<std::string> feature_bits;
+
+  if (feature_1_and & GNU_PROPERTY_AARCH64_FEATURE_1_BTI)
+feature_bits.push_back ("BTI");
+  if (feature_1_and & GNU_PROPERTY_AARCH64_FEATURE_1_PAC)
+feature_bits.push_back ("PAC");
+  if (feature_1_and & GNU_PROPERTY_AARCH64_FEATURE_1_GCS)
+feature_bits.push_back ("GCS");
+
+  if (feature_bits.empty ())
+return {};
+
+  return std::accumulate(std::next(feature_bits.cbegin()), feature_bits.cend(),
+feature_bits[0], join_s);
+};
+
+} // namespace anonymous
+
+section_note_gnu_property::section_note_gnu_property():
+  feature_1_and(0)
+{}
+
+void section_note_gnu_property::bti_enabled()
+{
+  feature_1_and |= GNU_PROPERTY_AARCH64_FEATURE_1_BTI;
+}
+
+void section_note_gnu_property::pac_enabled()
+{
+  feature_1_and |= GNU_PROPERTY_AARCH64_FEATURE_1_PAC;
+}
+
+void section_note_gnu_property::gcs_enabled()
+{
+  feature_1_and |= GNU_PROPERTY_AARCH64_FEATURE_1_GCS;
+}
+
+void section_note_gnu_property::write () const
+{
+  if (feature_1_and)
+{
+  /* Generate .note.gnu.property section.  */
+  switch_to_section (get_section (".note.gnu.property",
+ SECTION_NOTYPE, NULL));
+
+  /* PT_NOTE header: namesz, descsz, type.
+namesz = 4 ("GNU\0")
+descsz = 16 (Size of the program property array)
+ [(12 + padding) * Number of array elements]
+type   = 

[PATCH v6] Provide new GCC builtin __builtin_counted_by_ref [PR116016]

2024-09-27 Thread Qing Zhao
Hi, this is the 6th version of the patch.

Compared to the 5th version, the major changes are (addressing Jakub's comments):

1. delete the new global "in_builtin_counted_by_ref"
2. split the "counted_by" specific handling from the routine
   "build_component_ref"
   as a new routine "handle_counted_by_for_component_ref"
3. when handling "__builtin_counted_by_ref" in c-parser.cc, if in_typeof or
   in_alignof, call "handle_counted_by_for_component_ref" to generate
   the call to .ACCESS_WITH_SIZE.
4. Add the following to the test cases (in builtin-counted-by-ref-1.c):
   A. __alignof:
  __alignof (*__builtin_counted_by_ref (array_annotated->c))

   B. __typeof:
  __typeof (*__builtin_counted_by_ref (array_annotated->c)) 
  __typeof (char[__builtin_counted_by_ref (array_annotated->c)
  == &array_annotated->b ? 1 : 10]),

Bootstrapped and regression tested on both x86 and aarch64; no issues.

Okay for the trunk?

thanks.

Qing.

=


With the addition of the 'counted_by' attribute and its wide roll-out
within the Linux kernel, a use case has been found that would be very
nice to have for object allocators: being able to set the counted_by
counter variable without knowing its name.

For example, given:

  struct foo {
...
int counter;
...
struct bar array[] __attribute__((counted_by (counter)));
  } *p;

The existing Linux object allocators are roughly:

  #define MAX(A, B) (((A) > (B)) ? (A) : (B))
  #define alloc(P, FAM, COUNT) ({ \
    __auto_type __p = &(P); \
    size_t __size = MAX (sizeof(*P), \
                         __builtin_offsetof (__typeof(*P), FAM) \
                         + sizeof (*(P->FAM)) * COUNT); \
    *__p = kmalloc(__size); \
  })

Right now, any addition of a counted_by annotation must also
include an open-coded assignment of the counter variable after
the allocation:

  p = alloc(p, array, how_many);
  p->counter = how_many;
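
As a point of comparison, the open-coded pattern above can be written as a
small standalone C sketch.  This is an assumption-laden illustration, not
kernel code: struct bar's payload member and the foo_alloc helper are
hypothetical, calloc stands in for kmalloc, and the counted_by attribute is
left off so that the sketch builds with compilers that do not know it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical element type for the flexible array.  */
struct bar {
  int payload;
};

struct foo {
  int counter;        /* number of elements in array[] */
  struct bar array[]; /* flexible array member */
};

/* Allocate a struct foo with room for COUNT elements and record COUNT in
   the counter at the same place, so callers cannot forget the assignment.
   Returns NULL on a negative count, size overflow, or allocation failure.  */
static struct foo *foo_alloc(int count) {
  if (count < 0)
    return NULL;
  size_t n = (size_t)count;
  size_t base = offsetof(struct foo, array);
  if (n > (SIZE_MAX - base) / sizeof(struct bar))
    return NULL;
  size_t size = base + n * sizeof(struct bar);
  if (size < sizeof(struct foo))
    size = sizeof(struct foo);
  struct foo *p = calloc(1, size);
  if (p != NULL)
    p->counter = count; /* the open-coded counter assignment */
  return p;
}
```

This only works because foo_alloc knows the counter is named "counter"; a
generic allocator macro shared by many struct types has no such knowledge.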

In order to avoid the tedious and error-prone work of manually adding
the open-coded counted_by initializations everywhere in the Linux
kernel, a new GCC builtin, __builtin_counted_by_ref, would be very useful
in helping the adoption of the counted_by attribute.

 -- Built-in Function: TYPE __builtin_counted_by_ref (PTR)
 The built-in function '__builtin_counted_by_ref' checks whether the
 array object pointed by the pointer PTR has another object
 associated with it that represents the number of elements in the
 array object through the 'counted_by' attribute (i.e.  the
 counted-by object).  If so, returns a pointer to the corresponding
 counted-by object.  If such counted-by object does not exist,
 returns a NULL pointer.

 This built-in function is only available in C for now.

 The argument PTR must be a pointer to an array.  The TYPE of the
 returned value must be a pointer type pointing to the corresponding
 type of the counted-by object or VOID pointer type in case of a
 NULL pointer being returned.

With this new builtin, the central allocator could be updated to:

  #define MAX(A, B) (((A) > (B)) ? (A) : (B))
  #define alloc(P, FAM, COUNT) ({ \
__auto_type __p = &(P); \
__auto_type __c = (COUNT); \
size_t __size = MAX (sizeof (*(*__p)),\
 __builtin_offsetof (__typeof(*(*__p)),FAM) \
 + sizeof (*((*__p)->FAM)) * __c); \
if ((*__p = kmalloc(__size))) { \
  __auto_type ret = __builtin_counted_by_ref((*__p)->FAM); \
  *_Generic(ret, void *: &(size_t){0}, default: ret) = __c; \
} \
  })

And then structs can gain the counted_by attribute without needing
additional open-coded counter assignments for each struct, and
unannotated structs could still use the same allocator.

PR c/116016

gcc/c-family/ChangeLog:

* c-common.cc: Add new __builtin_counted_by_ref.
* c-common.h (enum rid): Add RID_BUILTIN_COUNTED_BY_REF.

gcc/c/ChangeLog:

* c-decl.cc (names_builtin_p): Add RID_BUILTIN_COUNTED_BY_REF.
* c-parser.cc (has_counted_by_object): New routine.
(get_counted_by_ref): New routine.
(c_parser_postfix_expression): Handle New RID_BUILTIN_COUNTED_BY_REF.
* c-tree.h: New routine handle_counted_by_for_component_ref.
* c-typeck.cc (handle_counted_by_for_component_ref): New routine.
(build_component_ref): Call the new routine.

gcc/ChangeLog:

* doc/extend.texi: Add documentation for __builtin_counted_by_ref.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-counted-by-ref-1.c: New test.
* gcc.dg/builtin-counted-by-ref.c: New test.
---
 gcc/c-family/c-common.cc  |   1 +
 gcc/c-family/c-common.h   |   1 +
 gcc/c/c-decl.cc   |   1 +
 gcc/c/c-parser.cc |  80 +++
 gcc/c/c-tree.h|   1 +
 gcc/c/c-typeck.cc |  34 +++--
 gcc/doc/extend.texi 

Re: [PATCH 1/2] JSON Dumping of GENERIC trees

2024-09-27 Thread Thor Preimesberger
That's all correct. I think I got it.

There are times where the code is emitting a json::object* that is
contained in another json object. Is it good style to return these
still as a unique_ptr? I'm looking over what I wrote again, and in
some parts I wrap the new json object in a unique_ptr (as a return in
some function calls) and in others I use new and delete.

Thanks,
Thor Preimesberger

On Fri, Sep 27, 2024 at 9:18 AM David Malcolm  wrote:
>
> On Sat, 2024-09-21 at 22:49 -0500, -thor wrote:
> > From: thor 
> >
> > This is the second revision of:
> >
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662849.html
> >
> > I've incorporated the feedback given both by Richard and David - I
> > didn't
> > find any memory leaks when testing in valgrind :)
>
> Thanks for the updated patch.
>
> [...snip...]
>
> > diff --git a/gcc/tree-emit-json.cc b/gcc/tree-emit-json.cc
> > new file mode 100644
> > index 000..df97069b922
> > --- /dev/null
> > +++ b/gcc/tree-emit-json.cc
>
> [...snip...]
>
> Thanks for using std::unique_ptr, but I may have been unclear in my
> earlier email - please use it to indicate ownership of a heap-allocated
> pointer...
>
> > +/* Adds Identifier information to JSON object */
> > +
> > +void
> > +identifier_node_add_json (tree t, std::unique_ptr & json_obj)
> > +  {
> > +const char* buff = IDENTIFIER_POINTER (t);
> > +json_obj->set_string("id_to_locale", identifier_to_locale(buff));
> > +buff = IDENTIFIER_POINTER (t);
> > +json_obj->set_string("id_point", buff);
> > +  }
>
> ...whereas here (and in many other places), the patch has a
>
>std::unique_ptr &json_obj
>
> (expressing a reference to a unique_ptr, i.e. a non-modifiable
> non-null pointer to a pointer that manages the lifetime of a modifiable
> json::object)
>
> where, if I'm reading the code correctly,
>
>json::object &json_obj
>
> (expressing a non-modifiable non-null pointer to a modifiable
> json::object).
>
> would be much clearer and simpler.
>
> [...snip...]
>
> Hope the above makes sense; sorry if I'm being unclear.
> Dave
>
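
A minimal sketch of the ownership convention discussed above, assuming a
hypothetical toy_object type rather than GCC's json::object (every name
below is illustrative and not taken from the patch).  Functions that only
fill in an existing object take a plain reference; functions that create
an object return a std::unique_ptr, which is then moved into the parent:

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a JSON object; this is not GCC's json::object.
struct toy_object
{
  std::map<std::string, std::string> strings;
  std::map<std::string, std::unique_ptr<toy_object>> children;

  void set_string (const std::string &key, const std::string &value)
  { strings[key] = value; }

  // Takes ownership of CHILD; the parent now controls its lifetime.
  void set (const std::string &key, std::unique_ptr<toy_object> child)
  { children[key] = std::move (child); }
};

// Only fills in an existing object, so a plain reference is enough.
void
add_identifier (toy_object &obj, const std::string &name)
{
  obj.set_string ("id_point", name);
}

// Creates a new heap object, so it returns an owning pointer.
std::unique_ptr<toy_object>
make_decl (const std::string &name)
{
  auto obj = std::make_unique<toy_object> ();
  add_identifier (*obj, name);
  return obj;
}

// A nested object is built as a unique_ptr and moved into its parent,
// so no explicit new/delete pair is needed.
std::unique_ptr<toy_object>
make_function (const std::string &fn, const std::string &arg)
{
  auto parent = make_decl (fn);
  parent->set ("arg", make_decl (arg));
  return parent;
}
```

With this split, the signature alone tells the reader whether ownership
changes hands: a reference parameter never does, a unique_ptr passed or
returned by value always does.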


Re: [PATCH] diagnostic: Save/restore diagnostic context history and push/pop state for PCH [PR116847]

2024-09-27 Thread Lewis Hyatt
On Fri, Sep 27, 2024 at 10:23 AM David Malcolm  wrote:
> > -I submitted a patch last year for that but did not get any response
> > (
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635648.html).
> > I guess I never pinged it because I am still trying to ping two other
> > ones :).
>
> Gahhh, I'm sorry about this.
>
> What are the other two patches?
>

Oh thanks, no worries... everyone has a lot to do :). I think, libcpp
patches in particular often don't get a lot of attention, but there is
a lot more going on. I'm happy to help where I can. Anyway the two I
was hoping to get in were:

https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663853.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648297.html

In case there is time for them :).

-Lewis


Re: [PATCH v6] Provide new GCC builtin __builtin_counted_by_ref [PR116016]

2024-09-27 Thread Qing Zhao
Jakub,

Thanks a lot for your comments, I will fix all these issues in the next version.

Qing

> On Sep 27, 2024, at 10:18, Jakub Jelinek  wrote:
> 
> On Fri, Sep 27, 2024 at 02:01:19PM +, Qing Zhao wrote:
>> +  /* Currently, only when the array_ref is an indirect_ref to a call to the
>> + .ACCESS_WITH_SIZE, return true.
>> + More cases can be included later when the counted_by attribute is
>> + extended to other situations.  */
>> +  if ((TREE_CODE (array_ref) == INDIRECT_REF)
> 
> The ()s around the == are useless.
> 
>> +  && is_access_with_size_p (TREE_OPERAND (array_ref, 0)))
>> +return true;
>> +  return false;
>> +}
>> +
>> +/* Get the reference to the counted-by object associated with the 
>> ARRAY_REF.  */
>> +static tree
>> +get_counted_by_ref (tree array_ref)
>> +{
>> +  /* Currently, only when the array_ref is an indirect_ref to a call to the
>> + .ACCESS_WITH_SIZE, get the corresponding counted_by ref.
>> + More cases can be included later when the counted_by attribute is
>> + extended to other situations.  */
>> +  if ((TREE_CODE (array_ref) == INDIRECT_REF)
> 
> Again.
> 
>> +if (TREE_CODE (TREE_TYPE (ref)) != ARRAY_TYPE)
>> +  {
>> + error_at (loc, "the argument must be an array"
>> +   "%<__builtin_counted_by_ref%>");
>> + expr.set_error ();
>> + break;
>> +  }
>> +
>> +/* if the array ref is inside TYPEOF or ALIGNOF, the call to
> 
> Comments should start with capital letter, i.e. If
> 
>> +   .ACCESS_WITH_SIZE was not genereated by the routine
> 
> s/genereated/generated/
> 
>> +   build_component_ref by default, we should generate it here.  */
>> +if ((in_typeof || in_alignof)
> && TREE_CODE (ref) == COMPONENT_REF)
> 
> The above && ... fits on the same line as the rest of the condition.
> 
>> +  ref = handle_counted_by_for_component_ref (loc, ref);
>> +
>> +if (has_counted_by_object (ref))
>> +  expr.value
>> + = get_counted_by_ref (ref);
> 
> This too.
> 
>> +else
>> +  expr.value
>> + = build_int_cst (build_pointer_type (void_type_node), 0);
> 
>else
>  expr.value = null_pointer_node;
> instead.
> 
>> +/*
>> + * For the COMPONENT_REF ref, check whether it has a counted_by attribute,
>> + * if so, wrap this COMPONENT_REF with the corresponding CALL to the
>> + * function .ACCESS_WITH_SIZE.
>> + * Otherwise, return the ref itself.
>> + */
> 
> We don't use this style of comments.  No *s at the start of each line, /*
> should be immediately followed after space with the first line and */
> should be right after . and two spaces.
> 
> Jakub
> 



Re: [PATCH] diagnostic: Save/restore diagnostic context history and push/pop state for PCH [PR116847]

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 27, 2024 at 09:54:13AM -0400, Lewis Hyatt wrote:
> A couple comments that may be helpful...
> 
> -This is also PR 64117 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64117)
> 
> -I submitted a patch last year for that but did not get any response
> (https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635648.html).

Oops, sorry, wasn't aware of that.
Note, I've already committed it.

> I guess I never pinged it because I am still trying to ping two other
> ones :). My patch did not switch to vec so it was not as nice as this
> one. I wonder though, if some of the testcases I added could be
> incorporated? In particular the testcase from my patch

Maybe.

> pragma-diagnostic-3.C I believe will still be failing after this one.
> There is an issue with C++ because it processes the pragmas twice,
> once in early mode and once in normal mode, that makes it do the wrong
> thing for this case:
> 
> t.h:
> 
>  #pragma GCC diagnostic push
>  #pragma GCC diagnostic ignored...
>  //no pop at end of the file
> 
> t.c
> 
>  #include "t.h"
>  #pragma GCC diagnostic pop
>  //expect to be at the initial state here, but won't be if t.h is a PCH
> 
> In my patch I had separated the PCH restore from a more general "state
> restore" logic so that the C++ frontend can restore the state after
> the first pass through the data.

People shouldn't be doing this, without PCH or with it, and especially not
with PCH, that is simply too ugly.
That said, this was the reason why I have saved also the m_push_list
vector and not just the history.  If that isn't enough and there is some
other state on the libcpp side, I'd think we should go with a sorry and tell
the user not to do this with PCH.  It becomes a nightmare already if e.g.
the command line -Werror=something -Wno-error=somethingelse overrides
differ between the PCH creation and PCH reading (my first version of
the patch saved m_classify_diagnostic array but that broke tests relying
on those differences).

Jakub



Re: Fwd: [patch, fortran] Matmul and dot_product for unsigned

2024-09-27 Thread Thomas Koenig

Hi Mikael,


Now for the remaining intrinsics (FINDLOC, MAXLOC,
MINLOC, MAXVAL, MINVAL, CSHIFT and EOSHIFT still missing).

I have one patch series touching (inline) MINLOC and MAXLOC to post in 
the coming days.  Could you please keep away from them for one more week 
or two?


Looking at the previous patches, this will touch only check.cc,
iresolve.cc and simplify.cc (plus the library files).

Will your patches touch those areas?  If not, I think there should
be no conflict.

Best regards

Thomas




[wwwdocs pushed] contribute: link to wiki GettingStarted

2024-09-27 Thread Jason Merrill
IRC discussion noted that the GettingStarted page was hard to find. Committing
as obvious.

---
 htdocs/contribute.html | 4 
 1 file changed, 4 insertions(+)

diff --git a/htdocs/contribute.html b/htdocs/contribute.html
index caff1f2a..53c27c6e 100644
--- a/htdocs/contribute.html
+++ b/htdocs/contribute.html
@@ -14,6 +14,10 @@
 or improved optimizations, bug fixes, documentation updates, web page
 improvements, etc
 
+If you're new to GCC, please also see
+the <a href="https://gcc.gnu.org/wiki/GettingStarted">Getting Started</a>
+section of the GCC Wiki.
+
 There are certain legal requirements and style issues which
 contributions must meet:
 

base-commit: a6e1396a66850f171db2aca29df00999fe2c4ae2
-- 
2.46.1



[PATCH v1 1/4] aarch64: add debug comments to feature properties in .note.gnu.property

2024-09-27 Thread Matthieu Longo
GNU properties are emitted to provide some information about the features
used in the generated code like PAC, BTI, or GCS. However, no debug
comments are emitted in the generated assembly even if -dA is provided.
This makes understanding the information stored in the .note.gnu.property
section more difficult than necessary.

This patch adds assembly comments (if -dA is provided) next to the GNU
properties. For instance, if PAC and BTI are enabled, it will emit:
  .word  3  // GNU_PROPERTY_AARCH64_FEATURE_1_AND (BTI, PAC)
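
The mapping from the feature word to the comment text can be sketched in
isolation as follows.  This is an illustration only: the FEATURE_*
constants are stand-ins with assumed bit values (chosen to agree with the
"3 -> BTI, PAC" example above), the function names are hypothetical, and
a plain loop is used for the join instead of whatever the patch itself
does:

```cpp
#include <string>
#include <vector>

// Stand-in bit values for illustration; the real code uses the
// GNU_PROPERTY_AARCH64_FEATURE_1_* macros.
constexpr unsigned FEATURE_BTI = 1u << 0;
constexpr unsigned FEATURE_PAC = 1u << 1;
constexpr unsigned FEATURE_GCS = 1u << 2;

// Return the enabled feature names joined with ", ".
std::string
features_to_string (unsigned feature_1_and)
{
  std::vector<std::string> names;
  if (feature_1_and & FEATURE_BTI)
    names.push_back ("BTI");
  if (feature_1_and & FEATURE_PAC)
    names.push_back ("PAC");
  if (feature_1_and & FEATURE_GCS)
    names.push_back ("GCS");

  std::string out;
  for (const std::string &name : names)
    {
      if (!out.empty ())
	out += ", ";
      out += name;
    }
  return out;
}

// Build the annotated assembly line for the feature word.
std::string
property_line (unsigned feature_1_and)
{
  return "\t.word\t" + std::to_string (feature_1_and)
	 + "\t// GNU_PROPERTY_AARCH64_FEATURE_1_AND ("
	 + features_to_string (feature_1_and) + ")\n";
}
```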

gcc/ChangeLog:

* config/aarch64/aarch64.cc
(aarch64_file_end_indicate_exec_stack): Emit assembly comments.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/bti-1.c: Emit assembly comments, and update
  test assertion.
---
 gcc/config/aarch64/aarch64.cc| 41 +++-
 gcc/testsuite/gcc.target/aarch64/bti-1.c |  5 +--
 2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 4b2e7a690c6..6d9075011ec 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -98,6 +98,8 @@
 #include "ipa-fnsummary.h"
 #include "hash-map.h"
 
+#include 
+
 /* This file should be included last.  */
 #include "target-def.h"
 
@@ -29129,8 +29131,45 @@ aarch64_file_end_indicate_exec_stack ()
 data   = feature_1_and.  */
   assemble_integer (GEN_INT (GNU_PROPERTY_AARCH64_FEATURE_1_AND), 4, 32, 
1);
   assemble_integer (GEN_INT (4), 4, 32, 1);
-  assemble_integer (GEN_INT (feature_1_and), 4, 32, 1);
 
+  if (!flag_debug_asm)
+   assemble_integer (GEN_INT (feature_1_and), 4, 32, 1);
+  else
+   {
+ asm_fprintf (asm_out_file, "\t.word\t%u", feature_1_and);
+
+ auto join_s = [] (std::string s1,
+   const std::string &s2,
+   const std::string &separator = ", ") -> std::string
+ {
+   return std::move (s1)
+ .append (separator)
+ .append (s2);
+ };
+
+ auto features_to_string
+   = [&join_s] (unsigned feature_1_and) -> std::string
+ {
+   std::vector feature_bits;
+   if (feature_1_and & GNU_PROPERTY_AARCH64_FEATURE_1_BTI)
+ feature_bits.push_back ("BTI");
+   if (feature_1_and & GNU_PROPERTY_AARCH64_FEATURE_1_PAC)
+ feature_bits.push_back ("PAC");
+   if (feature_1_and & GNU_PROPERTY_AARCH64_FEATURE_1_GCS)
+ feature_bits.push_back ("GCS");
+
+   if (feature_bits.empty ())
+ return {};
+   return std::accumulate(std::next(feature_bits.cbegin()),
+  feature_bits.cend(),
+  feature_bits[0],
+  join_s);
+ };
+ auto const& s_features = features_to_string (feature_1_and);
+ asm_fprintf (asm_out_file,
+   "\t%s GNU_PROPERTY_AARCH64_FEATURE_1_AND (%s)\n",
+   ASM_COMMENT_START, s_features.c_str ());
+   }
   /* Pad the size of the note to the required alignment.  */
   assemble_align (POINTER_SIZE);
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/bti-1.c 
b/gcc/testsuite/gcc.target/aarch64/bti-1.c
index 5a556b08ed1..e48017abc35 100644
--- a/gcc/testsuite/gcc.target/aarch64/bti-1.c
+++ b/gcc/testsuite/gcc.target/aarch64/bti-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* -Os to create jump table.  */
-/* { dg-options "-Os" } */
+/* { dg-options "-Os -dA" } */
 /* { dg-require-effective-target lp64 } */
 /* If configured with --enable-standard-branch-protection, don't use
command line option.  */
@@ -61,4 +61,5 @@ lab2:
 }
 /* { dg-final { scan-assembler-times "hint\t34" 1 } } */
 /* { dg-final { scan-assembler-times "hint\t36" 12 } } */
-/* { dg-final { scan-assembler ".note.gnu.property" { target *-*-linux* } } } 
*/
+/* { dg-final { scan-assembler "\.section\t\.note\.gnu\.property" { target 
*-*-linux* } } } */
+/* { dg-final { scan-assembler "\.word\t7\t\/\/ 
GNU_PROPERTY_AARCH64_FEATURE_1_AND \\(BTI, PAC, GCS\\)" { target *-*-linux* } } 
} */
\ No newline at end of file
-- 
2.46.1



[wwwdocs PATCH] style: add IRC link to nav

2024-09-27 Thread Jason Merrill
I noticed that IRC information was difficult to find on the website.  OK?

---
 htdocs/style.mhtml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/htdocs/style.mhtml b/htdocs/style.mhtml
index d015029a..f1aa8214 100644
--- a/htdocs/style.mhtml
+++ b/htdocs/style.mhtml
@@ -67,6 +67,7 @@
   Snapshots
   Mailing lists
   https://gcc.gnu.org/onlinedocs/gcc/Contributors.html";>Contributors
+  https://gcc.gnu.org/wiki/GCConIRC";>IRC
   
   https://twitter.com/gnutools";>
 

[PATCH v1 3/4] aarch64: improve assembly debug comments for build attributes

2024-09-27 Thread Matthieu Longo
The previous implementation to emit build attributes did not support
string values (asciz) in aeabi_subsection, and was not emitting values
associated with tags in the assembly comments.

This new approach provides a more user-friendly interface relying on
typing, and improves the emitted assembly comments:
  * aeabi_attribute:
** Adds the interpreted value next to the tag in the assembly
comment.
** Supports asciz values.
  * aeabi_subsection:
** Adds debug information for its parameters.
** Auto-detects the attribute types when declaring the subsection.
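
The "relying on typing" part can be illustrated with a small overload
set that picks the directive format from the value type.  This is a
sketch under assumptions: format_attribute is a hypothetical name, and
only the idea of dispatching integral values versus C strings is meant
to reflect the approach described above:

```cpp
#include <string>
#include <type_traits>

// Integral values (including enums converted by the caller) are printed
// as unsigned numbers.  SFINAE removes this overload for other types.
template <typename T,
	  typename = typename std::enable_if<std::is_integral<T>::value>::type>
std::string
format_attribute (unsigned tag, T value)
{
  return "\t.aeabi_attribute " + std::to_string (tag) + ", "
	 + std::to_string (static_cast<unsigned> (value));
}

// C strings are printed quoted, matching an asciz-typed attribute.
std::string
format_attribute (unsigned tag, const char *value)
{
  return "\t.aeabi_attribute " + std::to_string (tag) + ", \""
	 + value + "\"";
}
```

The caller never states the attribute type explicitly; the compiler
selects the overload from the argument, which is what lets a subsection
infer whether its attributes are ULEB128 or asciz.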

Additionally, the code was moved to a separate file to improve
modularity and relieve the 1000-line-long aarch64.cc file of a few
lines. Finally, the patch introduces a new namespace "aarch64::" for the
AArch64 backend, which reduces the length of function names by not
prepending 'aarch64_' to each of them.

gcc/ChangeLog:

   * config/aarch64/aarch64.cc
   (aarch64_emit_aeabi_attribute): Delete.
   (aarch64_emit_aeabi_subsection): Delete.
   (aarch64_start_file): Use aeabi_subsection.
   * config/aarch64/aarch64-dwarf-metadata.h: New file.

gcc/testsuite/ChangeLog:

   * gcc.target/aarch64/build-attributes/build-attribute-gcs.c:
 Improve test to match debugging comments in assembly.
   * gcc.target/aarch64/build-attributes/build-attribute-standard.c:
 Likewise.
---
 gcc/config/aarch64/aarch64-dwarf-metadata.h   | 247 ++
 gcc/config/aarch64/aarch64.cc |  43 +--
 .../build-attributes/build-attribute-gcs.c|   4 +-
 .../build-attribute-standard.c|   4 +-
 4 files changed, 261 insertions(+), 37 deletions(-)
 create mode 100644 gcc/config/aarch64/aarch64-dwarf-metadata.h

diff --git a/gcc/config/aarch64/aarch64-dwarf-metadata.h 
b/gcc/config/aarch64/aarch64-dwarf-metadata.h
new file mode 100644
index 000..9638bc7702f
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-dwarf-metadata.h
@@ -0,0 +1,247 @@
+/* Machine description for AArch64 architecture.
+   Copyright (C) 2009-2024 Free Software Foundation, Inc.
+   Contributed by ARM Ltd.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+#ifndef GCC_AARCH64_DWARF_METADATA_H
+#define GCC_AARCH64_DWARF_METADATA_H
+
+#include 
+#include 
+
+#include 
+
+namespace aarch64 {
+
+enum attr_val_type: uint8_t
+{
+  uleb128 = 0x0,
+  asciz = 0x1,
+};
+
+enum BA_TagFeature_t: uint8_t
+{
+  Tag_Feature_BTI = 1,
+  Tag_Feature_PAC = 2,
+  Tag_Feature_GCS = 3,
+};
+
+template 
+struct aeabi_attribute
+{
+  T_tag tag;
+  T_val value;
+};
+
+template 
+aeabi_attribute
+make_aeabi_attribute (T_tag tag, T_val val)
+{
+  return aeabi_attribute{tag, val};
+}
+
+namespace details {
+
+constexpr const char*
+to_c_str (bool b)
+{
+  return b ? "true" : "false";
+}
+
+constexpr const char*
+to_c_str (const char *s)
+{
+  return s;
+}
+
+constexpr const char*
+to_c_str (attr_val_type t)
+{
+  const char *s = nullptr;
+  switch (t) {
+case uleb128:
+  s = "ULEB128";
+  break;
+case asciz:
+  s = "asciz";
+  break;
+  }
+  return s;
+}
+
+constexpr const char*
+to_c_str (BA_TagFeature_t feature)
+{
+  const char *s = nullptr;
+  switch (feature) {
+case Tag_Feature_BTI:
+  s = "Tag_Feature_BTI";
+  break;
+case Tag_Feature_PAC:
+  s = "Tag_Feature_PAC";
+  break;
+case Tag_Feature_GCS:
+  s = "Tag_Feature_GCS";
+  break;
+  }
+  return s;
+}
+
+template <
+  typename T,
+  typename = typename std::enable_if::value, T>::type
+>
+constexpr const char*
+aeabi_attr_str_fmt (T phantom __attribute__((unused)))
+{
+return "\t.aeabi_attribute %u, %u";
+}
+
+constexpr const char*
+aeabi_attr_str_fmt (const char *phantom __attribute__((unused)))
+{
+return "\t.aeabi_attribute %u, \"%s\"";
+}
+
+template <
+  typename T,
+  typename = typename std::enable_if::value, T>::type
+>
+constexpr uint8_t
+aeabi_attr_val_for_fmt (T value)
+{
+  return static_cast(value);
+}
+
+constexpr const char*
+aeabi_attr_val_for_fmt (const char *s)
+{
+  return s;
+}
+
+template 
+void write (FILE *out_file, aeabi_attribute const &attr)
+{
+  asm_fprintf (out_file, aeabi_attr_str_fmt(T_val{}),
+  attr.tag, aeabi_attr_val_for_fmt(attr.value));
+  if (flag_debug_asm)
+asm_fprintf (out_file, "\t%s %s: %s", ASM_COMMENT_START,
+to_c_str(attr.tag), to_

[committed] libstdc++: Fix test FAIL due to -Wpointer-arith

2024-09-27 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

This fixes a FAIL due to a -Wpointer-arith warning when testing with
c++11 or c++14 dialects. As an extension our std::atomic supports
pointer arithmetic in C++11 and C++14, but due to the system header
changes there is now a warning about it. The warning seems reasonable,
so rather than suppress it we should make the test expect it.

While looking into this I decided to simplify some of the code related
to atomic arithmetic.
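
As an illustration of that simplification (a sketch with a toy type, not
libstdc++'s __atomic_base; toy_pointer and bytes_for are hypothetical
names): a helper that does not touch *this can be a single static member
and still be called from both the unqualified and the volatile-qualified
member functions, so no cv-qualified overloads of the helper are needed.

```cpp
#include <cstddef>

// Toy wrapper used only to show the qualifier handling.
template <typename T>
struct toy_pointer
{
  T *p = nullptr;

  // Static: independent of *this, so one definition serves every
  // cv-qualified caller.
  static constexpr std::ptrdiff_t
  type_size (std::ptrdiff_t d)
  { return d * static_cast<std::ptrdiff_t> (sizeof (T)); }

  // Unqualified and volatile callers share the same static helper.
  std::ptrdiff_t bytes_for (std::ptrdiff_t d)
  { return type_size (d); }

  std::ptrdiff_t bytes_for (std::ptrdiff_t d) volatile
  { return type_size (d); }
};
```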

libstdc++-v3/ChangeLog:

* include/bits/atomic_base.h (__atomic_base::_M_type_size):
Replace overloaded functions with static _S_type_size.
* include/std/atomic (atomic): Use is_object_v instead of
is_object.
* testsuite/29_atomics/atomic/operators/pointer_partial_void.cc:
Add dg-warning for -Wpointer-arith warning.
---
 libstdc++-v3/include/bits/atomic_base.h   | 33 +--
 libstdc++-v3/include/std/atomic   | 32 +-
 .../atomic/operators/pointer_partial_void.cc  |  1 +
 3 files changed, 32 insertions(+), 34 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_base.h 
b/libstdc++-v3/include/bits/atomic_base.h
index 7093d0fc822..72cc4bae6cf 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -687,12 +687,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   __pointer_type   _M_p _GLIBCXX20_INIT(nullptr);
 
-  // Factored out to facilitate explicit specialization.
-  constexpr ptrdiff_t
-  _M_type_size(ptrdiff_t __d) const { return __d * sizeof(_PTp); }
-
-  constexpr ptrdiff_t
-  _M_type_size(ptrdiff_t __d) const volatile { return __d * sizeof(_PTp); }
+  static constexpr ptrdiff_t
+  _S_type_size(ptrdiff_t __d)
+  { return __d * sizeof(_PTp); }
 
 public:
   __atomic_base() noexcept = default;
@@ -742,42 +739,42 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   __pointer_type
   operator++() noexcept
-  { return __atomic_add_fetch(&_M_p, _M_type_size(1),
+  { return __atomic_add_fetch(&_M_p, _S_type_size(1),
  int(memory_order_seq_cst)); }
 
   __pointer_type
   operator++() volatile noexcept
-  { return __atomic_add_fetch(&_M_p, _M_type_size(1),
+  { return __atomic_add_fetch(&_M_p, _S_type_size(1),
  int(memory_order_seq_cst)); }
 
   __pointer_type
   operator--() noexcept
-  { return __atomic_sub_fetch(&_M_p, _M_type_size(1),
+  { return __atomic_sub_fetch(&_M_p, _S_type_size(1),
  int(memory_order_seq_cst)); }
 
   __pointer_type
   operator--() volatile noexcept
-  { return __atomic_sub_fetch(&_M_p, _M_type_size(1),
+  { return __atomic_sub_fetch(&_M_p, _S_type_size(1),
  int(memory_order_seq_cst)); }
 
   __pointer_type
   operator+=(ptrdiff_t __d) noexcept
-  { return __atomic_add_fetch(&_M_p, _M_type_size(__d),
+  { return __atomic_add_fetch(&_M_p, _S_type_size(__d),
  int(memory_order_seq_cst)); }
 
   __pointer_type
   operator+=(ptrdiff_t __d) volatile noexcept
-  { return __atomic_add_fetch(&_M_p, _M_type_size(__d),
+  { return __atomic_add_fetch(&_M_p, _S_type_size(__d),
  int(memory_order_seq_cst)); }
 
   __pointer_type
   operator-=(ptrdiff_t __d) noexcept
-  { return __atomic_sub_fetch(&_M_p, _M_type_size(__d),
+  { return __atomic_sub_fetch(&_M_p, _S_type_size(__d),
  int(memory_order_seq_cst)); }
 
   __pointer_type
   operator-=(ptrdiff_t __d) volatile noexcept
-  { return __atomic_sub_fetch(&_M_p, _M_type_size(__d),
+  { return __atomic_sub_fetch(&_M_p, _S_type_size(__d),
  int(memory_order_seq_cst)); }
 
   bool
@@ -932,22 +929,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _GLIBCXX_ALWAYS_INLINE __pointer_type
   fetch_add(ptrdiff_t __d,
memory_order __m = memory_order_seq_cst) noexcept
-  { return __atomic_fetch_add(&_M_p, _M_type_size(__d), int(__m)); }
+  { return __atomic_fetch_add(&_M_p, _S_type_size(__d), int(__m)); }
 
   _GLIBCXX_ALWAYS_INLINE __pointer_type
   fetch_add(ptrdiff_t __d,
memory_order __m = memory_order_seq_cst) volatile noexcept
-  { return __atomic_fetch_add(&_M_p, _M_type_size(__d), int(__m)); }
+  { return __atomic_fetch_add(&_M_p, _S_type_size(__d), int(__m)); }
 
   _GLIBCXX_ALWAYS_INLINE __pointer_type
   fetch_sub(ptrdiff_t __d,
memory_order __m = memory_order_seq_cst) noexcept
-  { return __atomic_fetch_sub(&_M_p, _M_type_size(__d), int(__m)); }
+  { return __atomic_fetch_sub(&_M_p, _S_type_size(__d), int(__m)); }
 
   _GLIBCXX_ALWAYS_INLINE __pointer_type
   fetch_sub(ptrdiff_t __d,
memory_order __m = memory_order_seq

Re: [PATCH 1/2] JSON Dumping of GENERIC trees

2024-09-27 Thread Richard Biener
On Sun, Sep 22, 2024 at 5:49 AM -thor  wrote:
>
> From: thor 
>
> This is the second revision of:
>
>   https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662849.html
>
> I've incorporated the feedback given both by Richard and David - I didn't
> find any memory leaks when testing in valgrind :)
>
> As before: This patch allows the compiler to dump GENERIC trees as JSON 
> objects.
>
> The dump flag -fdump-tree-original-json dumps each fndecl node in the
> C frontend's gimplifier as a JSON object and traverses related nodes
> in a manner analogous to raw-dumping.
>
> Some JSON parsers expect for there to be a single JSON value per file -
> the following shell command makes the output conformant:
>
>   tr -d '\n ' < out.json | sed -e 's/\]\[/,/g' | sed -e 's/}{/},{/g'
>
> The information in the dumped JSON is meant to be an amalgamation of
> tree-pretty-print.cc's dump_generic_node and print-tree.cc's debug_tree.
>
> Bootstrapped and tested on x86_64-pc-linux-gnu without issue (again).

When trying to bootstrap this myself I run into

/home/rguenther/src/gcc/gcc/tree-emit-json.cc: In function
‘std::unique_ptr node_emit_json(tree, dump_info_p)’:
/home/rguenther/src/gcc/gcc/tree-emit-json.cc:1711:20: error:
‘it_args’ may be used uninitialized [-Werror=maybe-uninitialized]
 1711 | delete it_args;
  |^~~
/home/rguenther/src/gcc/gcc/tree-emit-json.cc:1683:27: note: ‘it_args’
was declared here
 1683 | json::object* it_args;
  |   ^~~
cc1plus: all warnings being treated as errors
make[3]: *** [Makefile:1195: tree-emit-json.o] Error 1

the code looks a bit suspicious and I have no idea how object lifetime
is supposed to be:

json::object* it_args;
...
if (arg_node && arg_node != void_list_node && arg_node !=
error_mark_node)
  auto it_args = ::make_unique ();
^^^ this it_args shadows it_args above and immediately goes out of scope?

while (arg_node && arg_node != void_list_node && arg_node
!= error_mark_node)
  {
it_args = node_to_json_brief(arg_node, di);
^^^ the last it_args - which will still be appended to args_holder
gets deleted below
args_holder->append(it_args);
arg_node = TREE_CHAIN (arg_node);
  }
json_obj->set("type_uid", _id);
json_obj->set("args", args_holder);
delete _id;
delete it_args;
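
The shadowing pointed out above can be reproduced in isolation.  The
sketch below is an assumption-laden reduction, not the patch's code: it
uses a hypothetical item type and a std::unique_ptr for the outer
variable so that the example has well-defined behaviour, whereas the
original outer variable is an uninitialized raw pointer.

```cpp
#include <memory>

struct item { int value; };

// Mirrors the suspicious pattern: the declaration in the if body
// introduces a new variable that shadows the outer one and is
// destroyed as soon as the statement ends.
int
shadowed (bool cond)
{
  std::unique_ptr<item> it;
  if (cond)
    auto it = std::make_unique<item> (item{42});
  return it ? it->value : -1;
}

// Assigning to the existing variable is what was presumably intended.
int
assigned (bool cond)
{
  std::unique_ptr<item> it;
  if (cond)
    it = std::make_unique<item> (item{42});
  return it ? it->value : -1;
}
```

In the first function the outer pointer is never set regardless of the
condition, which matches the compiler's complaint that the outer
variable may be used without having been initialized.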

note this is all in code special-casing pointer or reference to
FUNCTION_TYPE while I would have expected that recursion to the
FUNCTION_TYPE handling, which exists as well, would handle it.  This is
probably a case where the emitted JSON is too close to what a human
would expect rather than representing what the raw tree object looks
like?

Indeed when I html-tree a function pointer I get a segfault.  I've
deleted this special case in my tree and it then works just fine and
fixes the bootstrap issue.

Browsing a bit I did notice a lack of dumping of TREE_TYPE of nodes
like VAR_DECLs.  I've added the following to fix that (but didn't look
for the then-duplicate tree_type emissions that could be elided).

diff --git a/gcc/tree-emit-json.cc b/gcc/tree-emit-json.cc
index df97069b922..26a47affd22 100644
--- a/gcc/tree-emit-json.cc
+++ b/gcc/tree-emit-json.cc
@@ -1508,7 +1508,11 @@ node_emit_json(tree t, dump_info_p di)
   if (TYPE_LANG_FLAG_7 (t))
json_obj->set_bool ("type_7", true);
   } //end tcc_type flags
-
+
+  // For nodes with a type output a reference to it
+  if (CODE_CONTAINS_STRUCT (code, TS_TYPED) && TREE_TYPE (t))
+json_obj->set("tree_type", node_to_json_brief(TREE_TYPE(t), di));
+
   // Accessors
   switch (code)
   {

In total I used the attached patch on top of yours for my testing.

Richard.

> gcc/ChangeLog:
> * gcc/Makefile.in: Link tree-emit-json.o to c-gimplify.o
> * gcc/dumpfile.h (dump_flag): New dump flag TDF_JSON
> * gcc/tree-emit-json.cc: Logic for converting a tree to JSON
> and dumping.
> * gcc/tree-emit-json.h: For the above
> gcc/c-family/ChangeLog
> * gcc/c-family/c-gimplify.cc (c_genericize): Hook for
> -fdump-tree-original-json
>
> Signed-off-by: Thor C Preimesberger 
> ---
>  gcc/Makefile.in|2 +
>  gcc/c-family/c-gimplify.cc |   29 +-
>  gcc/dumpfile.cc|1 +
>  gcc/dumpfile.h |6 +
>  gcc/tree-emit-json.cc  | 3227 
>  gcc/tree-emit-json.h   |   73 +
>  6 files changed, 3327 insertions(+), 11 deletions(-)
>  create mode 100644 gcc/tree-emit-json.cc
>  create mode 100644 gcc/tree-emit-json.h
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 68fda1a7591..b65cc7f0ad5 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1042,6 +1042,7 @@ OPTS_H = $(INPUT_H) $(VEC_H) opts.h $(OBSTACK_H)
>  SYMTAB_H = $(srcdir)/../libcpp/include/symtab.h $(OBSTACK_H)
>  CPP_INTER

Re: [PATCH] arm: Fix missed CE optimization for armv8.1-m.main [PR 116444]

2024-09-27 Thread Andre Vieira (lists)




On 26/09/2024 18:56, Ramana Radhakrishnan wrote:





  +/* Helper function to determine whether SEQ represents a sequence of
+   instructions representing the Armv8.1-M Mainline conditional arithmetic
+   instructions: csinc, csneg and csinv. The cinc instruction is generated
+   using a different mechanism.  */
+
+static bool
+arm_is_v81m_cond_insn (rtx_insn *seq)
+{
+  rtx_insn *curr_insn = seq;
+  rtx set;
+  /* The pattern may start with a simple set with register operands.  Skip
+ through any of those.  */
+  while (curr_insn)
+{
+  set = single_set (curr_insn);
+  if (!set
+   || !REG_P (SET_DEST (set)))
+ return false;
+
+  if (!REG_P (SET_SRC (set)))
+ break;
+  curr_insn = NEXT_INSN (curr_insn);


Too late at night for me - but don’t you want to skip DEBUG_INSNS in some way 
here ?



It's a good point, but this sequence is created by noce as a potential 
replacement for the incoming one and no debug insns are inserted here.


Compiling gcc/gcc/testsuite/gcc.target/arm/csinv-1.c with -g3 for an 
Armv8.1-M Mainline target still generates the csinv.


Either way, I could add code to skip if we don't have a NONDEBUG_INSN_P, 
but that means we should also do so after every NEXT_INSN after the 
while loop and at the end.  It does feel 'more robust', but I also fear 
it might be a bit overkill here?


Kind regards,
Andre




[PATCH] tree-optimization/116785 - relax volatile handling in PTA

2024-09-27 Thread Richard Biener
When there are volatile-qualified stores we do not have to treat the
destination as pointing to ANYTHING.  It's only when reading from
it that we want to treat the resulting pointers as pointing to ANYTHING.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

PR tree-optimization/116785
* tree-ssa-structalias.cc (get_constraint_for_1): Only
volatile qualified reads produce ANYTHING.
---
 gcc/tree-ssa-structalias.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-ssa-structalias.cc b/gcc/tree-ssa-structalias.cc
index d6a53f801f0..54c4818998d 100644
--- a/gcc/tree-ssa-structalias.cc
+++ b/gcc/tree-ssa-structalias.cc
@@ -3646,7 +3646,7 @@ get_constraint_for_1 (tree t, vec *results, bool 
address_p,
   }
 case tcc_reference:
   {
-   if (TREE_THIS_VOLATILE (t))
+   if (!lhs_p && TREE_THIS_VOLATILE (t))
  /* Fall back to anything.  */
  break;
 
@@ -3751,7 +3751,7 @@ get_constraint_for_1 (tree t, vec *results, bool 
address_p,
   }
 case tcc_declaration:
   {
-   if (VAR_P (t) && TREE_THIS_VOLATILE (t))
+   if (!lhs_p && VAR_P (t) && TREE_THIS_VOLATILE (t))
  /* Fall back to anything.  */
  break;
get_constraint_for_ssa_var (t, results, address_p);
-- 
2.43.0


Re: [PATCH v2 4/6] c++/modules: Check linkage for exported declarations

2024-09-27 Thread Jason Merrill

On 9/27/24 1:59 AM, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu and aarch64-unknown-linux-gnu,
OK for trunk?

-- >8 --

By [module.interface] p3, if an exported declaration is not within a
header unit, it shall not declare a name with internal linkage.

Unfortunately we cannot just do this within set_originating_module,
since at the locations where it's called the linkage of declarations is
not always fully determined yet.  We could move the calls but this causes
the checking assertion to fail as the originating module declaration may
have moved, and in general for some kinds of declarations it's not
always obvious where it should be moved to.

This patch instead introduces a new function to check that the linkage
of a declaration within a module is correct, to be called for all
declarations once their linkage is fully determined.

As a drive-by fix this patch also improves the source location of
namespace aliases to point at the identifier rather than the terminating
semicolon.

@@ -19926,11 +19926,34 @@ set_originating_module (tree decl, bool friend_p ATTRIBUTE_UNUSED)
DECL_MODULE_ATTACH_P (decl) = true;
  }
  
-  if (!module_exporting_p ())

+  /* It is illegal to export a declaration with internal linkage.  However, at


s/illegal/ill-formed/


@@ -9249,8 +9251,13 @@ push_namespace (tree name, bool make_inline)
  if (TREE_PUBLIC (ns))
DECL_MODULE_EXPORT_P (ns) = true;
  else if (!header_module_p ())
-   error_at (input_location,
- "exporting namespace with internal linkage");
+   {
+ if (name)
+   error_at (input_location,
+ "exporting namespace %qD with internal linkage", ns);


Since a namespace can only have internal linkage if it's within an 
unnamed namespace, maybe mention that?



+ else
+   error_at (input_location, "exporting anonymous namespace");


Let's move away from using the word "anonymous" for unnamed namespaces.

Jason



Re: [PATCH v2 5/6] c++/modules: Validate external linkage definitions in header units [PR116401]

2024-09-27 Thread Jason Merrill

On 9/27/24 1:59 AM, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu and aarch64-unknown-linux-gnu,
OK for trunk?

-- >8 --

[module.import] p6 says "A header unit shall not contain a definition of
a non-inline function or variable whose name has external linkage."

This patch implements this requirement, and cleans up some issues in the
testsuite where this was already violated.  To handle deduction guides
we mark them as inline, since although we give them a definition for
implementation reasons, by the standard they have no definition, and so
we should not error in this case.


Please mention that in a comment.  OK with that change.


PR c++/116401

gcc/cp/ChangeLog:

* decl.cc (grokfndecl): Mark deduction guides as 'inline'.
* module.cc (check_module_decl_linkage): Implement checks for
non-inline external linkage definitions in headers.

gcc/testsuite/ChangeLog:

* g++.dg/modules/macro-4_c.H: Add missing 'inline'.
* g++.dg/modules/pr106761.h: Likewise.
* g++.dg/modules/pr98843_b.H: Likewise.
* g++.dg/modules/pr99468.H: Likewise.
* g++.dg/modules/pragma-1_a.H: Likewise.
* g++.dg/modules/tpl-ary-1.h: Likewise.
* g++.dg/modules/hdr-2.H: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/decl.cc|   1 +
  gcc/cp/module.cc  |  18 +++
  gcc/testsuite/g++.dg/modules/hdr-2.H  | 172 ++
  gcc/testsuite/g++.dg/modules/macro-4_c.H  |   2 +-
  gcc/testsuite/g++.dg/modules/pr106761.h   |   2 +-
  gcc/testsuite/g++.dg/modules/pr98843_b.H  |   2 +-
  gcc/testsuite/g++.dg/modules/pr99468.H|   2 +-
  gcc/testsuite/g++.dg/modules/pragma-1_a.H |   2 +-
  gcc/testsuite/g++.dg/modules/tpl-ary-1.h  |   2 +-
  9 files changed, 197 insertions(+), 6 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/modules/hdr-2.H

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 6a7ba416cf8..5ddb7eafa50 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -10823,6 +10823,7 @@ grokfndecl (tree ctype,
 have one: the restriction that you can't repeat a deduction guide
 makes them more like a definition anyway.  */
DECL_INITIAL (decl) = void_node;
+  DECL_DECLARED_INLINE_P (decl) = true;
break;
  default:
break;
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index a4343044d1a..b4030c62eec 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -19943,6 +19943,24 @@ check_module_decl_linkage (tree decl)
if (!module_has_cmi_p ())
  return;
  
+  /* A header unit shall not contain a definition of a non-inline function

+ or variable (not template) whose name has external linkage.  */
+  if (header_module_p ()
+  && !processing_template_decl
+  && ((TREE_CODE (decl) == FUNCTION_DECL
+  && !DECL_DECLARED_INLINE_P (decl)
+  && DECL_INITIAL (decl))
+ || (TREE_CODE (decl) == VAR_DECL
+ && !DECL_INLINE_VAR_P (decl)
+ && DECL_INITIALIZED_P (decl)
+ && !DECL_IN_AGGR_P (decl)))
+  && !(DECL_LANG_SPECIFIC (decl)
+  && DECL_TEMPLATE_INSTANTIATION (decl))
+  && decl_linkage (decl) == lk_external)
+error_at (DECL_SOURCE_LOCATION (decl),
+ "external linkage definition of %qD in header module must "
+ "be declared %<inline%>", decl);
+
/* An internal-linkage declaration cannot be generally be exported.
   But it's OK to export any declaration from a header unit, including
   internal linkage declarations.  */
diff --git a/gcc/testsuite/g++.dg/modules/hdr-2.H b/gcc/testsuite/g++.dg/modules/hdr-2.H
new file mode 100644
index 000..097546d5667
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/hdr-2.H
@@ -0,0 +1,172 @@
+// { dg-additional-options "-fmodule-header" }
+// { dg-module-cmi !{} }
+// external linkage variables or functions in header units must
+// not have non-inline definitions
+
+int x_err;  // { dg-error "external linkage definition" }
+int y_err = 123;  // { dg-error "external linkage definition" }
+void f_err() {}  // { dg-error "external linkage definition" }
+
+struct Err {
+  Err();
+  void m();
+  static void s();
+  static int x;
+  static int y;
+};
+Err::Err() = default;  // { dg-error "external linkage definition" }
+void Err::m() {}  // { dg-error "external linkage definition" }
+void Err::s() {}  // { dg-error "external linkage definition" }
+int Err::x;  // { dg-error "external linkage definition" }
+int Err::y = 123;  // { dg-error "external linkage definition" }
+
+// No definition, OK
+extern int y_decl;
+void f_decl();
+
+template  struct DeductionGuide {};
+DeductionGuide() -> DeductionGuide;
+
+struct NoDefStatics {
+  enum E { V };
+  static const int x = 123;
+  static const E e = V;
+};
+
+// But these have definitions again (though the error locations aren't great)
+struct YesDefStatics {
+  enum E { V };
+  static const int x = 123;  // { dg-error "e

Re: [PATCH v2 6/6] c++/modules: Add testcase for standard-library exposures [PR115126]

2024-09-27 Thread Jason Merrill

On 9/27/24 2:00 AM, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu and aarch64-unknown-linux-gnu,
OK for trunk?

-- >8 --

This adds a new xtreme-header testcase to ensure that we have no
regressions with regard to exposures of TU-local declarations
in the standard library header files.

A more restrictive test would be to do 'export extern "C++"' here, but
unfortunately the system headers on some targets declare TU-local
entities, so we'll make do with checking that at least the C++ standard
library headers don't refer to such entities.


Looks fine, please squash this with the patch that adds the warning.


PR c++/115126

gcc/testsuite/ChangeLog:

* g++.dg/modules/xtreme-header-8.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/testsuite/g++.dg/modules/xtreme-header-8.C | 8 
  1 file changed, 8 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/modules/xtreme-header-8.C

diff --git a/gcc/testsuite/g++.dg/modules/xtreme-header-8.C b/gcc/testsuite/g++.dg/modules/xtreme-header-8.C
new file mode 100644
index 000..9da4e01cc68
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/xtreme-header-8.C
@@ -0,0 +1,8 @@
+// PR c++/115126
+// { dg-additional-options "-fmodules-ts -Wignored-exposures" }
+// { dg-module-cmi xstd }
+
+export module xstd;
+extern "C++" {
+  #include "xtreme-header.h"
+}




Re: [PATCH] diagnostic: Save/restore diagnostic context history and push/pop state for PCH [PR116847]

2024-09-27 Thread Lewis Hyatt
On Fri, Sep 27, 2024 at 10:26 AM Jakub Jelinek  wrote:
>
> On Fri, Sep 27, 2024 at 09:54:13AM -0400, Lewis Hyatt wrote:
> > A couple comments that may be helpful...
> >
> > -This is also PR 64117 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64117)
> >
> > -I submitted a patch last year for that but did not get any response
> > (https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635648.html).
>
> Oops, sorry, wasn't aware of that.
> Note, I've already committed it.
>
> > I guess I never pinged it because I am still trying to ping two other
> > ones :). My patch did not switch to vec so it was not as nice as this
> > one. I wonder though, if some of the testcases I added could be
> > incorporated? In particular the testcase from my patch
>
> Maybe.
>
> > pragma-diagnostic-3.C I believe will still be failing after this one.
> > There is an issue with C++ because it processes the pragmas twice,
> > once in early mode and once in normal mode, that makes it do the wrong
> > thing for this case:
> >
> > t.h:
> > 
> >  #pragma GCC diagnostic push
> >  #pragma GCC diagnostic ignored...
> >  //no pop at end of the file
> >
> > t.c
> > 
> >  #include "t.h"
> >  #pragma GCC diagnostic pop
> >  //expect to be at the initial state here, but won't be if t.h is a PCH
> >
> > In my patch I had separated the PCH restore from a more general "state
> > restore" logic so that the C++ frontend can restore the state after
> > the first pass through the data.
>
> People shouldn't be doing this, without PCH or with it, and especially not
> with PCH, that is simply too ugly.
> That said, this was the reason why I have saved also the m_push_list
> vector and not just the history.  If that isn't enough and there is some
> other state on the libcpp side, I'd think we should go with a sorry and tell
> the user not to do this with PCH.  It becomes a nightmare already if e.g.
> the command line -Werror=something -Wno-error=somethingelse overrides
> differ between the PCH creation and PCH reading (my first version of
> the patch saved m_classify_diagnostic array but that broke tests relying
> on those differences).
>
> Jakub
>

Definitely agreed that people shouldn't be doing this :). I think I
felt like I should support it only because it was one of my patches
that caused the C++ frontend to process the pragmas twice, but yeah I can
see how it's probably not worth the complexity.
One other note is that there have also been requests for the diagnostic
pragmas to make it through the LTO streaming process so that they can
be suppressed in the LTO frontend as well. That would also require
some more general support for saving the state, but it's another
topic.

-Lewis


[committed] libstdc++: Fix more pedwarns in headers for C++98

2024-09-27 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

Some tests, e.g. 17_intro/headers/c++1998/all_pedantic_errors.cc, FAIL
with GLIBCXX_TESTSUITE_STDS=98 due to numerous C++11 extensions still in
use in the library headers. The recent changes to not make them system
headers mean we get warnings now.

This change adds more diagnostic pragmas to suppress those warnings.
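
The pattern used throughout the patch is a push/ignored/pop bracket placed around just the construct that would otherwise warn. A minimal stand-alone sketch follows; the function name is hypothetical and not part of libstdc++.

```cpp
// Hypothetical helper, not part of libstdc++.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wlong-long"
inline unsigned long long widen(unsigned int n)
{
  // 'long long' is not a C++98 type; in a pedantic C++98 build its use
  // would be diagnosed without the surrounding pragmas.
  unsigned long long wide = n;
  return wide;
}
#pragma GCC diagnostic pop
```

Keeping the bracket tight means code after the pop is diagnosed with the user's own settings again.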

libstdc++-v3/ChangeLog:

* include/bits/istream.tcc: Add diagnostic pragmas around uses
of long long and extern template.
* include/bits/locale_facets.h: Likewise.
* include/bits/locale_facets.tcc: Likewise.
* include/bits/locale_facets_nonio.tcc: Likewise.
* include/bits/ostream.tcc: Likewise.
* include/bits/stl_algobase.h: Likewise.
* include/c_global/cstdlib: Likewise.
* include/ext/pb_ds/detail/resize_policy/hash_prime_size_policy_imp.hpp:
Likewise.
* include/ext/pointer.h: Likewise.
* include/ext/stdio_sync_filebuf.h: Likewise.
* include/std/istream: Likewise.
* include/std/ostream: Likewise.
* include/tr1/cmath: Likewise.
* include/tr1/type_traits: Likewise.
* include/tr1/functional_hash.h: Likewise. Remove semi-colons
at namespace scope that aren't needed after macro expansion.
* include/tr1/tuple: Remove semi-colon at namespace scope.
* include/bits/vector.tcc: Change LL suffix to just L.
---
 libstdc++-v3/include/bits/istream.tcc | 10 ++
 libstdc++-v3/include/bits/locale_facets.h | 12 +++
 libstdc++-v3/include/bits/locale_facets.tcc   |  6 
 .../include/bits/locale_facets_nonio.tcc  |  4 +++
 libstdc++-v3/include/bits/ostream.tcc |  6 
 libstdc++-v3/include/bits/stl_algobase.h  | 10 ++
 libstdc++-v3/include/bits/vector.tcc  |  2 +-
 libstdc++-v3/include/c_global/cstdlib |  3 ++
 .../hash_prime_size_policy_imp.hpp|  3 ++
 libstdc++-v3/include/ext/pointer.h|  3 ++
 libstdc++-v3/include/ext/stdio_sync_filebuf.h |  3 ++
 libstdc++-v3/include/std/istream  |  3 ++
 libstdc++-v3/include/std/ostream  |  3 ++
 libstdc++-v3/include/tr1/cmath|  4 +++
 libstdc++-v3/include/tr1/functional_hash.h| 32 +++
 libstdc++-v3/include/tr1/tuple|  2 +-
 libstdc++-v3/include/tr1/type_traits  |  6 
 17 files changed, 97 insertions(+), 15 deletions(-)

diff --git a/libstdc++-v3/include/bits/istream.tcc b/libstdc++-v3/include/bits/istream.tcc
index e8957fd2c3b..f96d2d4a353 100644
--- a/libstdc++-v3/include/bits/istream.tcc
+++ b/libstdc++-v3/include/bits/istream.tcc
@@ -397,7 +397,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __streambuf_type* __this_sb = this->rdbuf();
  int_type __c = __this_sb->sgetc();
  char_type __c2 = traits_type::to_char_type(__c);
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wlong-long"
  unsigned long long __gcount = 0;
+#pragma GCC diagnostic pop
 
  while (!traits_type::eq_int_type(__c, __eof)
 && !traits_type::eq_int_type(__c, __idelim)
@@ -1122,6 +1125,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Inhibit implicit instantiations for required instantiations,
   // which are defined via explicit instantiations elsewhere.
 #if _GLIBCXX_EXTERN_TEMPLATE
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wc++11-extensions" // extern template
+#pragma GCC diagnostic ignored "-Wlong-long"
   extern template class basic_istream;
   extern template istream& ws(istream&);
   extern template istream& operator>>(istream&, char&);
@@ -1134,8 +1140,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   extern template istream& istream::_M_extract(unsigned long&);
   extern template istream& istream::_M_extract(bool&);
 #ifdef _GLIBCXX_USE_LONG_LONG
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wlong-long"
   extern template istream& istream::_M_extract(long long&);
   extern template istream& istream::_M_extract(unsigned long long&);
+#pragma GCC diagnostic pop
 #endif
   extern template istream& istream::_M_extract(float&);
   extern template istream& istream::_M_extract(double&);
@@ -1166,6 +1175,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   extern template class basic_iostream;
 #endif
+#pragma GCC diagnostic pop
 #endif
 
 _GLIBCXX_END_NAMESPACE_VERSION
diff --git a/libstdc++-v3/include/bits/locale_facets.h b/libstdc++-v3/include/bits/locale_facets.h
index afa239ad96a..0daffc86de2 100644
--- a/libstdc++-v3/include/bits/locale_facets.h
+++ b/libstdc++-v3/include/bits/locale_facets.h
@@ -2063,6 +2063,8 @@ _GLIBCXX_BEGIN_NAMESPACE_LDBL
   { return this->do_get(__in, __end, __io, __err, __v); }
 
 #ifdef _GLIBCXX_USE_LONG_LONG
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wlong-long"
   iter_type
   get(iter_type __in, iter_type __end, ios_base& __io,
  ios_base::iostate& __err

[committed] libstdc++: Refactor experimental::filesystem::path string conversions

2024-09-27 Thread Jonathan Wakely
Tested x86_64-linux and x86_64-w64-mingw32 (just the
experimental::filesystem parts though).

Pushed to trunk.

-- >8 --

I noticed a -Wc++17-extensions warning due to use of if-constexpr in
std::experimental::filesystem::path, which was not protected by
diagnostic pragmas to disable the warning.

While adding the pragmas I noticed that other places in the same file
use tag dispatching and multiple overloads instead of if-constexpr.
Since we're already using it in that file, we might as well just use it
everywhere.
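
For readers unfamiliar with the two styles being contrasted, here is a small self-contained sketch. The function names are invented for illustration and are unrelated to the real path::_Cvt members; it needs C++17 for the if-constexpr half.

```cpp
#include <string>
#include <type_traits>

// Tag dispatching: one overload per case, selected by a tag argument.
inline std::string kind_impl(std::true_type)  { return "char"; }
inline std::string kind_impl(std::false_type) { return "other"; }

template<typename CharT>
std::string kind_by_tag()
{
  // is_same<...>::type is either std::true_type or std::false_type.
  return kind_impl(typename std::is_same<CharT, char>::type());
}

// if-constexpr: a single function; the branch not taken is discarded
// at compile time for each instantiation.
template<typename CharT>
std::string kind_by_if_constexpr()
{
  if constexpr (std::is_same<CharT, char>::value)
    return "char";
  else
    return "other";
}
```

Both forms select the same code path at compile time. The if-constexpr version keeps the cases in one place, which is the kind of consolidation the refactoring aims for.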

libstdc++-v3/ChangeLog:

* include/experimental/bits/fs_path.h (path::_Cvt): Refactor to
use if-constexpr.
(path::string(const Allocator&)): Likewise.
---
 .../include/experimental/bits/fs_path.h   | 139 +++---
 1 file changed, 54 insertions(+), 85 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/fs_path.h b/libstdc++-v3/include/experimental/bits/fs_path.h
index 5008e26af8d..a504aa2492c 100644
--- a/libstdc++-v3/include/experimental/bits/fs_path.h
+++ b/libstdc++-v3/include/experimental/bits/fs_path.h
@@ -775,60 +775,38 @@ namespace __detail
  __codecvt_utf8_to_wchar,
  __codecvt_utf8_to_utfNN>;
 
+
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wc++17-extensions" // if constexpr
+  static string_type
+  _S_convert(const _CharT* __f, const _CharT* __l)
+  {
 #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
-#ifdef _GLIBCXX_USE_CHAR8_T
-  static string_type
-  _S_wconvert(const char8_t* __f, const char8_t* __l, const char8_t*)
-  {
-   const char* __f2 = (const char*)__f;
-   const char* __l2 = (const char*)__l;
-   std::wstring __wstr;
-   std::codecvt_utf8_utf16 __wcvt;
-   if (__str_codecvt_in_all(__f2, __l2, __wstr, __wcvt))
- return __wstr;
-  }
-#endif
-
-  static string_type
-  _S_wconvert(const char* __f, const char* __l, const char*)
-  {
-   std::codecvt_utf8_utf16 __cvt;
-   std::wstring __wstr;
-   if (__str_codecvt_in_all(__f, __l, __wstr, __cvt))
-   return __wstr;
-   _GLIBCXX_THROW_OR_ABORT(filesystem_error(
- "Cannot convert character sequence",
- std::make_error_code(errc::illegal_byte_sequence)));
-  }
-
-  static string_type
-  _S_wconvert(const _CharT* __f, const _CharT* __l, const void*)
-  {
-   __codecvt_utf8_to_wide __cvt;
-   std::string __str;
-   if (__str_codecvt_out_all(__f, __l, __str, __cvt))
+   if constexpr (is_same<_CharT, char>::value)
  {
-   const char* __f2 = __str.data();
-   const char* __l2 = __f2 + __str.size();
-   std::codecvt_utf8_utf16 __wcvt;
+   std::codecvt_utf8_utf16 __cvt;
std::wstring __wstr;
-   if (__str_codecvt_in_all(__f2, __l2, __wstr, __wcvt))
+   // Convert char (assumed to be UTF-8) to wchar_t (UTF-16).
+   if (__str_codecvt_in_all(__f, __l, __wstr, __cvt))
  return __wstr;
  }
+#ifdef _GLIBCXX_USE_CHAR8_T
+   else if constexpr (is_same<_CharT, char8_t>::value)
+ return _S_convert((const char*)__f, (const char*)__l);
+#endif
+   else
+ {
+   // Convert from _CharT to char first:
+   __codecvt_utf8_to_utfNN __cvt;
+   std::string __str;
+   if (__str_codecvt_out_all(__f, __l, __str, __cvt))
+ // Then convert char to wchar_t:
+ return _S_convert(__str.c_str(), __str.c_str() + __str.size());
+ }
_GLIBCXX_THROW_OR_ABORT(filesystem_error(
  "Cannot convert character sequence",
  std::make_error_code(errc::illegal_byte_sequence)));
-  }
-
-  static string_type
-  _S_convert(const _CharT* __f, const _CharT* __l)
-  {
-   return _S_wconvert(__f, __l, (const _CharT*)nullptr);
-  }
-#else
-  static string_type
-  _S_convert(const _CharT* __f, const _CharT* __l)
-  {
+#else // ! WINDOWS
 #ifdef _GLIBCXX_USE_CHAR8_T
if constexpr (is_same<_CharT, char8_t>::value)
  return string_type(__f, __l);
@@ -843,8 +821,9 @@ namespace __detail
  "Cannot convert character sequence",
  std::make_error_code(errc::illegal_byte_sequence)));
  }
+#endif // ! WINDOWS
   }
-#endif
+#pragma GCC diagnostic pop
 
   static string_type
   _S_convert(_CharT* __f, _CharT* __l)
@@ -1038,66 +1017,55 @@ namespace __detail
 std::swap(_M_type, __rhs._M_type);
   }
 
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wc++17-extensions" // if constexpr
   template
 inline std::basic_string<_CharT, _Traits, _Allocator>
 path::string(const _Allocator& __a) const
 {
-  if _GLIBCXX17_CONSTEXPR (is_same<_CharT, value_type>::value)
-   return { _M_pathname.begin(), _M_pathname.end(), __a };
-
   using _WString = basic_string<_CharT, _Traits, _Allocator>;
 
   const value_typ

Re: [PATCH v2 4/4] tree-object-size: Fall back to wholesize for non-const offset

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 20, 2024 at 12:40:29PM -0400, Siddhesh Poyarekar wrote:
> Don't bail out early if the offset to a pointer in __builtin_object_size
> is a variable; return the wholesize instead, since that is a better
> fallback for the maximum estimate.  This should keep checks in place for
> fortified functions to constrain overflows to at least some extent.
> 
> gcc/ChangeLog:
> 
>   PR middle-end/77608
>   * tree-object-size.cc (plus_stmt_object_size): Drop check for
>   constant offset.
>   * testsuite/gcc.dg/builtin-object-size-1.c (test14): New test.
> 
> Signed-off-by: Siddhesh Poyarekar 
> ---
>  gcc/testsuite/gcc.dg/builtin-object-size-1.c | 19 +++
>  gcc/tree-object-size.cc  |  7 ---
>  2 files changed, 23 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-1.c 
> b/gcc/testsuite/gcc.dg/builtin-object-size-1.c
> index 6ffe12be683..5a24087ae1e 100644
> --- a/gcc/testsuite/gcc.dg/builtin-object-size-1.c
> +++ b/gcc/testsuite/gcc.dg/builtin-object-size-1.c
> @@ -791,6 +791,25 @@ test13 (unsigned cond)
>  #endif
>  }
>  
> +void
> +__attribute__ ((noinline))
> +test14 (unsigned off)
> +{
> +  char *buf2 = malloc (10);
> +  char *p;
> +  size_t t;
> +
> +  p = &buf2[off];
> +
> +#ifdef __builtin_object_size
> +  if (__builtin_object_size (p, 0) != 10 - off)
> +FAIL ();
> +#else
> +  if (__builtin_object_size (p, 0) != 10)
> +FAIL ();
> +#endif
> +}
> +
>  int
>  main (void)
>  {
> diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
> index 1b569c3d12b..ebd2a4650aa 100644
> --- a/gcc/tree-object-size.cc
> +++ b/gcc/tree-object-size.cc
> @@ -1595,8 +1595,7 @@ plus_stmt_object_size (struct object_size_info *osi, 
> tree var, gimple *stmt)
>  op1 = try_collapsing_offset (op1, NULL_TREE, NOP_EXPR, object_size_type);
>  
>/* Handle PTR + OFFSET here.  */
> -  if (size_valid_p (op1, object_size_type)
> -  && (TREE_CODE (op0) == SSA_NAME || TREE_CODE (op0) == ADDR_EXPR))
> +  if ((TREE_CODE (op0) == SSA_NAME || TREE_CODE (op0) == ADDR_EXPR))
>  {
>if (TREE_CODE (op0) == SSA_NAME)
>   {
> @@ -1621,7 +1620,9 @@ plus_stmt_object_size (struct object_size_info *osi, 
> tree var, gimple *stmt)
>if (size_unknown_p (bytes, 0))
>   ;
>else if ((object_size_type & OST_DYNAMIC)
> -|| bytes != wholesize || compare_tree_int (op1, offset_limit) <= 
> 0)
> +|| bytes != wholesize
> +|| (size_valid_p (op1, object_size_type)
> +&& compare_tree_int (op1, offset_limit) <= 0))
>   bytes = size_for_offset (bytes, op1, wholesize);
>/* In the static case, with a negative offset, the best estimate for
>minimum size is size_unknown but for maximum size, the wholesize is a

LGTM.

Jakub
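
The case the quoted patch targets can be sketched outside the testsuite as follows. The helper name and the static buffer are assumptions made for this illustration (test14 in the patch uses malloc instead). The value returned depends on the compiler version and optimization level: it may be unknown ((size_t)-1), the whole size (10), or the exact remainder if the offset becomes known after inlining.

```cpp
#include <cstddef>

// Hypothetical helper: maximum-estimate object size (mode 0) for a
// pointer formed with a non-constant offset into a 10-byte buffer.
inline std::size_t remaining_estimate(unsigned off)
{
  static char buf[10];
  char *p = &buf[off];
  // Mode 0 is an upper bound, so for an in-range offset the result is
  // never smaller than the bytes actually remaining after 'p'.
  return __builtin_object_size(p, 0);
}
```

Falling back to the whole size rather than "unknown" is what lets fortified functions still reject writes that exceed the full object.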



[committed] libstdc++: Fix test FAILs due to -Wreturn-local-addr

2024-09-27 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

This fixes two FAILs due to -Wreturn-local-addr warnings when testing with
c++11 or c++14 dialects.

libstdc++-v3/ChangeLog:

* testsuite/20_util/bind/dangling_ref.cc: Add an additional
dg-warning for -Wreturn-local-addr warning.
* testsuite/30_threads/packaged_task/cons/dangling_ref.cc:
Likewise.
---
 libstdc++-v3/testsuite/20_util/bind/dangling_ref.cc  | 1 +
 .../testsuite/30_threads/packaged_task/cons/dangling_ref.cc  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/libstdc++-v3/testsuite/20_util/bind/dangling_ref.cc b/libstdc++-v3/testsuite/20_util/bind/dangling_ref.cc
index 70393e4392f..17e7b21c45c 100644
--- a/libstdc++-v3/testsuite/20_util/bind/dangling_ref.cc
+++ b/libstdc++-v3/testsuite/20_util/bind/dangling_ref.cc
@@ -5,5 +5,6 @@ int f();
 auto b = std::bind(f);
 int i = b(); // { dg-error "here" "" { target { c++14_down } } }
 // { dg-error "dangling reference" "" { target { c++14_down } } 0 }
+// { dg-error "reference to temporary" "" { target { c++14_down } } 0 }
 // { dg-error "no matching function" "" { target c++17 } 0 }
 // { dg-error "enable_if" "" { target c++17 } 0 }
diff --git a/libstdc++-v3/testsuite/30_threads/packaged_task/cons/dangling_ref.cc b/libstdc++-v3/testsuite/30_threads/packaged_task/cons/dangling_ref.cc
index e9edb5edc8b..225b65fe6a7 100644
--- a/libstdc++-v3/testsuite/30_threads/packaged_task/cons/dangling_ref.cc
+++ b/libstdc++-v3/testsuite/30_threads/packaged_task/cons/dangling_ref.cc
@@ -7,5 +7,6 @@
 int f();
 std::packaged_task task(f);
 // { dg-error "dangling reference" "" { target { c++14_down } } 0 }
+// { dg-error "reference to temporary" "" { target { c++14_down } } 0 }
 // { dg-error "no matching function" "" { target c++17 } 0 }
 // { dg-error "enable_if" "" { target c++17 } 0 }
-- 
2.46.1



Re: [PATCH v2 1/4] tree-object-size: use size_for_offset in more cases

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 20, 2024 at 12:40:26PM -0400, Siddhesh Poyarekar wrote:
> When wholesize != size, there is a reasonable opportunity for static
> object sizes also to be computed using size_for_offset, so use that.
> 
> gcc/ChangeLog:
> 
>   * tree-object-size.cc (plus_stmt_object_size): Call
>   SIZE_FOR_OFFSET for some negative offset cases.
>   * testsuite/gcc.dg/builtin-object-size-3.c (test9): Adjust test.
>   * testsuite/gcc.dg/builtin-object-size-4.c (test8): Likewise.
> ---
>  gcc/testsuite/gcc.dg/builtin-object-size-3.c | 6 +++---
>  gcc/testsuite/gcc.dg/builtin-object-size-4.c | 6 +++---
>  gcc/tree-object-size.cc  | 2 +-
>  3 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-3.c 
> b/gcc/testsuite/gcc.dg/builtin-object-size-3.c
> index 3f58da3d500..ec2c62c9640 100644
> --- a/gcc/testsuite/gcc.dg/builtin-object-size-3.c
> +++ b/gcc/testsuite/gcc.dg/builtin-object-size-3.c
> @@ -574,7 +574,7 @@ test9 (unsigned cond)
>if (__builtin_object_size (&p[-4], 2) != (cond ? 6 : 10))
>  FAIL ();
>  #else
> -  if (__builtin_object_size (&p[-4], 2) != 0)
> +  if (__builtin_object_size (&p[-4], 2) != 6)
>  FAIL ();
>  #endif
>  
> @@ -585,7 +585,7 @@ test9 (unsigned cond)
>if (__builtin_object_size (p, 2) != ((cond ? 2 : 6) + cond))
>  FAIL ();
>  #else
> -  if (__builtin_object_size (p, 2) != 0)
> +  if (__builtin_object_size (p, 2) != 2)
>  FAIL ();
>  #endif
>  
> @@ -598,7 +598,7 @@ test9 (unsigned cond)
>!= sizeof (y) - __builtin_offsetof (struct A, c) - 8 + cond)
>  FAIL ();
>  #else
> -  if (__builtin_object_size (p, 2) != 0)
> +  if (__builtin_object_size (p, 2) != sizeof (y) - __builtin_offsetof 
> (struct A, c) - 8)
>  FAIL ();
>  #endif
>  }
> diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-4.c 
> b/gcc/testsuite/gcc.dg/builtin-object-size-4.c
> index b3eb36efb74..7bcd24c4150 100644
> --- a/gcc/testsuite/gcc.dg/builtin-object-size-4.c
> +++ b/gcc/testsuite/gcc.dg/builtin-object-size-4.c
> @@ -482,7 +482,7 @@ test8 (unsigned cond)
>if (__builtin_object_size (&p[-4], 3) != (cond ? 6 : 10))
>  FAIL ();
>  #else
> -  if (__builtin_object_size (&p[-4], 3) != 0)
> +  if (__builtin_object_size (&p[-4], 3) != 6)
>  FAIL ();
>  #endif
>  
> @@ -493,7 +493,7 @@ test8 (unsigned cond)
>if (__builtin_object_size (p, 3) != ((cond ? 2 : 6) + cond))
>  FAIL ();
>  #else
> -  if (__builtin_object_size (p, 3) != 0)
> +  if (__builtin_object_size (p, 3) != 2)
>  FAIL ();
>  #endif
>  
> @@ -505,7 +505,7 @@ test8 (unsigned cond)
>if (__builtin_object_size (p, 3) != sizeof (y.c) - 8 + cond)
>  FAIL ();
>  #else
> -  if (__builtin_object_size (p, 3) != 0)
> +  if (__builtin_object_size (p, 3) != sizeof (y.c) - 8)
>  FAIL ();
>  #endif
>  }

The testcase changes look reasonable to me.

> diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
> index 6544730e153..f8fae0cbc82 100644
> --- a/gcc/tree-object-size.cc
> +++ b/gcc/tree-object-size.cc
> @@ -1527,7 +1527,7 @@ plus_stmt_object_size (struct object_size_info *osi, 
> tree var, gimple *stmt)
>if (size_unknown_p (bytes, 0))
>   ;
>else if ((object_size_type & OST_DYNAMIC)
> -|| compare_tree_int (op1, offset_limit) <= 0)
> +|| bytes != wholesize || compare_tree_int (op1, offset_limit) <= 
> 0)
>   bytes = size_for_offset (bytes, op1, wholesize);
>/* In the static case, with a negative offset, the best estimate for
>minimum size is size_unknown but for maximum size, the wholesize is a

The coding conventions say that in cases like this where the whole condition
doesn't fit on a single line, each ||/&& operand should be on a separate
line.
So, the patch should be adding || bytes != wholesize on a separate line.
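
The layout being asked for can be shown on a stand-alone condition. The function and parameter names below are hypothetical stand-ins for the three sub-expressions in the quoted hunk, not GCC code.

```cpp
// Hypothetical stand-ins for the three sub-conditions in the quoted hunk.
inline bool use_size_for_offset(bool dynamic, bool size_differs,
                                bool offset_in_range)
{
  // The condition does not fit on one line, so each || operand goes on
  // its own line, with the operator leading the continuation lines.
  return (dynamic
          || size_differs
          || offset_in_range);
}
```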

That said, there is a pre-existing problem: the direct tree comparisons
(bytes != wholesize here, && wholesize != sz in size_for_offset (note,
again, it should be on a separate line), maybe others).

We do INTEGER_CST caching, either using small array for small values or
hash table for larger ones, so INTEGER_CSTs with the same value of the
same type should be pointer equal unless they are TREE_OVERFLOW or similar,
but for anything else, unless you guarantee that in the "same" case
you assign the same tree to size/wholesize rather than say
perform size_binop twice, I'd expect instead comparisons with
operand_equal_p or something similar.

Though, because this patch is solely for the __builtin_object_size case
and the sizes in that case should be solely INTEGER_CSTs, I guess this patch
is ok with the formatting nit fix (and ideally the one in size_for_offset
too).

Jakub



Re: [PATCH v2] libgcc, libstdc++: Make TU-local declarations in headers external linkage [PR115126]

2024-09-27 Thread Nathaniel Shead
On Fri, Sep 27, 2024 at 03:55:14PM -0400, Jason Merrill wrote:
> On 9/27/24 3:38 PM, Jonathan Wakely wrote:
> > On Fri, 27 Sept 2024 at 19:46, Jason Merrill  wrote:
> > > 
> > > On 9/26/24 6:34 AM, Nathaniel Shead wrote:
> > > > On Thu, Sep 26, 2024 at 01:46:27PM +1000, Nathaniel Shead wrote:
> > > > > On Wed, Sep 25, 2024 at 01:30:55PM +0200, Jakub Jelinek wrote:
> > > > > > On Wed, Sep 25, 2024 at 12:18:07PM +0100, Jonathan Wakely wrote:
> > > > > > > > >And whether similarly we couldn't use
> > > > > > > > > __attribute__((__visibility__ ("hidden"))) on the static 
> > > > > > > > > block scope
> > > > > > > > > vars for C++ (again, if compiler supports that), so that the 
> > > > > > > > > changes
> > > > > > > > > don't affect ABI of C++ libraries.
> > > > > > > > 
> > > > > > > > That sounds good too.
> > > > > > > 
> > > > > > > Can you use visibility attributes on a local static? I get a 
> > > > > > > warning
> > > > > > > that it's ignored.
> > > > > > 
> > > > > > Indeed :(
> > > > > > 
> > > > > > And #pragma GCC visibility push(hidden)/#pragma GCC visibility pop 
> > > > > > around
> > > > > > just the static block scope var definition does nothing.
> > > > > > If it is around the whole inline function though, then it seems to 
> > > > > > work.
> > > > > > Though, unsure if we want that around the whole header; wonder what 
> > > > > > it would
> > > > > > do with the weakrefs.
> > > > > > 
> > > > > >  Jakub
> > > > > > 
> > > > > 
> > > > > Thanks for the thoughts.  WRT visibility, it looks like the main 
> > > > > gthr.h
> > > > > surrounds the whole function in a
> > > > > 
> > > > > #ifndef HIDE_EXPORTS
> > > > > #pragma GCC visibility push(default)
> > > > > #endif
> > > > > 
> > > > > block, though I can't quite work out what the purpose of that is here
> > > > > (since everything is currently internal linkage to start with).
> > > > > 
> > > > > But it sounds like doing something like
> > > > > 
> > > > > #ifdef __has_attribute
> > > > > # if __has_attribute(__always_inline__)
> > > > > #  define __GTHREAD_ALWAYS_INLINE 
> > > > > __attribute__((__always_inline__))
> > > > > # endif
> > > > > #endif
> > > > > #ifndef __GTHREAD_ALWAYS_INLINE
> > > > > # define __GTHREAD_ALWAYS_INLINE
> > > > > #endif
> > > > > 
> > > > > #ifdef __cplusplus
> > > > > # define __GTHREAD_INLINE inline __GTHREAD_ALWAYS_INLINE
> > > > > #else
> > > > > # define __GTHREAD_INLINE static inline
> > > > > #endif
> > > > > 
> > > > > and then marking maybe even just the new inline functions with
> > > > > visibility hidden should be OK?
> > > > > 
> > > > > Nathaniel
> > > > 
> > > > Here's a new patch that does this.  Also since v1 it adds another two
> > > > internal linkage declarations I'd missed earlier from libstdc++, in
> > > > pstl; it turns out that  doesn't include .
> > > > 
> > > > Bootstrapped and regtested on x86_64-pc-linux-gnu and
> > > > aarch64-unknown-linux-gnu, OK for trunk?
> > > > 
> > > > -- >8 --
> > > > 
> > > > In C++20, modules streaming checks for exposures of TU-local entities.
> > > > In general exposing internal linkage functions in a header is liable to
> > > > cause ODR violations in C++, and this is now detected in a module
> > > > context.
> > > > 
> > > > This patch goes through and removes 'static' from many declarations
> > > > exposed through libstdc++ to prevent code like the following from
> > > > failing:
> > > > 
> > > > export module M;
> > > > extern "C++" {
> > > >   #include 
> > > > }
> > > > 
> > > > Since gthreads is used from C as well, we need to choose whether to use
> > > > 'inline' or 'static inline' depending on whether we're compiling for C
> > > > or C++ (since the semantics of 'inline' are different between the
> > > > languages).  Additionally we need to remove static global variables, so
> > > > we migrate these to function-local statics to avoid the ODR issues.
> > > 
> > > Why function-local static rather than inline variable?
> > 
> > We can make that conditional on __cplusplus but can we do that for
> > C++98? With Clang too?
> 
> Yes for both compilers, disabling -Wc++17-extensions.
> 

Ah, I didn't realise that was possible.  I've already merged the above
patch but happy to test another one that changes this if that's
preferred.
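
For comparison, here is a minimal sketch of the two alternatives being
discussed.  The names cut_off_local and cut_off_inline are made up for
illustration and do not appear in the patch; the second form needs C++17
unless the compiler accepts inline variables as an extension.

```cpp
#include <cassert>
#include <cstddef>

// Alternative 1: a function-local static inside an inline function.
// This works in every C++ standard, including C++98.
inline const std::size_t& cut_off_local() {
    static const std::size_t value = 1000;
    return value;
}

// Alternative 2: a C++17 inline variable.  Without 'inline', a
// namespace-scope constexpr variable would have internal linkage.
inline constexpr std::size_t cut_off_inline = 1000;
```

Either form gives all translation units one shared entity with external
linkage, which is the property needed to avoid exposing a TU-local
entity from a module.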

> > > > +++ b/libstdc++-v3/include/pstl/algorithm_impl.h
> > > > @@ -2890,7 +2890,7 @@ __pattern_includes(__parallel_tag<_IsVector> __tag, _ExecutionPolicy&& __exec, _
> > > >});
> > > >}
> > > > 
> > > > -constexpr auto __set_algo_cut_off = 1000;
> > > > +inline constexpr auto __set_algo_cut_off = 1000;
> > > > 
> > > > +++ b/libstdc++-v3/include/pstl/unseq_backend_simd.h
> > > > @@ -22,7 +22,7 @@ namespace __unseq_backend
> > > >{
> > > > 
> > > >// Expect vector width up to 64 (or 512 bit)
> > > > -const std::size_t __lane_size = 64;
> > > > +inline const std::size_t __lane_size = 64;
> > > 
> > > Th

[PATCH] expr: Don't clear whole unions [PR116416]

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 27, 2024 at 04:01:33PM +0200, Jakub Jelinek wrote:
> So, I think we should go with (but so far completely untested except
> for pr78687.C which is optimized with Marek's patch and the above testcase
> which doesn't have the clearing anymore) the following patch.

That patch had a bug in type_has_padding_at_level_p and so it didn't
bootstrap.

Here is a full patch which does.  It regressed the infoleak-1.c test
which I've adjusted, but I think the test had undefined behavior.
In particular the question is whether
  union un_b { unsigned char j; unsigned int i; } u = {0};
leaves (or can leave) some bits uninitialized or not.

I believe it can, it is an explicit initialization of the j member
which is just 8-bit (but see my upcoming mail on padding bits in C23/C++)
and nothing in the C standard from what I can see seems to imply the padding
bits in the union beyond the actually initialized field in this case would
be initialized.
Though, looking at godbolt, clang and icc 19 and older gcc all do zero
initialize the whole union before storing the single member in there (if
non-zero, otherwise just clear).

So whether we want to do this or do it by default is another question.
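
As an illustration of the question, the following sketch uses the same
union and shows a way to get all bytes cleared without relying on what
"= {0}" does to the bytes beyond j.  The helper names clear_all_bytes and
all_bytes_zero are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstring>

union un_b { unsigned char j; unsigned int i; };

// Hypothetical helper: zero the whole object representation, including
// the bytes that lie beyond the 8-bit member j.
inline void clear_all_bytes(un_b& u) {
    std::memset(&u, 0, sizeof u);
}

// Hypothetical helper: inspect the object representation byte by byte.
inline bool all_bytes_zero(const un_b& u) {
    unsigned char bytes[sizeof(un_b)];
    std::memcpy(bytes, &u, sizeof u);
    for (unsigned char b : bytes)
        if (b != 0)
            return false;
    return true;
}
```

Code that copies such a union to an untrusted destination can call
clear_all_bytes first, so the outcome does not depend on how the
compiler treats the initializer.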

Anyway, bootstrapped/regtested on x86_64-linux and i686-linux successfully.

2024-09-28  Jakub Jelinek  

PR c++/116416
* expr.cc (categorize_ctor_elements_1): Fix up union handling of
*p_complete.  Clear it only if num_fields is 0 and the union has
at least one FIELD_DECL, set to -1 if either union has no fields
and non-zero size, or num_fields is 1 and complete_ctor_at_level_p
returned false.
* gimple-fold.cc (type_has_padding_at_level_p): Fix up UNION_TYPE
handling, return also true for UNION_TYPE with no FIELD_DECLs
and non-zero size, handle QUAL_UNION_TYPE like UNION_TYPE.

* gcc.dg/plugin/infoleak-1.c (test_union_2b, test_union_4b): Expect
diagnostics.
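
To make the UNION_TYPE rule from the gimple-fold.cc entry concrete, here
is a standalone model of it.  The function union_has_padding and its
parameters are hypothetical; the real type_has_padding_at_level_p walks
TYPE_FIELDS and compares TYPE_SIZE trees rather than plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: field_sizes holds the size of each FIELD_DECL,
// union_size the size of the union itself (same unit for both).
inline bool union_has_padding(const std::vector<std::size_t>& field_sizes,
                              std::size_t union_size) {
    bool any_fields = false;
    for (std::size_t size : field_sizes) {
        // Any field smaller than the whole union leaves padding.
        if (size != union_size)
            return true;
        any_fields = true;
    }
    // A union with no fields but non-zero size is all padding.
    return !any_fields && union_size != 0;
}
```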

--- gcc/expr.cc.jj  2024-09-04 12:09:22.598808244 +0200
+++ gcc/expr.cc 2024-09-27 15:34:20.929282525 +0200
@@ -7218,7 +7218,36 @@ categorize_ctor_elements_1 (const_tree c
 
   if (*p_complete && !complete_ctor_at_level_p (TREE_TYPE (ctor),
num_fields, elt_type))
-*p_complete = 0;
+{
+  if (TREE_CODE (TREE_TYPE (ctor)) == UNION_TYPE
+ || TREE_CODE (TREE_TYPE (ctor)) == QUAL_UNION_TYPE)
+   {
+ /* The union case is more complicated.  */
+ if (num_fields == 0)
+   {
+ /* If the CONSTRUCTOR doesn't have any elts, it is
+incomplete if the union has at least one field.  */
+ for (tree f = TYPE_FIELDS (TREE_TYPE (ctor));
+  f; f = DECL_CHAIN (f))
+   if (TREE_CODE (f) == FIELD_DECL)
+ {
+   *p_complete = 0;
+   break;
+ }
+ /* Otherwise it has padding if the union has non-zero size.  */
+ if (*p_complete > 0
+ && !integer_zerop (TYPE_SIZE (TREE_TYPE (ctor))))
+   *p_complete = -1;
+   }
+ /* Otherwise the CONSTRUCTOR should have exactly one element.
+It is then never incomplete, but if complete_ctor_at_level_p
+returned false, it has padding.  */
+ else if (*p_complete > 0)
+   *p_complete = -1;
+   }
+  else
+   *p_complete = 0;
+}
   else if (*p_complete > 0
   && type_has_padding_at_level_p (TREE_TYPE (ctor)))
 *p_complete = -1;
--- gcc/gimple-fold.cc.jj   2024-09-09 11:25:43.197048840 +0200
+++ gcc/gimple-fold.cc  2024-09-27 18:51:30.036002116 +0200
@@ -4814,12 +4814,22 @@ type_has_padding_at_level_p (tree type)
return false;
   }
 case UNION_TYPE:
+case QUAL_UNION_TYPE:
+  bool any_fields;
+  any_fields = false;
   /* If any of the fields is smaller than the whole, there is padding.  */
   for (tree f = TYPE_FIELDS (type); f; f = DECL_CHAIN (f))
-   if (TREE_CODE (f) == FIELD_DECL)
- if (simple_cst_equal (TYPE_SIZE (TREE_TYPE (f)),
-   TREE_TYPE (type)) != 1)
-   return true;
+   if (TREE_CODE (f) != FIELD_DECL)
+ continue;
+   else if (simple_cst_equal (TYPE_SIZE (TREE_TYPE (f)),
+  TYPE_SIZE (type)) != 1)
+ return true;
+   else
+ any_fields = true;
+  /* If the union doesn't have any fields and still has non-zero size,
+all of it is padding.  */
+  if (!any_fields && !integer_zerop (TYPE_SIZE (type)))
+   return true;
   return false;
 case ARRAY_TYPE:
 case COMPLEX_TYPE:
--- gcc/testsuite/gcc.dg/plugin/infoleak-1.c.jj 2022-09-11 22:28:56.356164436 +0200
+++ gcc/testsuite/gcc.dg/plugin/infoleak-1.c 2024-09-28 08:20:25.806973480 +0200
@@ -123,9 +123,12 @@ void test_union_2a (void __user *dst, u8
 
 void test_union_2b (void __user *dst

Re: Performance improvement for std::to_chars(char* first, char* last, /* integer-type */ value, int base = 10 );

2024-09-27 Thread Markus Ehrnsperger

Any update?


Thanks and Regards, Markus



On 30.07.24 at 09:56, Jonathan Wakely wrote:



On Tue, 30 Jul 2024, 06:21 Ehrnsperger, Markus, 
 wrote:


On 2024-07-29 12:16, Jonathan Wakely wrote:

> On Mon, 29 Jul 2024 at 10:45, Jonathan Wakely
 wrote:
>> On Mon, 29 Jul 2024 at 09:42, Ehrnsperger, Markus
>>  wrote:
>>> Hi,
>>>
>>>
>>> I'm attaching two files:
>>>
>>> 1.:   to_chars10.h:
>>>
>>> This is intended to be included in libstdc++ / gcc to achieve
>>> performance improvements. It is an implementation of
>>>
>>> to_chars10(char* first, char* last, /* integer-type */ value);
>>>
>>> Parameters are identical to std::to_chars(char* first, char* last,
>>> /* integer-type */ value, int base = 10); it only works for base == 10.
>>>
>>> If it is included in libstdc++, to_chars10(...) could be renamed to
>>> std::to_chars(char* first, char* last, /* integer-type */ value)
>>> to provide an overload for the default base = 10.
>> Thanks for the email. This isn't in the form of a patch that we can
>> accept as-is, although I see that the license is compatible with
>> libstdc++, so if you are looking to contribute it then that could be
>> done either by assigning copyright to the FSF or under the DCO terms.
>> See https://gcc.gnu.org/contribute.html#legal for more details.
>>
>> I haven't looked at the code in detail, but is it a similar approach
>> to https://jk-jeon.github.io/posts/2022/02/jeaiii-algorithm/ ?
>> How does it compare to the performance of that algorithm?
>>
>> I have an incomplete implementation of that algorithm for libstdc++
>> somewhere, but I haven't looked at it for a while.
> I took a closer look and the reinterpret_casts worried me, so I tried
> your test code with UBsan. There are a number of errors that would
> need to be fixed before we would consider using this code.

Attached are new versions of to_chars10.cpp, to_chars10.h and the new
file itoa_better_y.h


Thanks! I'll take another look.



Changes:

- I removed all reinterpret_casts, and tested with
-fsanitize=undefined

- I added itoa_better_y.h from
https://jk-jeon.github.io/posts/2022/02/jeaiii-algorithm/ to the
performance test.

Note: There is only one line in the benchmark test for itoa_better_y
due to limited features of itoa_better_y:

Benchmarking random unsigned 32 bit  itoa_better_y   ...


to_chars10.h: Signed-off-by: Markus Ehrnsperger


The other files are only for performance tests.
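
As a reminder of the standard interface that to_chars10 mirrors, here is
a small sketch using std::to_chars with its default base of 10.  The
wrapper to_decimal is a hypothetical helper written for this example and
is not part of to_chars10.h.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical wrapper: format a 64-bit signed integer in base 10.
inline std::string to_decimal(long long value) {
    char buf[32];  // room for a sign and all digits of a 64-bit value
    std::to_chars_result r = std::to_chars(buf, buf + sizeof buf, value);
    if (r.ec != std::errc())
        return std::string();  // buffer too small; cannot happen here
    return std::string(buf, r.ptr);  // r.ptr is one past the last digit
}
```

A drop-in replacement would have to keep the same contract: write into
[first, last), return the end pointer, and report overflow through ec.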

>
>
>>
>>> 2.:  to_chars10.cpp:
>>>
>>> This is a test program for to_chars10 verifying the
>>> correctness of the results, and measuring the performance. The
>>> actual performance improvement is system dependent, so please test
>>> on your own system.
>>>
>>> On my system the performance improvement is about a factor of two;
>>> my results are:
>>>
>>>
>>> Test   int8_t verifying to_chars10 = std::to_chars ... OK
>>> Test  uint8_t verifying to_chars10 = std::to_chars ... OK
>>> Test  int16_t verifying to_chars10 = std::to_chars ... OK
>>> Test uint16_t verifying to_chars10 = std::to_chars ... OK
>>> Test  int32_t verifying to_chars10 = std::to_chars ... OK
>>> Test uint32_t verifying to_chars10 = std::to_chars ... OK
>>> Test  int64_t verifying to_chars10 = std::to_chars ... OK
>>> Test uint64_t verifying to_chars10 = std::to_chars ... OK
>>>
>>> Benchmarking test case               tested method  ...  time (lower is better)
>>> Benchmarking random unsigned 64 bit to_chars10     ...  0.00957
>>> Benchmarking random unsigned 64 bit std::to_chars  ...  0.01854
>>> Benchmarking random   signed 64 bit to_chars10     ...  0.01018
>>> Benchmarking random   signed 64 bit std::to_chars  ...  0.02297
>>> Benchmarking random unsigned 32 bit to_chars10     ...  0.00620
>>> Benchmarking random unsigned 32 bit std::to_chars  ...  0.01275
>>> Benchmarking random   signed 32 bit to_chars10     ...  0.00783
>>> Benchmarking random   signed 32 bit std::to_chars  ...  0.01606
>>> Benchmarking random unsigned 16 bit to_chars10     ...  0.00536
>>> Benchmarking random unsigned 16 bit std::to_chars  ...  0.00871
>>> Benchmarking random   signed 16 bit to_chars10     ...  0.00664
>>> Benchmarking random   signed 16 bit std::to_chars  ...  0.01154
>>> Benchmarking random unsigned 08 bit to_chars10     ...  0.00393
>>> Benchmarking random unsigned 08 bit std::to_chars  ...  0.00626
>>> Benchmarking random   signed 08 bit to_chars10     ...  0.00465
>>> Benchmarking random   signed 08 bit std::to_chars  ...  0.01089
>>>
>>>
>>> Thanks, Markus
>>>
>>>
>>>


Re: [PATCH 07/10] c++/modules: Implement ignored TU-local exposures

2024-09-27 Thread Nathaniel Shead
On Fri, Sep 27, 2024 at 11:56:27AM -0400, Jason Merrill wrote:
> On 9/23/24 7:46 PM, Nathaniel Shead wrote:
> > Currently I just stream DECL_NAME in TU_LOCAL_ENTITYs for use in 
> > diagnostics,
> > but this feels perhaps insufficient.  Are there any better approaches here?
> > Otherwise I don't think it matters too much, as which entity it is will also
> > be hopefully clear from the 'declared here' notes.
> > 
> > I've put the new warning in Wextra, but maybe it would be better to just
> > leave it out of any of the normal warning groups since there's currently
> > no good way to work around the warnings it produces?
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > 
> > -- >8 --
> > 
> > [basic.link] p14 lists a number of circumstances where a declaration
> > naming a TU-local entity is not an exposure, notably the bodies of
> > non-inline templates and friend declarations in classes.  This patch
> > ensures that these references do not error when exporting the module.
> > 
> > We do need to still error on instantiation from a different module,
> > however, in case this refers to a TU-local entity.  As such this patch
> > adds a new tree TU_LOCAL_ENTITY which is used purely as a placeholder to
> > poison any attempted template instantiations that refer to it.
> > 
> > This is also streamed for friend decls so that merging (based on the
> > index of an entity into the friend decl list) doesn't break and to
> > prevent complicating the logic; I imagine this shouldn't ever come up
> > though.
> > 
> > We also add a new warning, '-Wignored-exposures', to handle the case
> > where someone accidentally refers to a TU-local value from within a
> > non-inline function template.  This will compile without errors as-is,
> > but any attempt to instantiate the decl will fail; this warning can be
> > used to ensure that this doesn't happen.  Unfortunately the warning has
> > quite some false positives; for instance, a user could deliberately only
> > call explicit instantiations of the decl, or use 'if constexpr' to avoid
> > instantiating the TU-local entity from other TUs, neither of which are
> > currently detected.
> 
> I disagree with the term "ignored exposure", in the warning name and in the
> rest of the patch; these references are not exposures.  It's the naming of a
> TU-local entity that is ignored in basic.link/14.
> 
> I like the warning, just would change the name. "-Wtemplate-names-tu-local"?
> "-Wtu-local-in-template"?
> 

I wasn't too happy with "ignored exposures" either, I much prefer these
names.  I'll use these in the next version of this patch.

> I'm not too concerned about false positives, as long as it can be
> effectively suppressed with #pragma GCC diagnostic ignored.  If you only use
> explicit instantiations you don't need to have the template body in the
> interface anyway.
> 

The main case I'm thinking of that would be annoying would be

  static void tu_local_fn() { /* ... */ }

  export template 
  void foo() {
if constexpr (std::is_same_v)
  tu_local_fn();
// ...
  }
  template foo();

But this should be able to be silenced with #pragma so I suppose that's OK
for Wextra (and is probably fairly unusual code to begin with).  And
even more so if the following is fixed...
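
A compilable variant of the pattern above, as a sketch only: the
template parameter T, the comparison against int and the return values
are assumptions (the snippet as archived does not show them), and the
module 'export' is dropped so that it builds as an ordinary translation
unit.

```cpp
#include <cassert>
#include <type_traits>

// Internal linkage, so this would be a TU-local entity in a module.
static int tu_local_fn() { return 42; }

template <typename T>
int foo() {
    // The discarded branch is not instantiated, so only foo<int>
    // ever refers to tu_local_fn.
    if constexpr (std::is_same_v<T, int>)
        return tu_local_fn();
    else
        return 0;
}

// Explicit instantiation in the same TU that defines tu_local_fn.
template int foo<int>();
```

Other instantiations never touch the TU-local function, which is why a
warning on the template body alone would be a false positive here.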

> > + /* Ideally we would only warn in cases where there are no explicit
> > +instantiations of the template, but we don't currently track this
> > +in an easy-to-find way.  */
> 
> You can't walk DECL_TEMPLATE_INSTANTIATIONS checking
> DECL_EXPLICIT_INSTANTIATION?
> 

I tried, but it looks like DECL_TEMPLATE_INSTANTIATIONS doesn't include
explicit instantiations for function template definitions once we get
to module processing: it's only recorded for namespace-scope primary
function templates without definitions.

I think I would be able to update 'register_specialization' to more
eagerly include instantiations for every function template in this list,
but I wasn't sure if we wanted to do that just for this if it's not
already necessary, since that would potentially grow it quite a lot.

It strikes me as I write this though that the fact we currently
correctly mark GMF explicit instantiations as reachable is not because
looking through DECL_TEMPLATE_INSTANTIATIONS works, but because of the
issue we've discussed earlier where GMF instantiations are deferred
until the end of the TU and then get marked as purview... so I suppose
we do need to fix this anyway in anticipation of solving that issue.

See e.g.

  module;
  template  void foo();
  template void foo();
  export module M;

with -fdump-lang-module-graph which shows that we emit the instantiation
despite never referencing it from purview.

> What happens with a non-templated, non-inline function body that names a
> TU-local entity?  I don't see any specific handling or testcase.  Should we
> just omit its definition, like you do in the previous patch for TU-local
> variable initializers?
> 

Non-templat

Re: [PATCH] diagnostic: Save/restore diagnostic context history and push/pop state for PCH [PR116847]

2024-09-27 Thread David Malcolm
On Fri, 2024-09-27 at 09:54 -0400, Lewis Hyatt wrote:
> On Fri, Sep 27, 2024 at 9:41 AM David Malcolm 
> wrote:
> > 
> > On Thu, 2024-09-26 at 23:28 +0200, Jakub Jelinek wrote:
> > > Hi!
> > > 
> > > The following patch on top of the just posted cleanup patch
> > > saves/restores the m_classification_history and m_push_list
> > > vectors for PCH.  Without that as the testcase shows during
> > > parsing
> > > of the templates we don't report ignored diagnostics, but after
> > > loading
> > > PCH header when instantiating those templates those warnings can
> > > be
> > > emitted.  This doesn't show up on x86_64-linux build because
> > > configure
> > > injects there -fcf-protection -mshstk flags during library build
> > > (and
> > > so
> > > also during PCH header creation), but make check doesn't use
> > > those
> > > flags
> > > and so the PCH header is ignored.
> > > 
> > > Bootstrapped on i686-linux so far, bootstrap/regtest on x86_64-
> > > linux
> > > and
> > > i686-linux still pending, ok for trunk if it passes it?
> > 
> > Thanks, yes please
> > 
> > Dave
> > 
> 
> A couple comments that may be helpful...
> 
> -This is also PR 64117
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64117)
> 
> -I submitted a patch last year for that but did not get any response
> (
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635648.html).
> I guess I never pinged it because I am still trying to ping two other
> ones :). 

Gahhh, I'm sorry about this.

What are the other two patches?

> My patch did not switch to vec so it was not as nice as this
> one. I wonder though, if some of the testcases I added could be
> incorporated? In particular the testcase from my patch
> pragma-diagnostic-3.C I believe will still be failing after this one.
> There is an issue with C++ because it processes the pragmas twice,
> once in early mode and once in normal mode, that makes it do the
> wrong
> thing for this case:
> 
> t.h:
> 
>  #pragma GCC diagnostic push
>  #pragma GCC diagnostic ignored...
>  //no pop at end of the file
> 
> t.c
> 
>  #include "t.h"
>  #pragma GCC diagnostic pop
>  //expect to be at the initial state here, but won't be if t.h is a
> PCH
> 
> In my patch I had separated the PCH restore from a more general
> "state
> restore" logic so that the C++ frontend can restore the state after
> the first pass through the data.

It sounds like the ideal here would be to incorporate the test cases
from Lewis's patch into Jakub's, if the latter can be tweaked to fix
pragma-diagnostic-3.C.
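
For readers less familiar with the pragmas involved, this sketch shows
the push/ignored/pop pairing inside a single file.  The choice of
-Wunused-variable and the function names are assumptions made for the
example; the failing case described above differs in that the push lives
in a precompiled header and the pop in the file that includes it.

```cpp
#include <cassert>

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-variable"
inline int with_warning_suppressed() {
    int unused = 0;  // would be reported if -Wunused-variable were active
    return 1;
}
#pragma GCC diagnostic pop

// After the pop, the diagnostic state is whatever it was before the push.
inline int after_pop() { return 2; }
```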

Dave



[pushed] doc: Remove i?86-*-linux* installation note from 2003

2024-09-27 Thread Gerald Pfeifer


gcc:
PR target/69374
* doc/install.texi (Specific) : Remove note
from 2003.
---
 gcc/doc/install.texi | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 08de972c8ec..517d1cbb2fb 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -4274,9 +4274,6 @@ libstdc++-v3 documentation.
 @end html
 @anchor{ix86-x-linux}
 @heading i?86-*-linux*
-As of GCC 3.3, binutils 2.13.1 or later is required for this platform.
-See @uref{https://gcc.gnu.org/PR10877,,bug 10877} for more information.
-
 If you receive Signal 11 errors when building on GNU/Linux, then it is
 possible you have a hardware problem.  Further information on this can be
 found on @uref{https://www.bitwizard.nl/sig11/,,www.bitwizard.nl}.
-- 
2.46.0

