[PATCH] reassoc: Do not sort likely vectorizable ops by rank.

2024-10-16 Thread Robin Dapp
Hi,

this is probably more of an RFC than a patch, as I'm not sure whether
reassoc is the right place to fix it.  On top of that, the heuristic might
be a bit ad hoc.  Maybe we can also work around it in the vectorizer?

The following function is vectorized in a very inefficient way because we
construct vectors from scalar loads.

uint64_t
foo (uint8_t *pix, int i_stride)
{
  uint32_t sum = 0, sqr = 0;
  int x, y;
  for (y = 0; y < 16; y++)
    {
      for (x = 0; x < 16; x++)
        sum += pix[x];
      pix += i_stride;
    }
  return sum;
}

The reason for this is that reassoc reorders the last three operands of
the summation sequence by rank, introducing a temporary in the process
that breaks the "homogeneity" (sum_n = sum_{n - 1} + pix[n]) of the sum.
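
To make the two shapes concrete, here is a rough C-level sketch for four
elements.  It is only an illustration, not the GIMPLE that reassoc emits;
the function names are made up and sum_in stands in for the value coming
from the sum PHI.

```c
#include <stdint.h>

/* Homogeneous chain: every statement has the shape sum = sum + pix[n].  */
static uint32_t
sum_chain (const uint8_t *pix, uint32_t sum_in)
{
  uint32_t sum = sum_in;
  sum = sum + pix[0];
  sum = sum + pix[1];
  sum = sum + pix[2];
  sum = sum + pix[3];
  return sum;
}

/* Reordered form: two loads are added first through a temporary, so the
   first statement no longer has the shape sum = sum + pix[n].  */
static uint32_t
sum_with_temporary (const uint8_t *pix, uint32_t sum_in)
{
  uint32_t tmp = (uint32_t) pix[0] + pix[1];
  uint32_t sum = tmp + sum_in;
  sum = sum + pix[2];
  sum = sum + pix[3];
  return sum;
}
```

Both functions return the same value; only the statement pattern differs,
and that pattern is what matters for recognizing the sequence.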

This patch adds a function likely_vectorizable_p that checks if an
operand vector contains only operands of the same rank except the last
one.  In that case the sequence is likely vectorizable and the easier
vectorization will outweigh any CSE opportunity we can expose by
swapping operands.
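
As a toy model of the rank part of that check (and only that part), the
idea can be sketched over a plain array of ranks.  The function name and
the array representation are hypothetical; the function in the patch works
on the operand vector and additionally inspects the PHI.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true when there are at least three operands and all of them
   except the last one share the same rank.  The rank of the last operand
   is not examined in this sketch.  */
static bool
all_but_last_same_rank (const unsigned *ranks, size_t n)
{
  if (n < 3)
    return false;

  for (size_t i = 1; i + 1 < n; i++)
    if (ranks[i] != ranks[0])
      return false;

  return true;
}
```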

Bootstrapped and regtested on x86, regtested on rv64gcv.

Regards
 Robin

gcc/ChangeLog:

* tree-ssa-reassoc.cc (likely_vectorizable_p): New function.
(reassociate_bb): Use new function.
(dump_ops_vector): Change prototype.
(debug_ops_vector): Change prototype.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/reassoc-52.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-52.c |  27 ++
 gcc/tree-ssa-reassoc.cc| 102 -
 2 files changed, 124 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-52.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-52.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-52.c
new file mode 100644
index 000..b117b7519bb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-52.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -fdump-tree-reassoc-details" } */
+
+typedef unsigned char uint8_t;
+typedef unsigned int uint32_t;
+typedef unsigned long uint64_t;
+
+uint64_t
+foo (uint8_t *pix, int i_stride)
+{
+  uint32_t sum = 0, sqr = 0;
+  int x, y;
+  for (y = 0; y < 16; y++)
+{
+  for (x = 0; x < 16; x++)
+   sum += pix[x];
+  pix += i_stride;
+}
+  return sum;
+}
+
+/* Ensure that we only add to sum variables and don't create a temporary that
+   does something else.  In doing so we enable a much more efficient
+   vectorization scheme.  */
+
+/* { dg-final { scan-tree-dump-times "\\s+sum_\\d+\\s+=\\s+sum_\\d+\\s+\\\+" 16 "reassoc1" } } */
+/* { dg-final { scan-tree-dump-times "\\s+sum_\\d+\\s=.*\\\+\\ssum_\\d+;" 1 "reassoc1" } } */
diff --git a/gcc/tree-ssa-reassoc.cc b/gcc/tree-ssa-reassoc.cc
index 556ecdebe2d..13f8a1070b6 100644
--- a/gcc/tree-ssa-reassoc.cc
+++ b/gcc/tree-ssa-reassoc.cc
@@ -6992,6 +6992,96 @@ rank_ops_for_fma (vec *ops)
 }
   return mult_num;
 }
+
+/* Check if the operand vector contains operands that all have the same rank
+   apart from the last one.  The last one must be a PHI node which links back
+   to the first operand.
+   This can indicate an easily vectorizable sequence in which case we don't
+   want to reorder the first three elements for rank reasons.
+
+   Consider the following example:
+
+for (int y = 0; y < 16; y++)
+  {
+   for (int x = 0; x < 16; x++)
+ {
+   sum += pix[x];
+ }
+ ...
+  }
+
+# sum_201 = PHI 
+# y_149 = PHI 
+
+_33 = *pix_214;
+_34 = (unsigned intD.4) _33;
+sum_35 = sum_201 + _34;
+
+_46 = MEM[(uint8_tD.2849 *)pix_214 + 1B];
+_47 = (unsigned intD.4) _46;
+sum_48 = sum_35 + _47;
+_49 = (intD.1) _46;
+
+All loads of pix are of the same rank, just the sum PHI has a different
+one in {..., _46, _33, sum_201}.
+swap_ops_for_binary_stmt will move sum_201 before _46 and _33 so that
+the first operation _33 OP _46 is exposed as an optimization opportunity.
+In doing so it will trip up the vectorizer which now needs to work much
+harder because it can't recognize the whole sequence as doing the same
+operation.  */
+
+static bool
+likely_vectorizable_p (vec *ops)
+{
+  operand_entry *oe;
+  unsigned int i;
+  unsigned int rank = 0;
+
+  if (ops->length () < 3)
+return false;
+
+  /* Check if the last operand is a PHI.  */
+  operand_entry *oelast = (*ops)[ops->length () - 1];
+
+  if (TREE_CODE (oelast->op) != SSA_NAME)
+return false;
+
+  gimple *glast = SSA_NAME_DEF_STMT (oelast->op);
+
+  if (gimple_code (glast) != GIMPLE_PHI
+  || gimple_phi_num_args (glast) != 2)
+return false;
+
+  /* If so, check if its first argument is a gimple assign.  */
+  tree phi_arg = gimple_phi_arg_def (glast, 0);
+
+  if (TREE_CODE (phi_arg) != SSA_NAME)
+return false;
+
+  gimple *garg_def = SSA_NAME_DEF_STMT (phi_arg);
+  if (!is_gimple_assign (garg_def))
+return false;
+
+  /* It must link back to the beginning

[PATCH] Ternary operator formatting fixes

2024-10-16 Thread Jakub Jelinek
Hi!

While working on PR117028 C2Y changes, I've noticed weird ternary
operator formatting (operand1 ? operand2: operand3).
The usual formatting is operand1 ? operand2 : operand3
where we have around 18000+ cases of that (counting only what fits
on one line) and
indent -nbad -bap -nbc -bbo -bl -bli2 -bls -ncdb -nce -cp1 -cs -di2 -ndj \
   -nfc1 -nfca -hnl -i2 -ip5 -lp -pcs -psl -nsc -nsob
documented in
https://www.gnu.org/prep/standards/html_node/Formatting.html#Formatting
does the same.
Some code was even trying to save space as much as possible and used
operand1?operand2:operand3 or
operand1 ? operand2:operand3

Today I've grepped for such cases (the grep was '?.*[^ ]:' and I had to
skim through various false positives with that where the : matched e.g.
stuff inside of strings, or *.md pattern macros or :: scope) and the
following patch is a fix for what I found.
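
For illustration, the grep pattern above can be approximated by a small C
helper.  This is a sketch with a made-up function name, not a tool used
for the patch; like the grep, it also reports false positives such as ::
scope operators.

```c
#include <stdbool.h>
#include <string.h>

/* Mimic grep '?.*[^ ]:': after the first '?', look for a ':' that is
   directly preceded by a non-space character.  At least one character
   must sit between the '?' and the ':'.  */
static bool
has_tight_colon (const char *line)
{
  const char *q = strchr (line, '?');
  if (q == NULL)
    return false;

  for (const char *p = q + 1; p[0] != '\0' && p[1] != '\0'; p++)
    if (p[0] != ' ' && p[1] == ':')
      return true;

  return false;
}
```

With this, "a ? b: c" and "a?b:c" are flagged, while the usual
"a ? b : c" is not.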

Built on x86_64-linux, ok for trunk?

2024-10-16  Jakub Jelinek  

gcc/
* attribs.cc (lookup_scoped_attribute_spec): ?: operator formatting
fixes.
* basic-block.h (FOR_BB_INSNS_SAFE): Likewise.
* cfgcleanup.cc (outgoing_edges_match): Likewise.
* cgraph.cc (cgraph_node::dump): Likewise.
* config/arc/arc.cc (gen_acc1, gen_acc2): Likewise.
* config/arc/arc.h (CLASS_MAX_NREGS, CONSTANT_ADDRESS_P): Likewise.
* config/arm/arm.cc (arm_print_operand): Likewise.
* config/cris/cris.md (*b): Likewise.
* config/darwin.cc (darwin_asm_declare_object_name,
darwin_emit_common): Likewise.
* config/darwin-driver.cc (darwin_driver_init): Likewise.
* config/epiphany/epiphany.md (call, sibcall, call_value,
sibcall_value): Likewise.
* config/i386/i386.cc (gen_push2): Likewise.
* config/i386/i386.h (ix86_cur_cost): Likewise.
* config/i386/openbsdelf.h (FUNCTION_PROFILER): Likewise.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins):
Likewise.
* config/loongarch/loongarch-cpu.cc (fill_native_cpu_config):
Likewise.
* config/riscv/riscv.cc (riscv_union_memmodels): Likewise.
* config/riscv/zc.md (*mva01s, *mvsa01): Likewise.
* config/rs6000/mmintrin.h (_mm_cmpeq_pi8, _mm_cmpgt_pi8,
_mm_cmpeq_pi16, _mm_cmpgt_pi16, _mm_cmpeq_pi32, _mm_cmpgt_pi32):
Likewise.
* config/v850/predicates.md (pattern_is_ok_for_prologue): Likewise.
* config/xtensa/constraints.md (d, C, W): Likewise.
* coverage.cc (coverage_begin_function, build_init_ctor,
build_gcov_exit_decl): Likewise.
* df-problems.cc (df_create_unused_note): Likewise.
* diagnostic.cc (diagnostic_set_caret_max_width): Likewise.
* diagnostic-path.cc (path_summary::path_summary): Likewise.
* expr.cc (expand_expr_divmod): Likewise.
* gcov.cc (format_gcov): Likewise.
* gcov-dump.cc (dump_gcov_file): Likewise.
* genmatch.cc (main): Likewise.
* incpath.cc (remove_duplicates, register_include_chains): Likewise.
* ipa-devirt.cc (dump_odr_type): Likewise.
* ipa-icf.cc (sem_item_optimizer::merge_classes): Likewise.
* ipa-inline.cc (inline_small_functions): Likewise.
* ipa-polymorphic-call.cc (ipa_polymorphic_call_context::dump):
Likewise.
* ipa-sra.cc (create_parameter_descriptors): Likewise.
* ipa-utils.cc (find_always_executed_bbs): Likewise.
* predict.cc (predict_loops): Likewise.
* selftest.cc (read_file): Likewise.
* sreal.h (SREAL_SIGN, SREAL_ABS): Likewise.
* tree-dump.cc (dequeue_and_dump): Likewise.
* tree-ssa-ccp.cc (bit_value_binop): Likewise.
gcc/c-family/
* c-opts.cc (c_common_init_options, c_common_handle_option,
c_common_finish, set_std_c89, set_std_c99, set_std_c11,
set_std_c17, set_std_c23, set_std_cxx98, set_std_cxx11,
set_std_cxx14, set_std_cxx17, set_std_cxx20, set_std_cxx23,
set_std_cxx26): ?: operator formatting fixes.
gcc/cp/
* search.cc (lookup_member): ?: operator formatting fixes.
* typeck.cc (cp_build_modify_expr): Likewise.
libcpp/
* expr.cc (interpret_float_suffix): ?: operator formatting fixes.

--- gcc/attribs.cc.jj   2024-10-01 09:38:57.539968487 +0200
+++ gcc/attribs.cc  2024-10-16 12:22:13.136273474 +0200
@@ -381,7 +381,7 @@ lookup_scoped_attribute_spec (const_tree
   struct substring attr;
   scoped_attributes *attrs;
 
-  const char *ns_str = (ns != NULL_TREE) ? IDENTIFIER_POINTER (ns): NULL;
+  const char *ns_str = (ns != NULL_TREE) ? IDENTIFIER_POINTER (ns) : NULL;
 
   attrs = find_attribute_namespace (ns_str);
 
--- gcc/basic-block.h.jj   2024-01-03 11:51:30.094751134 +0100
+++ gcc/basic-block.h   2024-10-16 12:21:59.863461369 +0200
@@ -224,7 +224,7 @@ enum cfg_bb_flags
 /* For iterating over insns in basic block when we might remove the
current insn.  */
 #define FOR_BB_INSNS_SAFE(BB, INSN, CURR)  

Re: [PATCH v15 2/4] gcc/: Rename array_type_nelts() => array_type_nelts_minus_one()

2024-10-16 Thread Alejandro Colomar
On Wed, Oct 16, 2024 at 10:34:21AM GMT, Joseph Myers wrote:
> On Wed, 16 Oct 2024, Alejandro Colomar wrote:
> 
> > The old name was misleading.
> > 
> > While at it, also rename some temporary variables that are used with
> > this function, for consistency.
> 
> This patch is OK.  Note that in ChangeLog entries as in other 
> documentation, a function is referred to just as function_name, not as 
> function_name().

Should I also change the Subject (1st line of the commit message)?

> 
> -- 
> Joseph S. Myers
> josmy...@redhat.com
> 


Re: [PATCH v15 2/4] gcc/: Rename array_type_nelts() => array_type_nelts_minus_one()

2024-10-16 Thread Joseph Myers
On Wed, 16 Oct 2024, Alejandro Colomar wrote:

> On Wed, Oct 16, 2024 at 10:34:21AM GMT, Joseph Myers wrote:
> > On Wed, 16 Oct 2024, Alejandro Colomar wrote:
> > 
> > > The old name was misleading.
> > > 
> > > While at it, also rename some temporary variables that are used with
> > > this function, for consistency.
> > 
> > This patch is OK.  Note that in ChangeLog entries as in other 
> > documentation, a function is referred to just as function_name, not as 
> > function_name().
> 
> Should I also change the Subject (1st line of the commit message)?

Yes.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: testsuite: Prepare for -std=gnu23 default

2024-10-16 Thread Jakub Jelinek
On Wed, Oct 16, 2024 at 11:01:49AM +, Joseph Myers wrote:
> --- a/gcc/testsuite/c-c++-common/Wcast-function-type.c
> +++ b/gcc/testsuite/c-c++-common/Wcast-function-type.c
> @@ -1,5 +1,6 @@
>  /* { dg-do compile } */
> -/* { dg-options "-Wcast-function-type" } */
> +/* { dg-options "-std=gnu17 -Wcast-function-type" { target c } } */
> +/* { dg-options "-Wcast-function-type" { target c++ } } */

I think better would be
/* { dg-options "-Wcast-function-type" } */
/* { dg-additional-options "-std=gnu17" { target c } } */

> --- a/gcc/testsuite/c-c++-common/Wformat-pr84258.c
> +++ b/gcc/testsuite/c-c++-common/Wformat-pr84258.c
> @@ -1,4 +1,5 @@
> -/* { dg-options "-Wformat" } */
> +/* { dg-options "-std=gnu17 -Wformat" { target c } } */
> +/* { dg-options "-Wformat" { target c++ } } */

Similarly.

> --- a/gcc/testsuite/c-c++-common/Wvarargs.c
> +++ b/gcc/testsuite/c-c++-common/Wvarargs.c
> @@ -1,4 +1,5 @@
>  /* { dg-do compile } */
> +/* { dg-options "-std=gnu17" { target c } } */

Wasn't the test -pedantic-errors before?
Generally, I'd prefer dg-additional-options for tests
which don't already have dg-options into which one can just add the new
flag, or as in the above cases where a new flag is added only conditionally.
I'm just not 100% sure whether it works in lto tests...
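
A sketch of how the header of such a test could look with that layout,
using the directives from the first hunk above.  The function body is a
placeholder and not taken from any real test.

```c
/* { dg-do compile } */
/* { dg-options "-Wcast-function-type" } */
/* { dg-additional-options "-std=gnu17" { target c } } */

/* Placeholder body; a real test would contain the code under test.  */
int
placeholder (int x)
{
  return x + 1;
}
```

The shared flags stay in one dg-options line and only the C-specific
-std=gnu17 is added conditionally.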

> --- a/gcc/testsuite/c-c++-common/sizeof-array-argument.c
> +++ b/gcc/testsuite/c-c++-common/sizeof-array-argument.c
> @@ -1,5 +1,6 @@
>  /* PR c/6940 */
>  /* { dg-do compile } */
> +/* { dg-options "-Wno-old-style-definition" { target c } } */

Likewise.

> --- a/gcc/testsuite/gcc.c-torture/compile/20040214-2.c
> +++ b/gcc/testsuite/gcc.c-torture/compile/20040214-2.c
> @@ -1,4 +1,5 @@
>  /* http://gcc.gnu.org/ml/gcc-patches/2004-02/msg01307.html */
> +/* { dg-options "-std=gnu17" } */

I think better use dg-additional-options in gcc.c-torture/
(several times).  Wonder if dg-options e.g. doesn't override the
default -w.

> --- a/gcc/testsuite/gcc.c-torture/compile/pr100241-1.c
> +++ b/gcc/testsuite/gcc.c-torture/compile/pr100241-1.c
> @@ -1,5 +1,6 @@
>  /* { dg-require-visibility "" } */
> -/* { dg-options "-fvisibility=internal -fPIC" { target fpic } } */
> +/* { dg-options "-std=gnu17" { target { ! fpic } } } */
> +/* { dg-options "-std=gnu17 -fvisibility=internal -fPIC" { target fpic } } */

I'd keep it as is + dg-additional-options.  Duplicating the options is
error-prone.

Otherwise LGTM.

Jakub



Re: [PATCH v2] alpha: Add -mlra option

2024-10-16 Thread John Paul Adrian Glaubitz
On Wed, 2024-10-16 at 09:39 +0200, Richard Biener wrote:
> > Disabling M2 is enough to fix this.
> 
> For practical purposes all reload->LRA conversions should focus on C
> and C++.  Everything
> else is optional and not required to keep a port live (I'd argue it's
> wasting cycles to look
> at anything beyond C/C++ until those work with a mostly clean testsuite).

So far I have found two issues: One is the LRA build failing on non-BWX
targets and the other is the M2 failure. I'll open bug reports for these.

In any case, I'm glad that Alpha bootstraps mostly fine with LRA enabled.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913


[PATCH] testsuite: Add -march=x86-64-v3 to AVX10 testcases to silence warning for GCC built with AVX512 arch

2024-10-16 Thread Haochen Jiang
Hi all,

Currently, when GCC is configured with --with-arch=native on AVX512
machines, running the AVX10.2 testcases produces vector size warnings.
This is expected but annoying.  Simply add -march=x86-64-v3 to override
--with-arch=native and silence all the warnings.

Tested on x86-64-linux-gnu. Ok for trunk?

Thx,
Haochen

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10_1-25.c: Add -march=x86-64-v3.
* gcc.target/i386/avx10_1-26.c: Ditto.
* gcc.target/i386/avx10_2-512-bf-vector-cmpp-1.c: Ditto.
* gcc.target/i386/avx10_2-512-bf-vector-fma-1.c: Ditto.
* gcc.target/i386/avx10_2-512-bf-vector-operations-1.c: Ditto.
* gcc.target/i386/avx10_2-512-bf-vector-smaxmin-1.c: Ditto.
* gcc.target/i386/avx10_2-512-bf16-1.c: Ditto.
* gcc.target/i386/avx10_2-512-convert-1.c: Ditto.
* gcc.target/i386/avx10_2-512-media-1.c: Ditto.
* gcc.target/i386/avx10_2-512-minmax-1.c: Ditto.
* gcc.target/i386/avx10_2-512-satcvt-1.c: Ditto.
* gcc.target/i386/avx10_2-512-vaddnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcmppbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvt2ps2phx-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvthf82ph-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtne2ph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtne2ph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtne2ph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtne2ph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtnebf162ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtnebf162iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtneph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtneph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtneph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtneph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtph2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtph2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtps2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtps2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttnebf162ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttnebf162iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttpd2dqs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttpd2qqs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttpd2udqs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttpd2uqqs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttph2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttph2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttps2dqs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttps2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttps2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttps2qqs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttps2udqs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttps2uqqs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vdivnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vdpphps-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vfmaddXXXnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vfmsubXXXnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vfnmaddXXXnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vfnmsubXXXnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vfpclasspbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vgetexppbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vgetmantpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vmaxpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vminmaxnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vminmaxpd-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vminmaxph-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vminmaxps-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vminpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vmpsadbw-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vmulnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpbssd-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpbssds-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpbsud-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpbsuds-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpbuud-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpbuuds-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpwsud-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpwsuds-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpwusd-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpwusds-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpwuud-2

Re: [PATCH v15 4/4] c: Add __countof__ operator

2024-10-16 Thread Alejandro Colomar
Hi Joseph,

On Wed, Oct 16, 2024 at 10:30:36AM GMT, Joseph Myers wrote:
> On Wed, 16 Oct 2024, Alejandro Colomar wrote:
> 
> > +  if (type_code != ARRAY_TYPE)
> > +{
> > +  error_at (loc, "invalid application of %<countof%> to type %qT", type);
> > +  return error_mark_node;
> > +}
> > +  if (!COMPLETE_TYPE_P (type))
> > +{
> > +  error_at (loc,
> > +   "invalid application of %<countof%> to incomplete type %qT",
> 
> It is never appropriate, regardless of the actual operator naming, for a 
> diagnostic to refer to an operator name that doesn't exist at all (such as 
> countof instead of __countof__ in this patch).

Thanks.  I'll fix that.

Cheers,
Alex
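
For readers without the patch applied, the usual way to count array
elements in C today is a sizeof-based macro.  The sketch below uses a
made-up macro name and is not the operator from this series; unlike the
operator, which per the diagnostics quoted below rejects non-array
operands, the macro silently accepts pointers.

```c
#include <stddef.h>

/* Number of elements in the array A.  Gives a wrong answer without any
   diagnostic if A is a pointer rather than an array.  */
#define ARRAY_COUNT(a) (sizeof (a) / sizeof ((a)[0]))

static size_t
count_rows (void)
{
  int grid[3][5];
  return ARRAY_COUNT (grid);
}

static size_t
count_row_elements (void)
{
  int grid[3][5];
  return ARRAY_COUNT (grid[0]);
}
```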

> 
> > @@ -8992,12 +8992,17 @@ start_struct (location_t loc, enum tree_code code, 
> > tree name,
> >   within a statement expr used within sizeof, et. al.  This is not
> >   terribly serious as C++ doesn't permit statement exprs within
> >   sizeof anyhow.  */
> > -  if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof))
> > +  if (warn_cxx_compat
> > +  && (in_sizeof || in_typeof || in_alignof || in_countof))
> >  warning_at (loc, OPT_Wc___compat,
> > "defining type in %qs expression is invalid in C++",
> > (in_sizeof
> >  ? "sizeof"
> > -: (in_typeof ? "typeof" : "alignof")));
> > +: (in_typeof
> > +   ? "typeof"
> > +   : (in_alignof
> > +  ? "alignof"
> > +  : "countof"))));
> 
> Likewise.
> 
> > @@ -10135,12 +10140,17 @@ start_enum (location_t loc, struct 
> > c_enum_contents *the_enum, tree name,
> >/* FIXME: This will issue a warning for a use of a type defined
> >   within sizeof in a statement expr.  This is not terribly serious
> >   as C++ doesn't permit statement exprs within sizeof anyhow.  */
> > -  if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof))
> > +  if (warn_cxx_compat
> > +  && (in_sizeof || in_typeof || in_alignof || in_countof))
> >  warning_at (loc, OPT_Wc___compat,
> > "defining type in %qs expression is invalid in C++",
> > (in_sizeof
> >  ? "sizeof"
> > -: (in_typeof ? "typeof" : "alignof")));
> > +: (in_typeof
> > +   ? "typeof"
> > +   : (in_alignof
> > +  ? "alignof"
> > +  : "countof"))));
> 
> Likewise.
> 
> >  static struct c_expr
> > -c_parser_sizeof_expression (c_parser *parser)
> > +c_parser_sizeof_or_countof_expression (c_parser *parser, enum rid rid)
> >  {
> > +  const char *op_name = (rid == RID_COUNTOF) ? "countof" : "sizeof";
> 
> Likewise.
> 
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > index 302c3299ede..82f31668e37 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -10555,6 +10555,36 @@ If the operand of the @code{__alignof__} 
> > expression is a function,
> >  the expression evaluates to the alignment of the function which may
> >  be specified by attribute @code{aligned} (@pxref{Common Function 
> > Attributes}).
> >  
> > +@node countof
> > +@section Determining the Number of Elements of Arrays
> > +@cindex countof
> > +@cindex number of elements
> 
> Likewise, for node name and index entry.
> 
> -- 
> Joseph S. Myers
> josmy...@redhat.com
> 


Re: [PATCH v15 2/4] gcc/: Rename array_type_nelts() => array_type_nelts_minus_one()

2024-10-16 Thread Joseph Myers
On Wed, 16 Oct 2024, Alejandro Colomar wrote:

> The old name was misleading.
> 
> While at it, also rename some temporary variables that are used with
> this function, for consistency.

This patch is OK.  Note that in ChangeLog entries as in other 
documentation, a function is referred to just as function_name, not as 
function_name().
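
To illustrate why the new name is less misleading, here is a toy model in
plain C.  It assumes the value is derived from inclusive index bounds; the
struct and both functions are hypothetical and are not GCC's tree API.

```c
/* Inclusive index bounds of an array, e.g. {0, 9} for int a[10].  */
struct toy_domain
{
  long min;
  long max;
};

/* Difference of the bounds: one less than the element count.  */
static long
toy_nelts_minus_one (struct toy_domain d)
{
  return d.max - d.min;
}

/* The actual element count needs the extra + 1.  */
static long
toy_nelts (struct toy_domain d)
{
  return toy_nelts_minus_one (d) + 1;
}
```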

-- 
Joseph S. Myers
josmy...@redhat.com



[committed] testsuite: Add tests for C23 __STDC_VERSION__

2024-10-16 Thread Joseph Myers
Add some tests for the value of __STDC_VERSION__ in C23 mode.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

* gcc.dg/c23-version-1.c, gcc.dg/c23-version-2.c,
gcc.dg/gnu23-version-1.c: New tests.

diff --git a/gcc/testsuite/gcc.dg/c23-version-1.c b/gcc/testsuite/gcc.dg/c23-version-1.c
new file mode 100644
index 000..2145f974297
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c23-version-1.c
@@ -0,0 +1,9 @@
+/* Test __STDC_VERSION__ for C23.  Test -std=c23.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -pedantic-errors" } */
+
+#if __STDC_VERSION__ == 202311L
+int i;
+#else
+#error "Bad __STDC_VERSION__."
+#endif
diff --git a/gcc/testsuite/gcc.dg/c23-version-2.c b/gcc/testsuite/gcc.dg/c23-version-2.c
new file mode 100644
index 000..3d44b7324c9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c23-version-2.c
@@ -0,0 +1,9 @@
+/* Test __STDC_VERSION__ for C23.  Test -std=iso9899:2024.  */
+/* { dg-do compile } */
+/* { dg-options "-std=iso9899:2024 -pedantic-errors" } */
+
+#if __STDC_VERSION__ == 202311L
+int i;
+#else
+#error "Bad __STDC_VERSION__."
+#endif
diff --git a/gcc/testsuite/gcc.dg/gnu23-version-1.c b/gcc/testsuite/gcc.dg/gnu23-version-1.c
new file mode 100644
index 000..649f13b54ec
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gnu23-version-1.c
@@ -0,0 +1,9 @@
+/* Test __STDC_VERSION__ for C23 with GNU extensions.  Test -std=gnu23.  */
+/* { dg-do compile } */
+/* { dg-options "-std=gnu23 -pedantic-errors" } */
+
+#if __STDC_VERSION__ == 202311L
+int i;
+#else
+#error "Bad __STDC_VERSION__."
+#endif
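
Outside the testsuite, the same macro can be used to pick code paths per
language version.  A minimal sketch, with a made-up function name, using
the published __STDC_VERSION__ values:

```c
/* Map __STDC_VERSION__ to a readable name.  Compilers in a pre-C23
   "c2x" mode may report an intermediate value and land in the C17
   branch here.  */
static const char *
c_standard_name (void)
{
#if !defined (__STDC_VERSION__)
  return "C90";
#elif __STDC_VERSION__ >= 202311L
  return "C23 or later";
#elif __STDC_VERSION__ >= 201710L
  return "C17";
#elif __STDC_VERSION__ >= 201112L
  return "C11";
#elif __STDC_VERSION__ >= 199901L
  return "C99";
#else
  return "C95";
#endif
}
```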

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH v15 2/4] gcc/: Rename array_type_nelts() => array_type_nelts_minus_one()

2024-10-16 Thread Alejandro Colomar
Hi Joseph,

On Wed, Oct 16, 2024 at 10:34:21AM GMT, Joseph Myers wrote:
> On Wed, 16 Oct 2024, Alejandro Colomar wrote:
> 
> > The old name was misleading.
> > 
> > While at it, also rename some temporary variables that are used with
> > this function, for consistency.
> 
> This patch is OK.  Note that in ChangeLog entries as in other 
> documentation, a function is referred to just as function_name, not as 
> function_name().

Thanks!  I'll fix that in v16.

Cheers,
Alex


Re: [PATCH] sparc: drop -mlra

2024-10-16 Thread Sam James
Eric Botcazou  writes:

>> Let's finish the transition by dropping -mlra entirely.
>> 
>> Tested on sparc64-unknown-linux-gnu with no regressions.
>> 
>> gcc/ChangeLog:
>>  PR target/113952
>> 
>>  * config/sparc/sparc.cc (sparc_lra_p): Delete.
>>  (TARGET_LRA_P): Ditto.
>>  (sparc_option_override): Don't use MASK_LRA.
>>  * config/sparc/sparc.md (disabled,enabled): Drop lra attribute.
>>  * config/sparc/sparc.opt: Delete -mlra.
>>  * config/sparc/sparc.opt.urls: Ditto.
>>  * doc/invoke.texi (SPARC options): Drop -mlra and -mno-lra.
>
> OK, thanks! (modulo the blank line in the ChangeLog)

Thank you! (Will fix.)


Re: [PATCH 5/5] arm: [MVE intrinsics] Rework MVE vld/vst intrinsics

2024-10-16 Thread Christophe Lyon




On 10/15/24 16:30, Richard Earnshaw (lists) wrote:

On 16/09/2024 10:38, Christophe Lyon wrote:

From: Alfie Richards 

Implement the mve vld and vst intrinsics using the MVE builtins framework.

The main part of the patch is to reimplement the vstr/vldr patterns
so that we now have far fewer of them:
- non-truncating stores
- predicated non-truncating stores
- truncating stores
- predicated truncating stores
- non-extending loads
- predicated non-extending loads
- extending loads
- predicated extending loads

This enables us to update the implementation of vld1/vst1 and use the
new vldr/vstr builtins.

The patch also adds support for the predicated vld1/vst1 versions.
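
As a rough scalar model of two of the categories above (a predicated
extending load and a predicated truncating store), consider the plain C
sketch below.  It does not use arm_mve.h; the function names are made up,
and the predicate is simplified to one bool per 16-bit lane, which is an
assumption of this sketch rather than how the hardware encodes it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { LANES = 8 };  /* eight 16-bit lanes fill a 128-bit vector */

/* Extending load: each byte in memory is zero-extended into a 16-bit
   lane.  Inactive lanes are set to zero without reading memory.  */
static void
model_load_u8_to_u16_z (uint16_t dst[LANES], const uint8_t *src,
                        const bool active[LANES])
{
  for (size_t i = 0; i < LANES; i++)
    dst[i] = active[i] ? src[i] : 0;
}

/* Truncating store: only the low byte of each 16-bit lane is written.
   Inactive lanes leave memory untouched.  */
static void
model_store_u16_to_u8_p (uint8_t *dst, const uint16_t src[LANES],
                         const bool active[LANES])
{
  for (size_t i = 0; i < LANES; i++)
    if (active[i])
      dst[i] = (uint8_t) (src[i] & 0xff);
}
```

Passing an all-true predicate gives the non-predicated behaviour, which
hints at why the predicated and non-predicated forms can share patterns.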

2024-09-11  Alfie Richards  
Christophe Lyon  

gcc/

* config/arm/arm-mve-builtins-base.cc (vld1q_impl): Add support
for predicated version.
(vst1q_impl): Likewise.
(vstrq_impl): New class.
(vldrq_impl): New class.
(vldrbq): New.
(vldrhq): New.
(vldrwq): New.
(vstrbq): New.
(vstrhq): New.
(vstrwq): New.
* config/arm/arm-mve-builtins-base.def (vld1q): Add predicated
version.
(vldrbq): New.
(vldrhq): New.
(vldrwq): New.
(vst1q): Add predicated version.
(vstrbq): New.
(vstrhq): New.
(vstrwq): New.
(vrev32q): Update types to float_16.
* config/arm/arm-mve-builtins-base.h (vldrbq): New.
(vldrhq): New.
(vldrwq): New.
(vstrbq): New.
(vstrhq): New.
(vstrwq): New.
* config/arm/arm-mve-builtins-functions.h (memory_vector_mode):
Remove conversion of floating point vectors to integer.
* config/arm/arm-mve-builtins.cc (TYPES_float16): Change to...
(TYPES_float_16): ...this.
(TYPES_float_32): New.
(float16): Change to...
(float_16): ...this.
(float_32): New.
(preds_z_or_none): New.
(function_resolver::check_gp_argument): Add support for _z
predicate.
* config/arm/arm_mve.h (vstrbq): Remove.
(vstrbq_p): Likewise.
(vstrhq): Likewise.
(vstrhq_p): Likewise.
(vstrwq): Likewise.
(vstrwq_p): Likewise.
(vst1q_p): Likewise.
(vld1q_z): Likewise.
(vldrbq_s8): Likewise.
(vldrbq_u8): Likewise.
(vldrbq_s16): Likewise.
(vldrbq_u16): Likewise.
(vldrbq_s32): Likewise.
(vldrbq_u32): Likewise.
(vstrbq_p_s8): Likewise.
(vstrbq_p_s32): Likewise.
(vstrbq_p_s16): Likewise.
(vstrbq_p_u8): Likewise.
(vstrbq_p_u32): Likewise.
(vstrbq_p_u16): Likewise.
(vldrbq_z_s16): Likewise.
(vldrbq_z_u8): Likewise.
(vldrbq_z_s8): Likewise.
(vldrbq_z_s32): Likewise.
(vldrbq_z_u16): Likewise.
(vldrbq_z_u32): Likewise.
(vldrhq_s32): Likewise.
(vldrhq_s16): Likewise.
(vldrhq_u32): Likewise.
(vldrhq_u16): Likewise.
(vldrhq_z_s32): Likewise.
(vldrhq_z_s16): Likewise.
(vldrhq_z_u32): Likewise.
(vldrhq_z_u16): Likewise.
(vldrwq_s32): Likewise.
(vldrwq_u32): Likewise.
(vldrwq_z_s32): Likewise.
(vldrwq_z_u32): Likewise.
(vldrhq_f16): Likewise.
(vldrhq_z_f16): Likewise.
(vldrwq_f32): Likewise.
(vldrwq_z_f32): Likewise.
(vstrhq_f16): Likewise.
(vstrhq_s32): Likewise.
(vstrhq_s16): Likewise.
(vstrhq_u32): Likewise.
(vstrhq_u16): Likewise.
(vstrhq_p_f16): Likewise.
(vstrhq_p_s32): Likewise.
(vstrhq_p_s16): Likewise.
(vstrhq_p_u32): Likewise.
(vstrhq_p_u16): Likewise.
(vstrwq_f32): Likewise.
(vstrwq_s32): Likewise.
(vstrwq_u32): Likewise.
(vstrwq_p_f32): Likewise.
(vstrwq_p_s32): Likewise.
(vstrwq_p_u32): Likewise.
(vst1q_p_u8): Likewise.
(vst1q_p_s8): Likewise.
(vld1q_z_u8): Likewise.
(vld1q_z_s8): Likewise.
(vst1q_p_u16): Likewise.
(vst1q_p_s16): Likewise.
(vld1q_z_u16): Likewise.
(vld1q_z_s16): Likewise.
(vst1q_p_u32): Likewise.
(vst1q_p_s32): Likewise.
(vld1q_z_u32): Likewise.
(vld1q_z_s32): Likewise.
(vld1q_z_f16): Likewise.
(vst1q_p_f16): Likewise.
(vld1q_z_f32): Likewise.
(vst1q_p_f32): Likewise.
(__arm_vstrbq_s8): Likewise.
(__arm_vstrbq_s32): Likewise.
(__arm_vstrbq_s16): Likewise.
(__arm_vstrbq_u8): Likewise.
(__arm_vstrbq_u32): Likewise.
(__arm_vstrbq_u16): Likewise.
(__arm_vldrbq_s8): Likewise.
(__arm_vldrbq_u8): Likewise.
(__arm_vldrbq_s16): Likewise.
(__arm_vldrbq_u16): Likewise.
(__arm_vldrbq_s32): Likewise.
(__arm_vldrbq_u32): Likewise.
(__arm_vstrbq_p_s8): Likewise.
(__arm_vst

Re: [PATCH v2 06/36] arm: [MVE intrinsics] rework vcvtq

2024-10-16 Thread Christophe Lyon




On 10/14/24 19:05, Richard Earnshaw (lists) wrote:

On 04/09/2024 14:26, Christophe Lyon wrote:

Implement vcvtq using the new MVE builtins framework.

In config/arm/arm-mve-builtins-base.def, the patch also restores the
alphabetical order.

2024-07-11  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (class vcvtq_impl): New.
(vcvtq): New.
* config/arm/arm-mve-builtins-base.def (vcvtq): New.
* config/arm/arm-mve-builtins-base.h (vcvtq): New.
* config/arm/arm-mve-builtins.cc (cvt): New type.
* config/arm/arm_mve.h (vcvtq): Delete.
(vcvtq_n): Delete.
(vcvtq_m): Delete.
(vcvtq_m_n): Delete.
(vcvtq_x): Delete.
(vcvtq_x_n): Delete.
(vcvtq_f16_s16): Delete.
(vcvtq_f32_s32): Delete.
(vcvtq_f16_u16): Delete.
(vcvtq_f32_u32): Delete.
(vcvtq_s16_f16): Delete.
(vcvtq_s32_f32): Delete.
(vcvtq_u16_f16): Delete.
(vcvtq_u32_f32): Delete.
(vcvtq_n_f16_s16): Delete.
(vcvtq_n_f32_s32): Delete.
(vcvtq_n_f16_u16): Delete.
(vcvtq_n_f32_u32): Delete.
(vcvtq_n_s16_f16): Delete.
(vcvtq_n_s32_f32): Delete.
(vcvtq_n_u16_f16): Delete.
(vcvtq_n_u32_f32): Delete.
(vcvtq_m_f16_s16): Delete.
(vcvtq_m_f16_u16): Delete.
(vcvtq_m_f32_s32): Delete.
(vcvtq_m_f32_u32): Delete.
(vcvtq_m_s16_f16): Delete.
(vcvtq_m_u16_f16): Delete.
(vcvtq_m_s32_f32): Delete.
(vcvtq_m_u32_f32): Delete.
(vcvtq_m_n_f16_u16): Delete.
(vcvtq_m_n_f16_s16): Delete.
(vcvtq_m_n_f32_u32): Delete.
(vcvtq_m_n_f32_s32): Delete.
(vcvtq_m_n_s32_f32): Delete.
(vcvtq_m_n_s16_f16): Delete.
(vcvtq_m_n_u32_f32): Delete.
(vcvtq_m_n_u16_f16): Delete.
(vcvtq_x_f16_u16): Delete.
(vcvtq_x_f16_s16): Delete.
(vcvtq_x_f32_s32): Delete.
(vcvtq_x_f32_u32): Delete.
(vcvtq_x_n_f16_s16): Delete.
(vcvtq_x_n_f16_u16): Delete.
(vcvtq_x_n_f32_s32): Delete.
(vcvtq_x_n_f32_u32): Delete.
(vcvtq_x_s16_f16): Delete.
(vcvtq_x_s32_f32): Delete.
(vcvtq_x_u16_f16): Delete.
(vcvtq_x_u32_f32): Delete.
(vcvtq_x_n_s16_f16): Delete.
(vcvtq_x_n_s32_f32): Delete.
(vcvtq_x_n_u16_f16): Delete.
(vcvtq_x_n_u32_f32): Delete.
(__arm_vcvtq_f16_s16): Delete.
(__arm_vcvtq_f32_s32): Delete.
(__arm_vcvtq_f16_u16): Delete.
(__arm_vcvtq_f32_u32): Delete.
(__arm_vcvtq_s16_f16): Delete.
(__arm_vcvtq_s32_f32): Delete.
(__arm_vcvtq_u16_f16): Delete.
(__arm_vcvtq_u32_f32): Delete.
(__arm_vcvtq_n_f16_s16): Delete.
(__arm_vcvtq_n_f32_s32): Delete.
(__arm_vcvtq_n_f16_u16): Delete.
(__arm_vcvtq_n_f32_u32): Delete.
(__arm_vcvtq_n_s16_f16): Delete.
(__arm_vcvtq_n_s32_f32): Delete.
(__arm_vcvtq_n_u16_f16): Delete.
(__arm_vcvtq_n_u32_f32): Delete.
(__arm_vcvtq_m_f16_s16): Delete.
(__arm_vcvtq_m_f16_u16): Delete.
(__arm_vcvtq_m_f32_s32): Delete.
(__arm_vcvtq_m_f32_u32): Delete.
(__arm_vcvtq_m_s16_f16): Delete.
(__arm_vcvtq_m_u16_f16): Delete.
(__arm_vcvtq_m_s32_f32): Delete.
(__arm_vcvtq_m_u32_f32): Delete.
(__arm_vcvtq_m_n_f16_u16): Delete.
(__arm_vcvtq_m_n_f16_s16): Delete.
(__arm_vcvtq_m_n_f32_u32): Delete.
(__arm_vcvtq_m_n_f32_s32): Delete.
(__arm_vcvtq_m_n_s32_f32): Delete.
(__arm_vcvtq_m_n_s16_f16): Delete.
(__arm_vcvtq_m_n_u32_f32): Delete.
(__arm_vcvtq_m_n_u16_f16): Delete.
(__arm_vcvtq_x_f16_u16): Delete.
(__arm_vcvtq_x_f16_s16): Delete.
(__arm_vcvtq_x_f32_s32): Delete.
(__arm_vcvtq_x_f32_u32): Delete.
(__arm_vcvtq_x_n_f16_s16): Delete.
(__arm_vcvtq_x_n_f16_u16): Delete.
(__arm_vcvtq_x_n_f32_s32): Delete.
(__arm_vcvtq_x_n_f32_u32): Delete.
(__arm_vcvtq_x_s16_f16): Delete.
(__arm_vcvtq_x_s32_f32): Delete.
(__arm_vcvtq_x_u16_f16): Delete.
(__arm_vcvtq_x_u32_f32): Delete.
(__arm_vcvtq_x_n_s16_f16): Delete.
(__arm_vcvtq_x_n_s32_f32): Delete.
(__arm_vcvtq_x_n_u16_f16): Delete.
(__arm_vcvtq_x_n_u32_f32): Delete.
(__arm_vcvtq): Delete.
(__arm_vcvtq_n): Delete.
(__arm_vcvtq_m): Delete.
(__arm_vcvtq_m_n): Delete.
(__arm_vcvtq_x): Delete.
(__arm_vcvtq_x_n): Delete.
---
  gcc/config/arm/arm-mve-builtins-base.cc  | 113 
  gcc/config/arm/arm-mve-builtins-base.def |  19 +-
  gcc/config/arm/arm-mve-builtins-base.h   |   1 +
  gcc/config/arm/arm-mve-builtins.cc   |  15 +
  gcc/config/arm/arm_mve.h | 666 ---
  5 files changed, 139 insertions(+), 675 deletions(-)

diff --git a/gcc/config/arm/ar

Re: [PATCH v2 02/36] arm: [MVE intrinsics] remove useless resolve from create shape

2024-10-16 Thread Christophe Lyon




On 10/14/24 18:39, Richard Earnshaw (lists) wrote:

On 04/09/2024 14:26, Christophe Lyon wrote:

vcreateq has no overloaded forms, so there's no need for resolve ().

2024-07-11  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (create_def): Remove
resolve.


Wouldn't it be more usual to write (create_def::resolve): Delete function?

Otherwise OK.



Sure, I'll change that.


R.


---
  gcc/config/arm/arm-mve-builtins-shapes.cc | 6 --
  1 file changed, 6 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index e01939469e3..0520a8331db 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1408,12 +1408,6 @@ struct create_def : public nonoverloaded_base
{
  build_all (b, "v0,su64,su64", group, MODE_none, preserve_user_namespace);
}
-
-  tree
-  resolve (function_resolver &r) const override
-  {
-return r.resolve_uniform (0, 2);
-  }
  };
  SHAPE (create)
  




Re: [PATCH v2 34/36] arm: [MVE intrinsics] rework vadcq

2024-10-16 Thread Christophe Lyon




On 10/15/24 11:18, Richard Earnshaw wrote:

On 14/10/2024 19:18, Richard Earnshaw (lists) wrote:

On 04/09/2024 14:26, Christophe Lyon wrote:

Implement vadcq using the new MVE builtins framework.

We re-use most of the code introduced by the previous patch to support
vadciq: we just need to initialize carry from the input parameter.

2024-08-28  Christophe Lyon  

gcc/

* config/arm/arm-mve-builtins-base.cc (vadcq_vsbc): Add support
for vadcq.
* config/arm/arm-mve-builtins-base.def (vadcq): New.
* config/arm/arm-mve-builtins-base.h (vadcq): New.
* config/arm/arm_mve.h (vadcq): Delete.
(vadcq_m): Delete.
(vadcq_s32): Delete.
(vadcq_u32): Delete.
(vadcq_m_s32): Delete.
(vadcq_m_u32): Delete.
(__arm_vadcq_s32): Delete.
(__arm_vadcq_u32): Delete.
(__arm_vadcq_m_s32): Delete.
(__arm_vadcq_m_u32): Delete.
(__arm_vadcq): Delete.
(__arm_vadcq_m): Delete.



+if (!m_init_carry)
+  {
+   /* Prepare carry in:
+  set_fpscr ( (fpscr & ~0x20000000u)
+  | ((*carry & 1u) << 29) )  */
+   rtx carry_in = gen_reg_rtx (SImode);
+   rtx fpscr = gen_reg_rtx (SImode);
+   emit_insn (gen_get_fpscr_nzcvqc (fpscr));
+   emit_insn (gen_rtx_SET (carry_in, gen_rtx_MEM (SImode, carry_ptr)));
+
+   emit_insn (gen_rtx_SET (carry_in,
+   gen_rtx_ASHIFT (SImode,
+   carry_in,
+   GEN_INT (29))));
+   emit_insn (gen_rtx_SET (carry_in,
+   gen_rtx_AND (SImode,
+carry_in,
+GEN_INT (0x20000000))));
+   emit_insn (gen_rtx_SET (fpscr,
+   gen_rtx_AND (SImode,
+fpscr,
+GEN_INT (~0x20000000))));
+   emit_insn (gen_rtx_SET (carry_in,
+   gen_rtx_IOR (SImode,
+carry_in,
+fpscr)));
+   emit_insn (gen_set_fpscr_nzcvqc (carry_in));
+  }


What's the logic here?  Are we just trying to set the C flag to *carry != 0 (is 
carry a bool?)?  Do we really need to preserve all the other bits in NZCV?  I 
wouldn't have thought so, suggesting that:

CMP *carry, #1  // Set C if *carry != 0

ought to be enough, without having to generate a read-modify-write sequence.


I realised last night that this is setting up the fpscr, not the cpsr, so my 
suggestion won't work.  I am concerned that expanding this too early will leave 
something that we can't optimize away if we have back-to-back vadcq intrinsics 
that chain the carry, but I guess this is no different from what we have 
already.



Indeed, this is just replicating what the previous implementation is doing.
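
For anyone following along, the carry-in sequence in the quoted hunk is a
read-modify-write of a single bit.  Below is a minimal sketch in plain C,
assuming ordinary integers stand in for the RTL registers; the helper name
fpscr_with_carry_in and the two macro names are made up for illustration and
are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The carry flag is bit 29, matching the "<< 29" in the patch comment.  */
#define FPSCR_CARRY_SHIFT 29
#define FPSCR_CARRY_MASK  (UINT32_C(1) << FPSCR_CARRY_SHIFT)  /* 0x20000000 */

/* Hypothetical helper: return FPSCR with its carry bit replaced by bit 0 of
   *carry, leaving every other bit untouched.  */
static uint32_t fpscr_with_carry_in(uint32_t fpscr, const unsigned *carry)
{
    assert(carry != NULL);
    /* Shift first, then mask, in the same order as the emitted insns.  */
    uint32_t carry_in = ((uint32_t)*carry << FPSCR_CARRY_SHIFT) & FPSCR_CARRY_MASK;
    return (fpscr & ~FPSCR_CARRY_MASK) | carry_in;
}
```

Only bit 0 of *carry matters here, so a caller passing 2 ends up with the
carry flag cleared, the same as passing 0.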


On that basis, this patch is also OK.  We may need to revisit this sequence 
later to check that we are removing redundant reads + sets.

R.



Thanks,

Christophe



R.


---
  gcc/config/arm/arm-mve-builtins-base.cc  | 61 +++--
  gcc/config/arm/arm-mve-builtins-base.def |  1 +
  gcc/config/arm/arm-mve-builtins-base.h   |  1 +
  gcc/config/arm/arm_mve.h | 87 
  4 files changed, 56 insertions(+), 94 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index 6f3b18c2915..9c2e11356ef 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -559,10 +559,19 @@ public:
  class vadc_vsbc_impl : public function_base
  {
  public:
+  CONSTEXPR vadc_vsbc_impl (bool init_carry)
+: m_init_carry (init_carry)
+  {}
+
+  /* Initialize carry with 0 (vadci).  */
+  bool m_init_carry;
+
unsigned int
call_properties (const function_instance &) const override
{
  unsigned int flags = CP_WRITE_MEMORY | CP_READ_FPCR;
+if (!m_init_carry)
+  flags |= CP_READ_MEMORY;
  return flags;
}
  
@@ -605,22 +614,59 @@ public:

  carry_ptr = e.args[carry_out_arg_no];
  e.args.ordered_remove (carry_out_arg_no);
  
+if (!m_init_carry)

+  {
+   /* Prepare carry in:
+  set_fpscr ( (fpscr & ~0x20000000u)
+  | ((*carry & 1u) << 29) )  */
+   rtx carry_in = gen_reg_rtx (SImode);
+   rtx fpscr = gen_reg_rtx (SImode);
+   emit_insn (gen_get_fpscr_nzcvqc (fpscr));
+   emit_insn (gen_rtx_SET (carry_in, gen_rtx_MEM (SImode, carry_ptr)));
+
+   emit_insn (gen_rtx_SET (carry_in,
+   gen_rtx_ASHIFT (SImode,
+   carry_in,
+   GEN_INT (29))));
+   emit_insn (gen_rtx_SET (carry_in,
+

[PATCH] SVE intrinsics: Add constant folding for svindex.

2024-10-16 Thread Jennifer Schmitz
This patch folds svindex with constant arguments into a vector series.
We implemented this in svindex_impl::fold using the function build_vec_series.
For example,
svuint64_t f1 ()
{
  return svindex_u64 (10, 3);
}
compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...}
in the gimple pass lower.
This optimization benefits cases where svindex is used in combination with
other gimple-level optimizations.
For example,
svuint64_t f2 ()
{
return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5);
}
has previously been compiled to
f2:
index   z0.d, #10, #3
mul z0.d, z0.d, #5
ret
Now, it is compiled to
f2:
mov x0, 50
index   z0.d, x0, #15
ret

For non-constant arguments, build_vec_series produces a VEC_SERIES_EXPR,
which is translated back at RTL level to an index instruction without codegen
changes.
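
The f2 result can be checked by hand: multiplying the series {10, 13, 16, ...}
by 5 gives another linear series with base 50 and step 15.  A small C sketch of
that arithmetic follows, assuming a fixed lane count (real SVE vectors are
length-agnostic).  The helper names vec_series and vec_scale are hypothetical
and only mirror what the series and the multiply denote; they are not GCC
functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill out[0..n) with base + i * step, the series svindex (base, step)
   stands for.  */
static void vec_series(uint64_t *out, size_t n, uint64_t base, uint64_t step)
{
    assert(out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = base + (uint64_t)i * step;
}

/* Multiply every lane by a scalar, like svmul_x under an all-true
   predicate.  */
static void vec_scale(uint64_t *v, size_t n, uint64_t factor)
{
    assert(v != NULL);
    for (size_t i = 0; i < n; i++)
        v[i] *= factor;
}
```

Scaling vec_series (v, n, 10, 3) by 5 yields the same lanes as
vec_series (v, n, 50, 15), which is the "mov x0, 50; index z0.d, x0, #15"
form shown above.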

We added test cases checking
- the application of the transform during gimple for constant arguments,
- the interaction with another gimple-level optimization.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz 

gcc/
* config/aarch64/aarch64-sve-builtins-base.cc
(svindex_impl::fold): Add constant folding.

gcc/testsuite/
* gcc.target/aarch64/sve/index_const_fold.c: New test.
---
 .../aarch64/aarch64-sve-builtins-base.cc  | 12 +++
 .../gcc.target/aarch64/sve/index_const_fold.c | 35 +++
 2 files changed, 47 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 1c17149e1f0..f6b1657ecbb 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -1304,6 +1304,18 @@ public:
 
 class svindex_impl : public function_base
 {
+public:
+  gimple *
+  fold (gimple_folder &f) const override
+  {
+tree vec_type = TREE_TYPE (f.lhs);
+tree base = gimple_call_arg (f.call, 0);
+tree step = gimple_call_arg (f.call, 1);
+
+return gimple_build_assign (f.lhs,
+   build_vec_series (vec_type, base, step));
+  }
+
 public:
   rtx
   expand (function_expander &e) const override
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c b/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c
new file mode 100644
index 00000000000..f5e6c0f7a85
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#include 
+#include 
+
+#define INDEX_CONST(TYPE, TY)  \
+  sv##TYPE f_##TY##_index_const () \
+  {\
+return svindex_##TY (10, 3);   \
+  }
+
+#define MULT_INDEX(TYPE, TY)   \
+  sv##TYPE f_##TY##_mult_index ()  \
+  {\
+return svmul_x (svptrue_b8 (), \
+   svindex_##TY (10, 3),   \
+   5); \
+  }
+
+#define ALL_TESTS(TYPE, TY)\
+  INDEX_CONST (TYPE, TY)   \
+  MULT_INDEX (TYPE, TY)
+
+ALL_TESTS (uint8_t, u8)
+ALL_TESTS (uint16_t, u16)
+ALL_TESTS (uint32_t, u32)
+ALL_TESTS (uint64_t, u64)
+ALL_TESTS (int8_t, s8)
+ALL_TESTS (int16_t, s16)
+ALL_TESTS (int32_t, s32)
+ALL_TESTS (int64_t, s64)
+
+/* { dg-final { scan-tree-dump-times "return \\{ 10, 13, 16, ... \\}" 8 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return \\{ 50, 65, 80, ... \\}" 8 "optimized" } } */
-- 
2.44.0


[PATCH] c, libcpp, v2: Partially implement C2Y N3353 paper [PR117028]

2024-10-16 Thread Jakub Jelinek
On Tue, Oct 15, 2024 at 05:40:58PM +0000, Joseph Myers wrote:
> > --- gcc/testsuite/gcc.dg/cpp/c23-delimited-escape-seq-1.c.jj
> > 2024-10-14 17:58:54.436815339 +0200
> > +++ gcc/testsuite/gcc.dg/cpp/c23-delimited-escape-seq-1.c   2024-10-14 
> > 17:59:05.032666716 +0200
> > @@ -0,0 +1,87 @@
> > +/* P2290R3 - Delimited escape sequences */
> 
> I don't think the comments on this and other C tests should reference a 
> C++ paper.

Ok, changed.

> I think there should also be tests using digit separators with the 0o / 0O 
> prefixes (both valid cases, and testing the error for having the digit 
> separator immediately after 0o / 0O).

Done.  Also added test for 0b1'01 because we only had test for invalid 0b'0
and 0B'1.
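
To make the tested rule concrete: a digit separator is fine between two octal
digits but not immediately after the 0o / 0O prefix.  The C sketch below shows
one way to check that rule.  It is not the libcpp implementation; the function
name parse_prefixed_octal is hypothetical, and it deliberately ignores integer
suffixes and overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: parse "0o..." or "0O..." with optional ' separators.
   Returns true and stores the value on success, false otherwise.  */
static bool parse_prefixed_octal(const char *s, uint64_t *value)
{
    assert(s != NULL && value != NULL);
    if (s[0] != '0' || (s[1] != 'o' && s[1] != 'O'))
        return false;
    s += 2;

    uint64_t v = 0;
    bool prev_digit = false;  /* was the previous character an octal digit? */
    for (; *s != '\0'; s++) {
        if (*s >= '0' && *s <= '7') {
            v = v * 8 + (uint64_t)(*s - '0');
            prev_digit = true;
        } else if (*s == '\'' && prev_digit) {
            prev_digit = false;  /* a digit has to follow the separator */
        } else {
            return false;        /* bad digit, or misplaced separator */
        }
    }
    if (!prev_digit)
        return false;            /* no digits at all, or trailing separator */
    *value = v;
    return true;
}
```

With this sketch "0o1'7" parses to 15, while "0o'17" is rejected because the
separator directly follows the prefix.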

Tested on x86_64-linux and i686-linux, ok for trunk?

2024-10-16  Jakub Jelinek  

PR c/117028
libcpp/
* include/cpplib.h (struct cpp_options): Add named_uc_escape_seqs,
octal_constants and cpp_warn_c23_c2y_compat members.
(enum cpp_warning_reason): Add CPP_W_C23_C2Y_COMPAT enumerator.
* init.cc (struct lang_flags): Add named_uc_escape_seqs and
octal_constants bit-fields.
(lang_defaults): Add initializers for them into the table.
(cpp_set_lang): Initialize named_uc_escape_seqs and octal_constants.
(cpp_create_reader): Initialize cpp_warn_c23_c2y_compat to -1.
* charset.cc (_cpp_valid_ucn): Test
CPP_OPTION (pfile, named_uc_escape_seqs) rather than
CPP_OPTION (pfile, delimited_escape_seqs) in \N{} related tests.
Change wording of C cpp_pedwarning for \u{} and emit
-Wc23-c2y-compat warning for it too if needed.  Formatting fixes.
(convert_hex): Change wording of C cpp_pedwarning for \u{} and emit
-Wc23-c2y-compat warning for it too if needed.
(convert_oct): Likewise.
* expr.cc (cpp_classify_number): Handle C2Y 0o or 0O prefixed
octal constants.
(cpp_interpret_integer): Likewise.
gcc/c-family/
* c.opt (Wc23-c2y-compat): Add CPP and CppReason parameters.
* c-opts.cc (set_std_c2y): Use CLK_STDC2Y or CLK_GNUC2Y rather
than CLK_STDC23 and CLK_GNUC23.  Formatting fix.
* c-lex.cc (interpret_integer): Handle C2Y 0o or 0O prefixed
and wb/WB/uwb/UWB suffixed octal constants.
gcc/testsuite/
* gcc.dg/bitint-112.c: New test.
* gcc.dg/c23-digit-separators-1.c: Add _Static_assert for
valid binary constant with digit separator.
* gcc.dg/c23-octal-constants-1.c: New test.
* gcc.dg/c23-octal-constants-2.c: New test.
* gcc.dg/c2y-digit-separators-1.c: New test.
* gcc.dg/c2y-digit-separators-2.c: New test.
* gcc.dg/c2y-octal-constants-1.c: New test.
* gcc.dg/c2y-octal-constants-2.c: New test.
* gcc.dg/c2y-octal-constants-3.c: New test.
* gcc.dg/cpp/c23-delimited-escape-seq-1.c: New test.
* gcc.dg/cpp/c23-delimited-escape-seq-2.c: New test.
* gcc.dg/cpp/c2y-delimited-escape-seq-1.c: New test.
* gcc.dg/cpp/c2y-delimited-escape-seq-2.c: New test.
* gcc.dg/cpp/c2y-delimited-escape-seq-3.c: New test.
* gcc.dg/cpp/c2y-delimited-escape-seq-4.c: New test.
* gcc.dg/octal-constants-1.c: New test.
* gcc.dg/octal-constants-2.c: New test.
* gcc.dg/octal-constants-3.c: New test.
* gcc.dg/octal-constants-4.c: New test.
* gcc.dg/system-octal-constants-1.c: New test.
* gcc.dg/system-octal-constants-1.h: New file.

--- libcpp/include/cpplib.h.jj  2024-10-16 10:32:27.338945464 +0200
+++ libcpp/include/cpplib.h 2024-10-16 11:37:45.179128796 +0200
@@ -548,6 +548,12 @@ struct cpp_options
   /* Nonzero for C++23 delimited escape sequences.  */
   unsigned char delimited_escape_seqs;
 
+  /* Nonzero for C++23 named universal character escape sequences.  */
+  unsigned char named_uc_escape_seqs;
+
+  /* Nonzero for C2Y 0o prefixed octal integer constants.  */
+  unsigned char octal_constants;
+
   /* Nonzero for 'true' and 'false' in #if expressions.  */
   unsigned char true_false;
 
@@ -579,6 +585,9 @@ struct cpp_options
   /* True if warn about differences between C11 and C23.  */
   signed char cpp_warn_c11_c23_compat;
 
+  /* True if warn about differences between C23 and C2Y.  */
+  signed char cpp_warn_c23_c2y_compat;
+
   /* True if warn about differences between C++98 and C++11.  */
   bool cpp_warn_cxx11_compat;
 
@@ -716,6 +725,7 @@ enum cpp_warning_reason {
   CPP_W_PEDANTIC,
   CPP_W_C90_C99_COMPAT,
   CPP_W_C11_C23_COMPAT,
+  CPP_W_C23_C2Y_COMPAT,
   CPP_W_CXX11_COMPAT,
   CPP_W_CXX20_COMPAT,
   CPP_W_CXX14_EXTENSIONS,
--- libcpp/init.cc.jj   2024-10-15 13:48:46.172160067 +0200
+++ libcpp/init.cc  2024-10-16 11:37:45.179128796 +0200
@@ -108,46 +108,48 @@ struct lang_flags
   unsigned int elifdef : 1;
   unsigned int warning_directive : 1;
   unsigned int delimited_escape_seqs : 1;
+  unsigned int named_uc_escape_seqs : 1;
+  unsigned int octa

[PATCH v15 2/4] gcc/: Rename array_type_nelts() => array_type_nelts_minus_one()

2024-10-16 Thread Alejandro Colomar
The old name was misleading.

While at it, also rename some temporary variables that are used with
this function, for consistency.
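
To illustrate why the old name misled: the function returns the highest
zero-based index (max - min of the index domain), not the element count, and
array_type_nelts_top () from the next patch in this series adds one to it.  A
minimal C sketch, assuming a hypothetical struct index_domain in place of a
real ARRAY_TYPE tree node:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an ARRAY_TYPE's index domain [min, max].  */
struct index_domain {
    long min;  /* lowest valid index, 0 for C arrays */
    long max;  /* highest valid index */
};

/* Mirrors array_type_nelts_minus_one (): max - min, which is the element
   count minus one.  */
static long nelts_minus_one(const struct index_domain *d)
{
    assert(d != NULL);
    return d->max - d->min;
}

/* Mirrors array_type_nelts_top (): the previous value plus one.  */
static long nelts_top(const struct index_domain *d)
{
    return nelts_minus_one(d) + 1;
}
```

For a one-element array (min = 0, max = 0) nelts_minus_one returns 0, which is
why one_element_array_type_p tests the result with integer_zerop.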

Link: 


gcc/ChangeLog:

* tree.cc (array_type_nelts, array_type_nelts_minus_one)
* tree.h (array_type_nelts, array_type_nelts_minus_one)
* expr.cc (count_type_elements)
* config/aarch64/aarch64.cc
(pure_scalable_type_info::analyze_array)
* config/i386/i386.cc (ix86_canonical_va_list_type):
Rename array_type_nelts() => array_type_nelts_minus_one()
The old name was misleading.

gcc/c/ChangeLog:

* c-decl.cc (one_element_array_type_p, get_parm_array_spec)
* c-fold.cc (c_fold_array_ref):
Rename array_type_nelts() => array_type_nelts_minus_one()

gcc/cp/ChangeLog:

* decl.cc (reshape_init_array)
* init.cc
(build_zero_init_1)
(build_value_init_noctor)
(build_vec_init)
(build_delete)
* lambda.cc (add_capture)
* tree.cc (array_type_nelts_top):
Rename array_type_nelts() => array_type_nelts_minus_one()

gcc/fortran/ChangeLog:

* trans-array.cc (structure_alloc_comps)
* trans-openmp.cc
(gfc_walk_alloc_comps)
(gfc_omp_clause_linear_ctor):
Rename array_type_nelts() => array_type_nelts_minus_one()

gcc/rust/ChangeLog:

* backend/rust-tree.cc (array_type_nelts_top):
Rename array_type_nelts() => array_type_nelts_minus_one()

Cc: Gabriel Ravier 
Cc: Martin Uecker 
Cc: Joseph Myers 
Cc: Xavier Del Campo Romero 
Cc: Jakub Jelinek 
Suggested-by: Richard Biener 
Signed-off-by: Alejandro Colomar 
---
 gcc/c/c-decl.cc   | 10 +-
 gcc/c/c-fold.cc   |  7 ---
 gcc/config/aarch64/aarch64.cc |  2 +-
 gcc/config/i386/i386.cc   |  2 +-
 gcc/cp/decl.cc|  2 +-
 gcc/cp/init.cc|  8 
 gcc/cp/lambda.cc  |  3 ++-
 gcc/cp/tree.cc|  2 +-
 gcc/expr.cc   |  8 
 gcc/fortran/trans-array.cc|  2 +-
 gcc/fortran/trans-openmp.cc   |  4 ++--
 gcc/rust/backend/rust-tree.cc |  2 +-
 gcc/tree.cc   |  4 ++--
 gcc/tree.h|  2 +-
 14 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 224c015cd6d..a03f756bf33 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -5358,7 +5358,7 @@ one_element_array_type_p (const_tree type)
 {
   if (TREE_CODE (type) != ARRAY_TYPE)
 return false;
-  return integer_zerop (array_type_nelts (type));
+  return integer_zerop (array_type_nelts_minus_one (type));
 }
 
 /* Determine whether TYPE is a zero-length array type "[0]".  */
@@ -6306,15 +6306,15 @@ get_parm_array_spec (const struct c_parm *parm, tree attrs)
  for (tree type = parm->specs->type; TREE_CODE (type) == ARRAY_TYPE;
   type = TREE_TYPE (type))
{
- tree nelts = array_type_nelts (type);
- if (error_operand_p (nelts))
+ tree nelts_minus_one = array_type_nelts_minus_one (type);
+ if (error_operand_p (nelts_minus_one))
return attrs;
- if (TREE_CODE (nelts) != INTEGER_CST)
+ if (TREE_CODE (nelts_minus_one) != INTEGER_CST)
{
  /* Each variable VLA bound is represented by the dollar
 sign.  */
  spec += "$";
- tpbnds = tree_cons (NULL_TREE, nelts, tpbnds);
+ tpbnds = tree_cons (NULL_TREE, nelts_minus_one, tpbnds);
}
}
  tpbnds = nreverse (tpbnds);
diff --git a/gcc/c/c-fold.cc b/gcc/c/c-fold.cc
index 57b67c74bd8..9ea174f79c4 100644
--- a/gcc/c/c-fold.cc
+++ b/gcc/c/c-fold.cc
@@ -73,11 +73,12 @@ c_fold_array_ref (tree type, tree ary, tree index)
   unsigned elem_nchars = (TYPE_PRECISION (elem_type)
  / TYPE_PRECISION (char_type_node));
   unsigned len = (unsigned) TREE_STRING_LENGTH (ary) / elem_nchars;
-  tree nelts = array_type_nelts (TREE_TYPE (ary));
+  tree nelts_minus_one = array_type_nelts_minus_one (TREE_TYPE (ary));
   bool dummy1 = true, dummy2 = true;
-  nelts = c_fully_fold_internal (nelts, true, &dummy1, &dummy2, false, false);
+  nelts_minus_one = c_fully_fold_internal (nelts_minus_one, true, &dummy1,
+  &dummy2, false, false);
   unsigned HOST_WIDE_INT i = tree_to_uhwi (index);
-  if (!tree_int_cst_le (index, nelts)
+  if (!tree_int_cst_le (index, nelts_minus_one)
   || i >= len
   || i + elem_nchars > len)
 return NULL_TREE;
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 102680a0efc..443e0525f6f 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -1081,7 +1081,7 @@ pure_scalable

[PATCH v15 3/4] Merge definitions of array_type_nelts_top()

2024-10-16 Thread Alejandro Colomar
There were two identical definitions, and none of them are available
where they are needed for implementing __nelementsof__.  Merge them, and
provide the single definition in gcc/tree.{h,cc}, where it's available
for __nelementsof__, which will be added in the following commit.

gcc/ChangeLog:

* tree.h (array_type_nelts_top)
* tree.cc (array_type_nelts_top):
Define function (moved from gcc/cp/).

gcc/cp/ChangeLog:

* cp-tree.h (array_type_nelts_top)
* tree.cc (array_type_nelts_top):
Remove function (move to gcc/).

gcc/rust/ChangeLog:

* backend/rust-tree.h (array_type_nelts_top)
* backend/rust-tree.cc (array_type_nelts_top):
Remove function.

Signed-off-by: Alejandro Colomar 
---
 gcc/cp/cp-tree.h  |  1 -
 gcc/cp/tree.cc| 13 -
 gcc/rust/backend/rust-tree.cc | 13 -
 gcc/rust/backend/rust-tree.h  |  2 --
 gcc/tree.cc   | 13 +
 gcc/tree.h|  1 +
 6 files changed, 14 insertions(+), 29 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index dc153a97dc4..c89f5ea905a 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -8120,7 +8120,6 @@ extern tree build_exception_variant   (tree, tree);
 extern void fixup_deferred_exception_variants   (tree, tree);
 extern tree bind_template_template_parm(tree, tree);
 extern tree array_type_nelts_total (tree);
-extern tree array_type_nelts_top   (tree);
 extern bool array_of_unknown_bound_p   (const_tree);
 extern tree break_out_target_exprs (tree, bool = false);
 extern tree build_ctor_subob_ref   (tree, tree, tree);
diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index 3cac8ac4df1..c80ee068958 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -3076,19 +3076,6 @@ cxx_print_statistics (void)
 depth_reached);
 }
 
-/* Return, as an INTEGER_CST node, the number of elements for TYPE
-   (which is an ARRAY_TYPE).  This counts only elements of the top
-   array.  */
-
-tree
-array_type_nelts_top (tree type)
-{
-  return fold_build2_loc (input_location,
- PLUS_EXPR, sizetype,
- array_type_nelts_minus_one (type),
- size_one_node);
-}
-
 /* Return, as an INTEGER_CST node, the number of elements for TYPE
(which is an ARRAY_TYPE).  This one is a recursive count of all
ARRAY_TYPEs that are clumped together.  */
diff --git a/gcc/rust/backend/rust-tree.cc b/gcc/rust/backend/rust-tree.cc
index 8d32e5203ae..3dc6b076711 100644
--- a/gcc/rust/backend/rust-tree.cc
+++ b/gcc/rust/backend/rust-tree.cc
@@ -859,19 +859,6 @@ is_empty_class (tree type)
   return CLASSTYPE_EMPTY_P (type);
 }
 
-// forked from gcc/cp/tree.cc array_type_nelts_top
-
-/* Return, as an INTEGER_CST node, the number of elements for TYPE
-   (which is an ARRAY_TYPE).  This counts only elements of the top
-   array.  */
-
-tree
-array_type_nelts_top (tree type)
-{
-  return fold_build2_loc (input_location, PLUS_EXPR, sizetype,
- array_type_nelts_minus_one (type), size_one_node);
-}
-
 // forked from gcc/cp/tree.cc builtin_valid_in_constant_expr_p
 
 /* Test whether DECL is a builtin that may appear in a
diff --git a/gcc/rust/backend/rust-tree.h b/gcc/rust/backend/rust-tree.h
index 26c8b653ac6..e597c3ab81d 100644
--- a/gcc/rust/backend/rust-tree.h
+++ b/gcc/rust/backend/rust-tree.h
@@ -2993,8 +2993,6 @@ extern location_t rs_expr_location (const_tree);
 extern int
 is_empty_class (tree type);
 
-extern tree array_type_nelts_top (tree);
-
 extern bool
 is_really_empty_class (tree, bool);
 
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 582368934ef..aef8e19ec67 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -3729,6 +3729,19 @@ array_type_nelts_minus_one (const_tree type)
  ? max
  : fold_build2 (MINUS_EXPR, TREE_TYPE (max), max, min));
 }
+
+/* Return, as an INTEGER_CST node, the number of elements for TYPE
+   (which is an ARRAY_TYPE).  This counts only elements of the top
+   array.  */
+
+tree
+array_type_nelts_top (tree type)
+{
+  return fold_build2_loc (input_location,
+ PLUS_EXPR, sizetype,
+ array_type_nelts_minus_one (type),
+ size_one_node);
+}
 
 /* If arg is static -- a reference to an object in static storage -- then
return the object.  This is not the same as the C meaning of `static'.
diff --git a/gcc/tree.h b/gcc/tree.h
index 921c3f61da8..d8203f1c7d1 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -4922,6 +4922,7 @@ extern tree build_method_type (tree, tree);
 extern tree build_offset_type (tree, tree);
 extern tree build_complex_type (tree, bool named = false);
 extern tree array_type_nelts_minus_one (const_tree);
+extern tree array_type_nelts_top (tree);
 
 extern tree value_member (tree, tree);
 extern tree purpose_member (const_tree, tree);
-- 
2.45.2




[PATCH v15 1/4] contrib/: Add support for Cc: and Link: tags

2024-10-16 Thread Alejandro Colomar
contrib/ChangeLog:

* gcc-changelog/git_commit.py (GitCommit):
Add support for 'Cc: ' and 'Link: ' tags.

Cc: Jason Merrill 
Signed-off-by: Alejandro Colomar 
---
 contrib/gcc-changelog/git_commit.py | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index 87ecb9e1a17..64fb986b74c 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -182,7 +182,8 @@ CO_AUTHORED_BY_PREFIX = 'co-authored-by: '
 
 REVIEW_PREFIXES = ('reviewed-by: ', 'reviewed-on: ', 'signed-off-by: ',
'acked-by: ', 'tested-by: ', 'reported-by: ',
-   'suggested-by: ')
+   'suggested-by: ', 'cc: ')
+LINK_PREFIXES = ('link: ')
 DATE_FORMAT = '%Y-%m-%d'
 
 
@@ -524,6 +525,8 @@ class GitCommit:
 continue
 elif lowered_line.startswith(REVIEW_PREFIXES):
 continue
+elif lowered_line.startswith(LINK_PREFIXES):
+continue
 else:
 m = cherry_pick_regex.search(line)
 if m:
-- 
2.45.2





[PATCH v15 4/4] c: Add __countof__ operator

2024-10-16 Thread Alejandro Colomar
This operator is similar to sizeof but can only be applied to an array,
and returns its number of elements.
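
As a rough illustration of the intended semantics, the classic sizeof idiom
gives the same count for true arrays.  The sketch below uses a hypothetical
COUNTOF macro as a portable stand-in; unlike the proposed operator it cannot
reject pointers, so it is only an approximation.

```c
#include <assert.h>
#include <stddef.h>

/* Portable stand-in for the proposed operator (an assumption for this
   example, not part of the patch).  */
#define COUNTOF(a) (sizeof (a) / sizeof ((a)[0]))

static void countof_demo(void)
{
    int  words[7];
    char bytes[5];
    int  grid[3][4];

    assert(COUNTOF(words) == 7);
    /* Count and size coincide only when the element size is 1.  */
    assert(COUNTOF(bytes) == sizeof bytes);
    /* Only the top-level array is counted.  */
    assert(COUNTOF(grid) == 3);
    assert(COUNTOF(grid[0]) == 4);
}
```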

FUTURE DIRECTIONS:

-  We should make it work with array parameters to functions,
   and somehow magically return the number of elements of the array,
   regardless of it being really a pointer.

-  Fix support for [0].

gcc/ChangeLog:

* doc/extend.texi: Document __countof__ operator.

gcc/c-family/ChangeLog:

* c-common.h
* c-common.def
* c-common.cc (c_countof_type): Add __countof__ operator.

gcc/c/ChangeLog:

* c-tree.h
(c_expr_countof_expr, c_expr_countof_type)
* c-decl.cc
(start_struct, finish_struct)
(start_enum, finish_enum)
* c-parser.cc
(c_parser_sizeof_expression)
(c_parser_countof_expression)
(c_parser_sizeof_or_countof_expression)
(c_parser_unary_expression)
* c-typeck.cc
(build_external_ref)
(record_maybe_used_decl)
(pop_maybe_used)
(is_top_array_vla)
(c_expr_countof_expr, c_expr_countof_type):
Add __countof__ operator.

gcc/testsuite/ChangeLog:

* gcc.dg/countof-compile.c
* gcc.dg/countof-vla.c
* gcc.dg/countof.c: Add tests for __countof__ operator.

Link: 
Link: 
Link: 

Link: 
Link: 
Link: 
Link: 
Link: 
Suggested-by: Xavier Del Campo Romero 
Co-authored-by: Martin Uecker 
Acked-by: "James K. Lowden" 
Cc: Joseph Myers 
Cc: Gabriel Ravier 
Cc: Jakub Jelinek 
Cc: Kees Cook 
Cc: Qing Zhao 
Cc: Jens Gustedt 
Cc: David Brown 
Cc: Florian Weimer 
Cc: Andreas Schwab 
Cc: Timm Baeder 
Cc: Daniel Plakosh 
Cc: "A. Jiang" 
Cc: Eugene Zelenko 
Cc: Aaron Ballman 
Cc: Paul Koning 
Cc: Daniel Lundin 
Cc: Nikolaos Strimpas 
Cc: JeanHeyd Meneide 
Cc: Fernando Borretti 
Cc: Jonathan Protzenko 
Cc: Chris Bazley 
Cc: Ville Voutilainen 
Cc: Alex Celeste 
Cc: Jakub Łukasiewicz 
Cc: Douglas McIlroy 
Cc: Jason Merrill 
Cc: "Gustavo A. R. Silva" 
Cc: Patrizia Kaye 
Cc: Ori Bernstein 
Cc: Robert Seacord 
Cc: Marek Polacek 
Cc: Sam James 
Signed-off-by: Alejandro Colomar 
---
 gcc/c-family/c-common.cc   |  26 +
 gcc/c-family/c-common.def  |   3 +
 gcc/c-family/c-common.h|   2 +
 gcc/c/c-decl.cc|  22 +++-
 gcc/c/c-parser.cc  |  62 +++---
 gcc/c/c-tree.h |   4 +
 gcc/c/c-typeck.cc  | 118 ++-
 gcc/doc/extend.texi|  30 +
 gcc/testsuite/gcc.dg/countof-compile.c | 115 +++
 gcc/testsuite/gcc.dg/countof-vla.c |  46 
 gcc/testsuite/gcc.dg/countof.c | 150 +
 11 files changed, 554 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index ec6a5da892d..5e374d56a87 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -465,6 +465,7 @@ const struct c_common_resword c_common_reswords[] =
   { "__inline",RID_INLINE, 0 },
   { "__inline__",  RID_INLINE, 0 },
   { "__label__",   RID_LABEL,  0 },
+  { "__countof__", RID_COUNTOF,0 },
   { "__null",  RID_NULL,   0 },
   { "__real",  RID_REALPART,   0 },
   { "__real__",RID_REALPART,   0 },
@@ -4070,6 +4071,31 @@ c_alignof_expr (location_t loc, tree expr)
 
   return fold_convert_loc (loc, size_type_node, t);
 }
+
+/* Implement the countof keyword:
+   Return the number of elements of an array.  */
+
+tree
+c_countof_type (location_t loc, tree type)
+{
+  enum tree_code type_code;
+
+  type_code = TREE_CODE (type);
+  if (type_code != ARRAY_TYPE)
+{
+  error_at (loc, "invalid application of %<countof%> to type %qT", type);
+  return error_mark_node;
+}
+  if (!COMPLETE_TYPE_P (type))
+{
+  error_at (loc,
+   "invalid application of %<countof%> to incomplete type %qT",
+   type);
+  return error_mark_node;
+}
+
+  return array_type_nelts_top (type);
+}
 
 /* Handle C and C++ default attributes.  */
 
diff --git a/gcc/c-family/c-common.def b/gcc/c-family/c-common.def
index 5de96e5d4a8..3d882e70cc0 100644
--- a/gcc/c-family/c-common.def
+++ b/gcc/c-family/c-common.def
@@ -50,6 +50,9 @@ DEFTREECODE (EXCESS_PRECISION_EXPR, "excess_precision_expr", 
tc

[PATCH v15 0/4] c: Add __countof__ operator

2024-10-16 Thread Alejandro Colomar
v15 changes:

-  Rebase.

-  Remove (unused) changes under  and .  It was
   dead code.

-  Fix typos in comments.

-  Make format of changelog more consistent.

-  Add some links and CCs to the commit message.

-  Add Acked-by James K. Lowden.  Quoting him:

> Just to say, [...], you're absolutely right.  "Size" has a meaning.
> "Count" has a meaning.  They're equal only if the unit size is 1.
>
> "Length" is ambiguous.  Often, it means "size within capacity", as
> strlen(3).  I cannot think of a single example in C++ where "length"
> means "number of elements allocated".  The STL uses size to mean count,
> including std::size, afaik.
>
> It is said that in 1956, someone once told Adlai Stevenson,
> the Democratic presidential candidate, "Every thinking person in America
> will vote for you."  Stevenson supposed replied, "That's not enough.
> I need a majority."
>
> 'Twas always thus.

> I would go with __countof__().  It's short and unambiguous.

-  Remove some remanent uses of length in documentation and comments.

Alejandro Colomar (4):
  contrib/: Add support for Cc: and Link: tags
  gcc/: Rename array_type_nelts() => array_type_nelts_minus_one()
  Merge definitions of array_type_nelts_top()
  c: Add __countof__ operator

 contrib/gcc-changelog/git_commit.py|   5 +-
 gcc/c-family/c-common.cc   |  26 +
 gcc/c-family/c-common.def  |   3 +
 gcc/c-family/c-common.h|   2 +
 gcc/c/c-decl.cc|  32 --
 gcc/c/c-fold.cc|   7 +-
 gcc/c/c-parser.cc  |  62 +++---
 gcc/c/c-tree.h |   4 +
 gcc/c/c-typeck.cc  | 118 ++-
 gcc/config/aarch64/aarch64.cc  |   2 +-
 gcc/config/i386/i386.cc|   2 +-
 gcc/cp/cp-tree.h   |   1 -
 gcc/cp/decl.cc |   2 +-
 gcc/cp/init.cc |   8 +-
 gcc/cp/lambda.cc   |   3 +-
 gcc/cp/tree.cc |  13 ---
 gcc/doc/extend.texi|  30 +
 gcc/expr.cc|   8 +-
 gcc/fortran/trans-array.cc |   2 +-
 gcc/fortran/trans-openmp.cc|   4 +-
 gcc/rust/backend/rust-tree.cc  |  13 ---
 gcc/rust/backend/rust-tree.h   |   2 -
 gcc/testsuite/gcc.dg/countof-compile.c | 115 +++
 gcc/testsuite/gcc.dg/countof-vla.c |  46 
 gcc/testsuite/gcc.dg/countof.c | 150 +
 gcc/tree.cc|  17 ++-
 gcc/tree.h |   3 +-
 27 files changed, 600 insertions(+), 80 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c

Range-diff against v14:
1:  d7fca49888a = 1:  b6da2185675 contrib/: Add support for Cc: and Link: tags
2:  e65245ac294 = 2:  a0fa3f139f9 gcc/: Rename array_type_nelts() => 
array_type_nelts_minus_one()
3:  03de2d67bb1 = 3:  43a2e18c6a2 Merge definitions of array_type_nelts_top()
4:  6714852dd93 ! 4:  8a6959d2d38 c: Add __countof__ operator
@@ Commit message
 gcc/ChangeLog:
 
 * doc/extend.texi: Document __countof__ operator.
-* target.h (enum type_context_kind): Add __countof__ operator.
 
 gcc/c-family/ChangeLog:
 
 * c-common.h
-* c-common.def:
+* c-common.def
 * c-common.cc (c_countof_type): Add __countof__ operator.
 
 gcc/c/ChangeLog:
@@ Commit message
 (c_parser_unary_expression)
 * c-typeck.cc
 (build_external_ref)
-(record_maybe_used_decl, pop_maybe_used)
+(record_maybe_used_decl)
+(pop_maybe_used)
 (is_top_array_vla)
 (c_expr_countof_expr, c_expr_countof_type):
 Add __countof__ operator.
 
-gcc/cp/ChangeLog:
-
-* operators.def: Add __countof__ operator.
-
 gcc/testsuite/ChangeLog:
 
 * gcc.dg/countof-compile.c
 * gcc.dg/countof-vla.c
 * gcc.dg/countof.c: Add tests for __countof__ operator.
 
+Link: 
+Link: 
+Link: 

 Link: 
 Link: 
 Link: 
-Link: 

Re: [PATCH v15 4/4] c: Add __countof__ operator

2024-10-16 Thread Joseph Myers
On Wed, 16 Oct 2024, Alejandro Colomar wrote:

> +  if (type_code != ARRAY_TYPE)
> +{
> +  error_at (loc, "invalid application of %<countof%> to type %qT", type);
> +  return error_mark_node;
> +}
> +  if (!COMPLETE_TYPE_P (type))
> +{
> +  error_at (loc,
> + "invalid application of %<countof%> to incomplete type %qT",

It is never appropriate, regardless of the actual operator naming, for a 
diagnostic to refer to an operator name that doesn't exist at all (such as 
countof instead of __countof__ in this patch).

> @@ -8992,12 +8992,17 @@ start_struct (location_t loc, enum tree_code code, 
> tree name,
>   within a statement expr used within sizeof, et. al.  This is not
>   terribly serious as C++ doesn't permit statement exprs within
>   sizeof anyhow.  */
> -  if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof))
> +  if (warn_cxx_compat
> +  && (in_sizeof || in_typeof || in_alignof || in_countof))
>  warning_at (loc, OPT_Wc___compat,
>   "defining type in %qs expression is invalid in C++",
>   (in_sizeof
>? "sizeof"
> -  : (in_typeof ? "typeof" : "alignof")));
> +  : (in_typeof
> + ? "typeof"
> + : (in_alignof
> +? "alignof"
> +: "countof"))));

Likewise.

> @@ -10135,12 +10140,17 @@ start_enum (location_t loc, struct c_enum_contents 
> *the_enum, tree name,
>/* FIXME: This will issue a warning for a use of a type defined
>   within sizeof in a statement expr.  This is not terribly serious
>   as C++ doesn't permit statement exprs within sizeof anyhow.  */
> -  if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof))
> +  if (warn_cxx_compat
> +  && (in_sizeof || in_typeof || in_alignof || in_countof))
>  warning_at (loc, OPT_Wc___compat,
>   "defining type in %qs expression is invalid in C++",
>   (in_sizeof
>? "sizeof"
> -  : (in_typeof ? "typeof" : "alignof")));
> +  : (in_typeof
> + ? "typeof"
> + : (in_alignof
> +? "alignof"
> +: "countof"))));

Likewise.

>  static struct c_expr
> -c_parser_sizeof_expression (c_parser *parser)
> +c_parser_sizeof_or_countof_expression (c_parser *parser, enum rid rid)
>  {
> +  const char *op_name = (rid == RID_COUNTOF) ? "countof" : "sizeof";

Likewise.

> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 302c3299ede..82f31668e37 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -10555,6 +10555,36 @@ If the operand of the @code{__alignof__} expression 
> is a function,
>  the expression evaluates to the alignment of the function which may
>  be specified by attribute @code{aligned} (@pxref{Common Function 
> Attributes}).
>  
> +@node countof
> +@section Determining the Number of Elements of Arrays
> +@cindex countof
> +@cindex number of elements

Likewise, for node name and index entry.

-- 
Joseph S. Myers
josmy...@redhat.com



[PATCH] Fix gcc.dg/vect/vect-early-break_39.c FAIL with forced SLP

2024-10-16 Thread Richard Biener
The testcase shows single-element interleaving of size three
being exempted from permutation lowering via heuristics
(see also PR116973).  But it wasn't supposed to apply to
non-power-of-two sizes so this amends the check to ensure
the sub-group is aligned even when the number of lanes is one.
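
As a rough illustration of the amended condition, the sketch below models
the exemption check with plain integers.  The names exempt_from_lowering,
lanes, group_lanes, first_perm and n_loads are hypothetical stand-ins for
the SLP accessors used in the patch, is_pow2 only approximates pow2p_hwi,
and the "contiguous" precondition is left out.  It is not the vectorizer
code itself.

```cpp
#include <cstdint>

// Simplified stand-in for GCC's pow2p_hwi: true iff x is a power of two.
static bool is_pow2 (std::uint64_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

// Hypothetical model of the exemption heuristic.  When check_group_pow2
// is false this mimics the condition before the patch; when true it
// includes the added power-of-two test on the group size.
static bool exempt_from_lowering (unsigned lanes, unsigned group_lanes,
                                  unsigned first_perm, unsigned n_loads,
                                  bool check_group_pow2)
{
  return (lanes > 1 || n_loads == 1)
         && is_pow2 (lanes)
         && (!check_group_pow2 || is_pow2 (group_lanes))
         && first_perm % lanes == 0
         && group_lanes % lanes == 0;
}
```

In this model a one-lane load from a group of three is exempted without
the extra test and is no longer exempted with it, while a two-lane load
from a group of four stays exempted.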

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

* tree-vect-slp.cc (vect_lower_load_permutations): Avoid
exempting non-power-of-two group sizes from lowering.
---
 gcc/tree-vect-slp.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 629c4b433ab..d35c2ea02dc 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -4427,6 +4427,7 @@ vect_lower_load_permutations (loop_vec_info loop_vinfo,
  && contiguous
  && (SLP_TREE_LANES (load) > 1 || loads.size () == 1)
  && pow2p_hwi (SLP_TREE_LANES (load))
+ && pow2p_hwi (group_lanes)
  && SLP_TREE_LOAD_PERMUTATION (load)[0] % SLP_TREE_LANES (load) == 0
  && group_lanes % SLP_TREE_LANES (load) == 0)
{
-- 
2.43.0


[PATCH] vax: fixup vax.opt.urls

2024-10-16 Thread Sam James
Needed after r15-4373-gb388f65abc71c9.

gcc/ChangeLog:

* config/vax/vax.opt.urls: Adjust index for -mlra.
---
Pushed.

 gcc/config/vax/vax.opt.urls | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/vax/vax.opt.urls b/gcc/config/vax/vax.opt.urls
index 10bee25d8336..7813b886baa2 100644
--- a/gcc/config/vax/vax.opt.urls
+++ b/gcc/config/vax/vax.opt.urls
@@ -19,5 +19,5 @@ munix
 UrlSuffix(gcc/VAX-Options.html#index-munix)
 
 mlra
-UrlSuffix(gcc/VAX-Options.html#index-mlra-4)
+UrlSuffix(gcc/VAX-Options.html#index-mlra-3)
 
-- 
2.47.0



[PATCH] Enhance gather fallback for PR65518 with SLP

2024-10-16 Thread Richard Biener
With SLP forced we fail to use gather for PR65518 on RISC-V as expected
because peeling for gaps is not effective.  The following moves the
memory_access_type adjustment before all the overrun checking, since
using VMAT_ELEMENTWISE means there's no overrun.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

* tree-vect-stmts.cc (get_group_load_store_type): Move
VMAT_ELEMENTWISE fallback for single-element interleaving
of too large groups before overrun checking.

* gcc.dg/vect/pr65518.c: Adjust.
---
 gcc/testsuite/gcc.dg/vect/pr65518.c | 109 ++--
 gcc/tree-vect-stmts.cc  |  58 ---
 2 files changed, 85 insertions(+), 82 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr65518.c 
b/gcc/testsuite/gcc.dg/vect/pr65518.c
index 189a65534f6..6d851506169 100644
--- a/gcc/testsuite/gcc.dg/vect/pr65518.c
+++ b/gcc/testsuite/gcc.dg/vect/pr65518.c
@@ -1,54 +1,55 @@
-#include "tree-vect.h"
-
-#if VECTOR_BITS > 256
-#define NINTS (VECTOR_BITS / 32)
-#else
-#define NINTS 8
-#endif
-
-#define N (NINTS * 2)
-#define RESULT (NINTS * (NINTS - 1) / 2 * N + NINTS)
-
-extern void abort (void);
-
-typedef struct giga
-{
-  unsigned int g[N];
-} giga;
-
-unsigned long __attribute__((noinline,noclone))
-addfst(giga const *gptr, int num)
-{
-  unsigned int retval = 0;
-  int i;
-  for (i = 0; i < num; i++)
-retval += gptr[i].g[0];
-  return retval;
-}
-
-int main ()
-{
-  struct giga g[NINTS];
-  unsigned int n = 1;
-  int i, j;
-  check_vect ();
-  for (i = 0; i < NINTS; ++i)
-for (j = 0; j < N; ++j)
-  {
-   g[i].g[j] = n++;
-   __asm__ volatile ("");
-  }
-  if (addfst (g, NINTS) != RESULT)
-abort ();
-  return 0;
-}
-
-/* We don't want to vectorize the single-element interleaving in the way
-   we currently do that (without ignoring not needed vectors in the
-   gap between gptr[0].g[0] and gptr[1].g[0]), because that's very
-   sub-optimal and causes memory explosion (even though the cost model
-   should reject that in the end).  */
-
-/* { dg-final { scan-tree-dump-times "vectorized 0 loops in function" 2 "vect" 
{ target {! riscv*-*-* } } } } */
-/* We end up using gathers for the strided load on RISC-V which would be OK.  
*/
-/* { dg-final { scan-tree-dump "using gather/scatter for strided/grouped 
access" "vect" { target { riscv*-*-* } } } } */
+#include "tree-vect.h"
+
+#if VECTOR_BITS > 256
+#define NINTS (VECTOR_BITS / 32)
+#else
+#define NINTS 8
+#endif
+
+#define N (NINTS * 2)
+#define RESULT (NINTS * (NINTS - 1) / 2 * N + NINTS)
+
+extern void abort (void);
+
+typedef struct giga
+{
+  unsigned int g[N];
+} giga;
+
+unsigned long __attribute__((noinline,noclone))
+addfst(giga const *gptr, int num)
+{
+  unsigned int retval = 0;
+  int i;
+  for (i = 0; i < num; i++)
+retval += gptr[i].g[0];
+  return retval;
+}
+
+int main ()
+{
+  struct giga g[NINTS];
+  unsigned int n = 1;
+  int i, j;
+  check_vect ();
+  for (i = 0; i < NINTS; ++i)
+for (j = 0; j < N; ++j)
+  {
+   g[i].g[j] = n++;
+   __asm__ volatile ("");
+  }
+  if (addfst (g, NINTS) != RESULT)
+abort ();
+  return 0;
+}
+
+/* We don't want to vectorize the single-element interleaving in the way
+   we currently do that (without ignoring not needed vectors in the
+   gap between gptr[0].g[0] and gptr[1].g[0]), because that's very
+   sub-optimal and causes memory explosion (even though the cost model
+   should reject that in the end).  */
+
+/* { dg-final { scan-tree-dump-times "vectorized 0 loops in function" 2 "vect" 
{ target {! riscv*-*-* } } } } */
+/* We should end up using gathers for the strided load on RISC-V.  */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 1 "vect" 
{ target { riscv*-*-* } } } } */
+/* { dg-final { scan-tree-dump "using gather/scatter for strided/grouped 
access" "vect" { target { riscv*-*-* } } } } */
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 13a825319ca..14723c4dbac 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2081,6 +2081,35 @@ get_group_load_store_type (vec_info *vinfo, 
stmt_vec_info stmt_info,
  else
*memory_access_type = VMAT_CONTIGUOUS;
 
+ /* If this is single-element interleaving with an element
+distance that leaves unused vector loads around punt - we
+at least create very sub-optimal code in that case (and
+blow up memory, see PR65518).  */
+ if (loop_vinfo
+ && *memory_access_type == VMAT_CONTIGUOUS
+ && single_element_p
+ && maybe_gt (group_size, TYPE_VECTOR_SUBPARTS (vectype)))
+   {
+ if (SLP_TREE_LANES (slp_node) == 1)
+   {
+ *memory_access_type = VMAT_ELEMENTWISE;
+ overrun_p = false;
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vec

Re: [Ping, Fortran, Patch, PR80235, v1] Fix ICE when coarray from module is referenced in submodule.

2024-10-16 Thread Andre Vehreschild
Hi all,

PING.

Re-Regtested ok again on x86_64-pc-linux-gnu. Ok for mainline?

Regards,
Andre

On Wed, 25 Sep 2024 12:29:21 +0200
Andre Vehreschild  wrote:

> Hi all and esp. Paul,
>
> the attached patch fixes an ICE with coarrays defined in modules and then used
> in submodules. Referencing the variable relied on the curr_module being set in
> the gfc_build_qualified_array routine, which it was not. I therefore took the
> name from the symbol. I don't know if this is correct. Paul any idea? You have
> done the submodule part in the past. I am still not sure, if the coarray is
> exported into the .mod file correctly, because it gets host_assoc set, while I
> would expect it to have use_assoc set. So maybe this patch is only healing
> the symptom and not the cause. Could you have a look, please?
>
> Regtests ok on x86_64-pc-linux-gnu / Fedora 39. Ok for mainline?
>
> Regards,
>   Andre
> --
> Andre Vehreschild * Email: vehre ad gmx dot de


--
Andre Vehreschild * Email: vehre ad gmx dot de
From 3f9876347c6229270f18743ff17bc55175b49a57 Mon Sep 17 00:00:00 2001
From: Andre Vehreschild 
Date: Tue, 24 Sep 2024 14:30:52 +0200
Subject: [PATCH] [Fortran] Fix ICE with coarrays and submodules [PR80235]

Exposing a variable in a module and referencing it in a submodule made
the compiler ICE, because the external variable was not sorted into the
correct module.  In fact the module name was not set where the variable
got built.

gcc/fortran/ChangeLog:

	PR fortran/80235

	* trans-decl.cc (gfc_build_qualified_array): Make sure the array
	is associated to the correct module and being marked as extern.

gcc/testsuite/ChangeLog:

	* gfortran.dg/coarray/add_sources/submodule_1_sub.f90: New test.
	* gfortran.dg/coarray/submodule_1.f90: New test.
---
 gcc/fortran/trans-decl.cc |  7 +++--
 .../coarray/add_sources/submodule_1_sub.f90   | 22 ++
 .../gfortran.dg/coarray/submodule_1.f90   | 29 +++
 3 files changed, 56 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/coarray/add_sources/submodule_1_sub.f90
 create mode 100644 gcc/testsuite/gfortran.dg/coarray/submodule_1.f90

diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc
index 56b6202510e..9cced7c02e4 100644
--- a/gcc/fortran/trans-decl.cc
+++ b/gcc/fortran/trans-decl.cc
@@ -1066,7 +1066,8 @@ gfc_build_qualified_array (tree decl, gfc_symbol * sym)
 			IDENTIFIER_POINTER (gfc_sym_mangled_identifier (sym)));
 	  token = build_decl (DECL_SOURCE_LOCATION (decl), VAR_DECL, token_name,
 			  token_type);
-	  if (sym->attr.use_assoc)
+	  if (sym->attr.use_assoc
+	  || (sym->attr.host_assoc && sym->attr.used_in_submodule))
 	DECL_EXTERNAL (token) = 1;
 	  else
 	TREE_STATIC (token) = 1;
@@ -1091,9 +1092,11 @@ gfc_build_qualified_array (tree decl, gfc_symbol * sym)

   if (sym->module && !sym->attr.use_assoc)
 	{
+	  module_htab_entry *mod
+	= cur_module ? cur_module : gfc_find_module (sym->module);
 	  pushdecl (token);
 	  DECL_CONTEXT (token) = sym->ns->proc_name->backend_decl;
-	  gfc_module_add_decl (cur_module, token);
+	  gfc_module_add_decl (mod, token);
 	}
   else if (sym->attr.host_assoc
 	   && TREE_CODE (DECL_CONTEXT (current_function_decl))
diff --git a/gcc/testsuite/gfortran.dg/coarray/add_sources/submodule_1_sub.f90 b/gcc/testsuite/gfortran.dg/coarray/add_sources/submodule_1_sub.f90
new file mode 100644
index 000..fd177fcda29
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray/add_sources/submodule_1_sub.f90
@@ -0,0 +1,22 @@
+! This test belongs to submodule_1.f90
+! It is references as additional source in that test.
+! The two code fragments need to be in separate files to show
+! the error of pr80235.
+
+submodule (pr80235) pr80235_sub
+
+contains
+  module subroutine test()
+implicit none
+if (var%v /= 42) stop 1
+  end subroutine
+end submodule pr80235_sub
+
+program pr80235_prg
+  use pr80235
+
+  implicit none
+
+  var%v = 42
+  call test()
+end program
diff --git a/gcc/testsuite/gfortran.dg/coarray/submodule_1.f90 b/gcc/testsuite/gfortran.dg/coarray/submodule_1.f90
new file mode 100644
index 000..d0faef93ba7
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray/submodule_1.f90
@@ -0,0 +1,29 @@
+!{ dg-do run }
+!{ dg-additional-sources add_sources/submodule_1_sub.f90 }
+
+! Separating the module and the submodule is needed to show the error.
+! Having all code pieces in one file does not show the error.
+
+module pr80235
+  implicit none
+
+  private
+  public :: test, var
+
+  type T
+integer :: v
+  end type T
+
+interface
+
+  module subroutine test()
+  end subroutine
+
+end interface
+
+  type(T) :: var[*]
+
+end module pr80235
+
+
+
--
2.46.2



[committed] libstdc++: Fix Python deprecation warning in printers.py

2024-10-16 Thread Jonathan Wakely
Tested x86_64-linux with gdb-15.1 and Python 3.12.

Pushed to trunk, backports to follow.

-- >8 --

python/libstdcxx/v6/printers.py:1355: DeprecationWarning: 'count' is passed as 
positional argument

The Python docs say:

  Deprecated since version 3.13: Passing count and flags as positional
  arguments is deprecated. In future Python versions they will be
  keyword-only parameters.

Using a keyword argument for count only became possible with Python 3.1
so introduce a new function to do the substitution.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (strip_fundts_namespace): New.
(StdExpAnyPrinter, StdExpOptionalPrinter): Use it.
---
 libstdc++-v3/python/libstdcxx/v6/printers.py | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py 
b/libstdc++-v3/python/libstdcxx/v6/printers.py
index 92104937862..d05b79762fd 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -220,6 +220,16 @@ def strip_versioned_namespace(typename):
 return typename.replace(_versioned_namespace, '')
 
 
+def strip_fundts_namespace(typ):
+"""Remove "fundamentals_vN" inline namespace from qualified type name."""
+pattern = r'^std::experimental::fundamentals_v\d::'
+repl = 'std::experimental::'
+if sys.version_info[0] == 2:
+return re.sub(pattern, repl, typ, 1)
+else: # Technically this needs Python 3.1 but nobody should be using 3.0
+return re.sub(pattern, repl, typ, count=1)
+
+
 def strip_inline_namespaces(type_str):
 """Remove known inline namespaces from the canonical name of a type."""
 type_str = strip_versioned_namespace(type_str)
@@ -1355,8 +1365,7 @@ class StdExpAnyPrinter(SingleObjContainerPrinter):
 
 def __init__(self, typename, val):
 self._typename = strip_versioned_namespace(typename)
-self._typename = re.sub(r'^std::experimental::fundamentals_v\d::',
-'std::experimental::', self._typename, 1)
+self._typename = strip_fundts_namespace(self._typename)
 self._val = val
 self._contained_type = None
 contained_value = None
@@ -1449,10 +1458,8 @@ class StdExpOptionalPrinter(SingleObjContainerPrinter):
 """Print a std::optional or std::experimental::optional."""
 
 def __init__(self, typename, val):
-typename = strip_versioned_namespace(typename)
-self._typename = re.sub(
-r'^std::(experimental::|)(fundamentals_v\d::|)(.*)',
-r'std::\1\3', typename, 1)
+self._typename = strip_versioned_namespace(typename)
+self._typename = strip_fundts_namespace(self._typename)
 payload = val['_M_payload']
 if self._typename.startswith('std::experimental'):
 engaged = val['_M_engaged']
-- 
2.46.2



Re: [PATCH v2] alpha: Add -mlra option

2024-10-16 Thread Richard Biener
On Wed, Oct 16, 2024 at 4:40 AM John Paul Adrian Glaubitz
 wrote:
>
> On Tue, 2024-10-15 at 16:18 +0200, John Paul Adrian Glaubitz wrote:
> > On Tue, 2024-10-15 at 07:56 -0600, Jeff Law wrote:
> > > Also note if we think it's basically working I can flip my tester to
> > > default to LRA.  It bootstraps and regtests alpha once a week via qemu.
> > >
> > > I think it's testing the baseline configuration, so presumably non-BWX
> > > variants.  That can probably be adjusted if necessary.
> >
> > It does seem to fail when enabling M2 though:
> >
> > m2/pge -k -l ../../gcc/m2/gm2-compiler/P2Build.bnf -o 
> > m2/gm2-compiler-boot/P2Build.mod
> > m2/pge -k -l ../../gcc/m2/gm2-compiler/P3Build.bnf -o 
> > m2/gm2-compiler-boot/P3Build.mod
> > m2/pge -k -l ../../gcc/m2/gm2-compiler/PHBuild.bnf -o 
> > m2/gm2-compiler-boot/PHBuild.mod
> > m2/pge -k -l ../../gcc/m2/gm2-compiler/PCBuild.bnf -o 
> > m2/gm2-compiler-boot/PCBuild.mod
> > m2/pge -k -l ../../gcc/m2/gm2-compiler/P1Build.bnf -o 
> > m2/gm2-compiler-boot/P1Build.mod
> > m2/pge -k -l ../../gcc/m2/gm2-compiler/P0SyntaxCheck.bnf -o 
> > m2/gm2-compiler-boot/P0SyntaxCheck.mod
> > terminate called after throwing an instance of 'unsigned int'
> > make[3]: *** [../../gcc/m2/Make-lang.in:1778: 
> > m2/gm2-compiler-boot/P2Build.mod] Aborted
> > make[3]: *** Deleting file 'm2/gm2-compiler-boot/P2Build.mod'
> > make[3]: *** Waiting for unfinished jobs
> > terminate called after throwing an instance of 'unsigned int'
> > make[3]: *** [../../gcc/m2/Make-lang.in:1778: 
> > m2/gm2-compiler-boot/P0SyntaxCheck.mod] Aborted
> > make[3]: *** Deleting file 'm2/gm2-compiler-boot/P0SyntaxCheck.mod'
> > terminate called after throwing an instance of 'unsigned int'
> > make[3]: *** [../../gcc/m2/Make-lang.in:1778: 
> > m2/gm2-compiler-boot/P1Build.mod] Aborted
> > make[3]: *** Deleting file 'm2/gm2-compiler-boot/P1Build.mod'
> > terminate called after throwing an instance of 'unsigned int'
> > make[3]: *** [../../gcc/m2/Make-lang.in:1778: 
> > m2/gm2-compiler-boot/P3Build.mod] Aborted
> > make[3]: *** Deleting file 'm2/gm2-compiler-boot/P3Build.mod'
> > terminate called after throwing an instance of 'unsigned int'
> > make[3]: *** [../../gcc/m2/Make-lang.in:1778: 
> > m2/gm2-compiler-boot/PHBuild.mod] Aborted
> > make[3]: *** Deleting file 'm2/gm2-compiler-boot/PHBuild.mod'
> > terminate called after throwing an instance of 'unsigned int'
> > make[3]: *** [../../gcc/m2/Make-lang.in:1778: 
> > m2/gm2-compiler-boot/PCBuild.mod] Aborted
> > make[3]: *** Deleting file 'm2/gm2-compiler-boot/PCBuild.mod'
>
> Disabling M2 is enough to fix this.

For practical purposes all reload->LRA conversions should focus on C
and C++.  Everything else is optional and not required to keep a port
live (I'd argue it's wasting cycles to look at anything beyond C/C++
until those work with a mostly clean testsuite).

Richard.

> Thus, GCC bootstraps fine with LRA if one disables M2 and uses EV56 as the 
> baseline.
>
> I will run the testsuite now for both Reload and LRA.
>
> Adrian
>
> --
>  .''`.  John Paul Adrian Glaubitz
> : :' :  Debian Developer
> `. `'   Physicist
>   `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913


Re: [PATCH] PR116510: Add missing fold_converts into tree switch if conversion

2024-10-16 Thread Richard Biener
On Tue, Oct 15, 2024 at 10:18 PM Andi Kleen  wrote:
>
> From: Andi Kleen 
>
> Passes test suite. Ok to commit?

OK.

Thanks,
Richard.

> gcc/ChangeLog:
>
> PR middle-end/116510
> * tree-if-conv.cc (predicate_bbs): Add missing fold_converts.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/vect/vect-switch-ifcvt-3.c: New test.
> ---
>  gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-3.c | 12 
>  gcc/tree-if-conv.cc |  9 ++---
>  2 files changed, 18 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-3.c
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-3.c 
> b/gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-3.c
> new file mode 100644
> index ..41bc8a1cf129
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-3.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +// PR116510
> +
> +char excmap_def_0;
> +int gg_strescape_i;
> +void gg_strescape() {
> +  for (; gg_strescape_i; gg_strescape_i++)
> +switch ((unsigned char)gg_strescape_i)
> +case '\\':
> +case '"':
> +  excmap_def_0 = 0;
> +}
> diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
> index 90c754a48147..376a4642954d 100644
> --- a/gcc/tree-if-conv.cc
> +++ b/gcc/tree-if-conv.cc
> @@ -1477,10 +1477,12 @@ predicate_bbs (loop_p loop)
> {
>   tree low = build2_loc (loc, GE_EXPR,
>  boolean_type_node,
> -index, CASE_LOW (label));
> +index, fold_convert_loc (loc, 
> TREE_TYPE (index),
> +CASE_LOW (label)));
>   tree high = build2_loc (loc, LE_EXPR,
>   boolean_type_node,
> - index, CASE_HIGH (label));
> + index, fold_convert_loc (loc, 
> TREE_TYPE (index),
> + CASE_HIGH (label)));
>   case_cond = build2_loc (loc, TRUTH_AND_EXPR,
>   boolean_type_node,
>   low, high);
> @@ -1489,7 +1491,8 @@ predicate_bbs (loop_p loop)
> case_cond = build2_loc (loc, EQ_EXPR,
> boolean_type_node,
> index,
> -   CASE_LOW (gimple_switch_label (sw, 
> i)));
> +   fold_convert_loc (loc, TREE_TYPE 
> (index),
> + CASE_LOW (label)));
>   if (i > 1)
> switch_cond = build2_loc (loc, TRUTH_OR_EXPR,
>   boolean_type_node,
> --
> 2.46.2
>


Re: [PATCH] sparc: drop -mlra

2024-10-16 Thread Eric Botcazou
> Let's finish the transition by dropping -mlra entirely.
> 
> Tested on sparc64-unknown-linux-gnu with no regressions.
> 
> gcc/ChangeLog:
>   PR target/113952
> 
>   * config/sparc/sparc.cc (sparc_lra_p): Delete.
>   (TARGET_LRA_P): Ditto.
>   (sparc_option_override): Don't use MASK_LRA.
>   * config/sparc/sparc.md (disabled,enabled): Drop lra attribute.
>   * config/sparc/sparc.opt: Delete -mlra.
>   * config/sparc/sparc.opt.urls: Ditto.
>   * doc/invoke.texi (SPARC options): Drop -mlra and -mno-lra.

OK, thanks! (modulo the blank line in the ChangeLog)

-- 
Eric Botcazou




[PATCH v2 9/9] aarch64: Handle alignment when it is bigger than BIGGEST_ALIGNMENT

2024-10-16 Thread Evgeny Karpov
Thursday, September 19, 2024
Richard Sandiford  wrote:

>> For instance:
>> float __attribute__((aligned (32))) large_aligned_array[3];
>>
>> BIGGEST_ALIGNMENT could be up to 512 bits on x64.
>> This patch has been added to cover this case without needing to
>> change the FFmpeg code.
>
> What goes wrong if we don't do this?  I'm not sure from the description
> whether it's a correctness fix, a performance fix, or whether it's about
> avoiding wasted space.

It is a correctness fix.

>> +#define ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGNMENT)  \
>> +  { \
>> +    unsigned HOST_WIDE_INT rounded = MAX ((SIZE), 1); \
>> +    unsigned HOST_WIDE_INT alignment = MAX ((ALIGNMENT), 
>> BIGGEST_ALIGNMENT); \
>> +    rounded += (alignment / BITS_PER_UNIT) - 1; \
>> +    rounded = (rounded / (alignment / BITS_PER_UNIT) \
>> +  * (alignment / BITS_PER_UNIT)); \
>
> There's a ROUND_UP macro that could be used here.

Here is the patch after applying ROUND_UP.
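
For reference, a small sketch of why the two forms are equivalent.  The
helper names round_manual, round_up and round_with_helper are
hypothetical, sizes and alignments are taken in bytes for simplicity
(the macro itself starts from an alignment in bits), and round_up is
only assumed to behave like GCC's ROUND_UP.

```cpp
#include <algorithm>
#include <cstdint>

// Manual form from the earlier revision: add (align - 1), then truncate
// down to a multiple of the alignment.
static std::uint64_t round_manual (std::uint64_t size,
                                   std::uint64_t align_bytes)
{
  std::uint64_t rounded = std::max<std::uint64_t> (size, 1);
  rounded += align_bytes - 1;
  return rounded / align_bytes * align_bytes;
}

// Helper assumed to behave like ROUND_UP: round x up to a multiple of f.
static std::uint64_t round_up (std::uint64_t x, std::uint64_t f)
{
  return (x + f - 1) / f * f;
}

// Same computation expressed through the helper.
static std::uint64_t round_with_helper (std::uint64_t size,
                                        std::uint64_t align_bytes)
{
  return round_up (std::max<std::uint64_t> (size, 1), align_bytes);
}
```

With these definitions a 12-byte object (such as the three-float array
quoted above) with 32-byte alignment rounds to 32 bytes in both forms.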

Regards,
Evgeny

diff --git a/gcc/config/aarch64/aarch64-coff.h 
b/gcc/config/aarch64/aarch64-coff.h
index 3c8aed806c9..1a45256ebfe 100644
--- a/gcc/config/aarch64/aarch64-coff.h
+++ b/gcc/config/aarch64/aarch64-coff.h
@@ -58,6 +58,13 @@
   assemble_name ((FILE), (NAME)),  \
   fprintf ((FILE), "," HOST_WIDE_INT_PRINT_DEC  "\n", (ROUNDED)))

+#define ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGNMENT)  \
+  {\
+unsigned rounded = ROUND_UP (MAX ((SIZE), 1),  \
+  MAX ((ALIGNMENT), BIGGEST_ALIGNMENT) / BITS_PER_UNIT);   \
+ASM_OUTPUT_LOCAL (FILE, NAME, SIZE, rounded);  \
+  }
+
 #define ASM_OUTPUT_SKIP(STREAM, NBYTES)\
   fprintf (STREAM, "\t.space\t%d  // skip\n", (int) (NBYTES))


Re: [PATCH RFC] build: update bootstrap req to C++14

2024-10-16 Thread Jason Merrill

On 9/20/24 11:51 AM, Jason Merrill wrote:

On 9/19/24 4:37 PM, Jakub Jelinek wrote:

On Thu, Sep 19, 2024 at 10:21:15AM -0400, Jason Merrill wrote:

On 9/19/24 7:57 AM, Richard Biener wrote:

On Wed, Sep 18, 2024 at 6:22 PM Jason Merrill  wrote:


Tested x86_64-pc-linux-gnu with 5.5.0 bootstrap compiler.  Thoughts?


I'm fine with this in general - do we need to bump the requirement for
GCC 15 though?  IMO we should bump once we are requiring actual C++14
in some place.


Jakub's dwarf2asm patch yesterday uses C++14 if available, and I remember


And libcpp too.


seeing a couple of other patches that would have been simpler with C++14
available.


It was just a few lines and if I removed the now never true
HAVE_DESIGNATED_INITIALIZERS cases, it wouldn't even add any new lines,
just change some to others.  Both of those patches were just minor
optimizations, it is fine if they don't happen during stage1.

We also have some spots with
#if __cpp_inline_variables < 201606L
#else
#endif
conditionals but that doesn't mean we need to bump to C++17.

Sure, bumping the required C++ version means we can remove the
corresponding conditionals, and more importantly stop worrying about
working around GCC 4.8.x/4.9 bugs (I think that is actually more
important).
The price is stopping to use some of the cfarm machines for testing or
using IBM Advanced Toolchain or hand-built GCC 14 there as the system
compiler there.


Looks like the affected cfarm machines would be the PPC CentOS 7 boxes:
cfarm110, 112, and 135, which are still on gcc 4.8.5.  Currently
/opt/cfarm/gcc-latest is a dead symlink on all three.  I've now installed
9.5 on cfarm110 and am building 11.5 on cfarm135.


At some point we certainly want to do that, the question is if the
benefits right now outweigh the pain.


Absolutely, the question is what pain there is at this point.

As of the version requirement as you say only some minor versions of
the GCC 5 series are OK I would suggest to say we recommend using GCC 6
or later but GCC 5.5 should also work?


Aren't we already specifying a minor revision with 4.8.3 for C++11?

Another possibility would be to just say GCC 5, and adjust that upward
if we run into problems.


I think for the oldest supported version we need some CFarm machines
around with that compiler so that all people can actually test issues
with it.  Dunno which distros shipped GCC 5 in long term support
versions if any and at which minor those are.


Dongsheng Song just pointed out that Ubuntu 16.04 LTS is on GCC 5.4, 
though the cfarm doesn't have such a system.


But I don't think it's necessary to have the oldest supported version be 
a system compiler, IMO it should be fine to use a custom-built compiler 
in /opt/cfarm for testing.  I started installing 5.* on cfarm187, but 
now I see it has no system gnat.  I started over on cfarm186, which has 
system gnat, but Ada 5.5 bootstrap fails.


On 9/20/24 8:27 AM, Richard Biener wrote:
So I'm fine with raising the requirement now and documenting the oldest
working release;  we'd just have to double-check that really does it - for
example when we document 5.4 works that might suggest people should go and
download & build 5.4 while of course they should instead go and download
the newest release that had the same build requirement as 5.4 had - that's
why I suggested to document a _recommended_ version plus the oldest version
that's known to work if readily available.


As Iain was suggesting, 9.5 seems like a good universal transition 
between very old compilers and trunk, as it's the minimum requirement to 
build current D, and also the last version that just requires some 
version of GNAT to build Ada rather than a minimum version (currently 
5.1).  9 was also the version in which we declared C++17 support stable 
(though it was feature-complete in 7).


Alternatively, systems (that care about Ada and D) running 4.7 could 
build 10.5, systems running 4.8 could build 11.5.


Here's an updated patch.  I tested C++14 bootstrap again with 5.x 
compilers, and Jakub's dwarf2asm change breaks on 5.3 due to PR69995, 
while 5.4 successfully bootstraps.


I also added the 9.5 recommendation.

From 87e90d3677a6211b5bb9fc6865b987203a819108 Mon Sep 17 00:00:00 2001
From: Jason Merrill 
Date: Tue, 17 Sep 2024 17:38:35 -0400
Subject: [PATCH] build: update bootstrap req to C++14
To: gcc-patches@gcc.gnu.org

This implements my proposal to update our bootstrap requirement to C++14.
The big benefit of the change is the greater constexpr power, but C++14 also
added variable templates, generic lambdas, lambda init-capture, binary
literals, and numeric literal digit separators.
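
For readers less familiar with these features, here is a small
illustrative sketch.  All names in it (popcount_demo, max_lanes, mask,
demo_lambdas) are made up for the example and do not come from the GCC
sources; it only assumes a compiler in C++14 mode or later.

```cpp
#include <cstddef>
#include <string>

// C++14 relaxed constexpr: loops and mutable locals are allowed.
constexpr unsigned popcount_demo (unsigned x)
{
  unsigned n = 0;
  for (; x != 0; x &= x - 1)
    ++n;
  return n;
}

// Variable template.
template <typename T>
constexpr T max_lanes = T (64);

// Binary literal with a digit separator.
constexpr unsigned mask = 0b1111'0000;

// The loop above is evaluated at compile time.
static_assert (popcount_demo (mask) == 4, "four bits are set in mask");

// Generic lambda plus lambda init-capture.
inline std::size_t demo_lambdas ()
{
  auto twice = [] (auto v) { return v + v; };
  auto owner = [s = std::string ("gcc")] { return s.size (); };
  return twice (owner ());
}
```

In C++11 the constexpr function would have to be written as a single
return expression, typically using recursion, which is the kind of
restriction the "greater constexpr power" above refers to.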

C++14 was feature-complete in GCC 5, and became the default in GCC 6.  5.4.0
bootstraps trunk correctly; trunk stage1 built with 5.3.0 breaks in
eh_data_format_name due to PR69995.

gcc/ChangeLog:

	* doc/install.texi (Prerequisites): Update to C++14.

ChangeLog:

	* configure.ac: Update requirement

[PATCH v16b 2/4] gcc/: Rename array_type_nelts => array_type_nelts_minus_one

2024-10-16 Thread Alejandro Colomar
The old name was misleading.

While at it, also rename some temporary variables that are used with
this function, for consistency.
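
To make the off-by-one explicit, here is a minimal model of what the
renamed function reports.  The array_domain struct and the helper names
are hypothetical simplifications using plain integers instead of trees;
only the "element count minus one" behaviour is taken from the patch
(see the one_element_array_type_p hunk below, which tests the result
with integer_zerop).

```cpp
// Hypothetical model of an array's index domain [min_index, max_index].
struct array_domain
{
  long min_index;
  long max_index;
};

// What the renamed function reports: the element count minus one.
constexpr long nelts_minus_one (array_domain d)
{
  return d.max_index - d.min_index;
}

// The actual number of elements needs the explicit + 1.
constexpr long nelts (array_domain d)
{
  return nelts_minus_one (d) + 1;
}

// A one-element array is the case where the "minus one" value is zero.
constexpr bool one_element_array_p (array_domain d)
{
  return nelts_minus_one (d) == 0;
}
```

A caller reading the old name could easily take the result for the
element count and drop the "+ 1"; the new name makes that mistake harder.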

Link: 


gcc/ChangeLog:

* tree.cc (array_type_nelts, array_type_nelts_minus_one)
* tree.h (array_type_nelts, array_type_nelts_minus_one)
* expr.cc (count_type_elements)
* config/aarch64/aarch64.cc
(pure_scalable_type_info::analyze_array)
* config/i386/i386.cc (ix86_canonical_va_list_type):
Rename array_type_nelts => array_type_nelts_minus_one
The old name was misleading.

gcc/c/ChangeLog:

* c-decl.cc (one_element_array_type_p, get_parm_array_spec)
* c-fold.cc (c_fold_array_ref):
Rename array_type_nelts => array_type_nelts_minus_one

gcc/cp/ChangeLog:

* decl.cc (reshape_init_array)
* init.cc
(build_zero_init_1)
(build_value_init_noctor)
(build_vec_init)
(build_delete)
* lambda.cc (add_capture)
* tree.cc (array_type_nelts_top):
Rename array_type_nelts => array_type_nelts_minus_one

gcc/fortran/ChangeLog:

* trans-array.cc (structure_alloc_comps)
* trans-openmp.cc
(gfc_walk_alloc_comps)
(gfc_omp_clause_linear_ctor):
Rename array_type_nelts => array_type_nelts_minus_one

gcc/rust/ChangeLog:

* backend/rust-tree.cc (array_type_nelts_top):
Rename array_type_nelts => array_type_nelts_minus_one

Cc: Gabriel Ravier 
Cc: Martin Uecker 
Cc: Joseph Myers 
Cc: Xavier Del Campo Romero 
Cc: Jakub Jelinek 
Suggested-by: Richard Biener 
Signed-off-by: Alejandro Colomar 
---
 gcc/c/c-decl.cc   | 10 +-
 gcc/c/c-fold.cc   |  7 ---
 gcc/config/aarch64/aarch64.cc |  2 +-
 gcc/config/i386/i386.cc   |  2 +-
 gcc/cp/decl.cc|  2 +-
 gcc/cp/init.cc|  8 
 gcc/cp/lambda.cc  |  3 ++-
 gcc/cp/tree.cc|  2 +-
 gcc/expr.cc   |  8 
 gcc/fortran/trans-array.cc|  2 +-
 gcc/fortran/trans-openmp.cc   |  4 ++--
 gcc/rust/backend/rust-tree.cc |  2 +-
 gcc/tree.cc   |  4 ++--
 gcc/tree.h|  2 +-
 14 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 888966cb710..c91edaa8975 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -5367,7 +5367,7 @@ one_element_array_type_p (const_tree type)
 {
   if (TREE_CODE (type) != ARRAY_TYPE)
 return false;
-  return integer_zerop (array_type_nelts (type));
+  return integer_zerop (array_type_nelts_minus_one (type));
 }
 
 /* Determine whether TYPE is a zero-length array type "[0]".  */
@@ -6315,15 +6315,15 @@ get_parm_array_spec (const struct c_parm *parm, tree attrs)
  for (tree type = parm->specs->type; TREE_CODE (type) == ARRAY_TYPE;
   type = TREE_TYPE (type))
{
- tree nelts = array_type_nelts (type);
- if (error_operand_p (nelts))
+ tree nelts_minus_one = array_type_nelts_minus_one (type);
+ if (error_operand_p (nelts_minus_one))
return attrs;
- if (TREE_CODE (nelts) != INTEGER_CST)
+ if (TREE_CODE (nelts_minus_one) != INTEGER_CST)
{
  /* Each variable VLA bound is represented by the dollar
 sign.  */
  spec += "$";
- tpbnds = tree_cons (NULL_TREE, nelts, tpbnds);
+ tpbnds = tree_cons (NULL_TREE, nelts_minus_one, tpbnds);
}
}
  tpbnds = nreverse (tpbnds);
diff --git a/gcc/c/c-fold.cc b/gcc/c/c-fold.cc
index 57b67c74bd8..9ea174f79c4 100644
--- a/gcc/c/c-fold.cc
+++ b/gcc/c/c-fold.cc
@@ -73,11 +73,12 @@ c_fold_array_ref (tree type, tree ary, tree index)
   unsigned elem_nchars = (TYPE_PRECISION (elem_type)
  / TYPE_PRECISION (char_type_node));
   unsigned len = (unsigned) TREE_STRING_LENGTH (ary) / elem_nchars;
-  tree nelts = array_type_nelts (TREE_TYPE (ary));
+  tree nelts_minus_one = array_type_nelts_minus_one (TREE_TYPE (ary));
   bool dummy1 = true, dummy2 = true;
-  nelts = c_fully_fold_internal (nelts, true, &dummy1, &dummy2, false, false);
+  nelts_minus_one = c_fully_fold_internal (nelts_minus_one, true, &dummy1,
+  &dummy2, false, false);
   unsigned HOST_WIDE_INT i = tree_to_uhwi (index);
-  if (!tree_int_cst_le (index, nelts)
+  if (!tree_int_cst_le (index, nelts_minus_one)
   || i >= len
   || i + elem_nchars > len)
 return NULL_TREE;
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 5770491b30c..c3bd6f3339d 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -1081,7 +1081,7 @@ pure_scalable_type_info::analyze_

Re: [PATCH] c: Fix up uninitialized next.original_type use in #embed optimization

2024-10-16 Thread Joseph Myers
On Wed, 16 Oct 2024, Jakub Jelinek wrote:

> Hi!
> 
> Jonathan pointed me at a diagnostic from an unnamed static analyzer
> which found that next.original_type isn't initialized for the CPP_EMBED
> case when it is parsed in a comma expression, yet
>   expr.original_type = next.original_type;
> is done a few lines later and the expr is returned.
> 
> Fixed thusly, tested on x86_64-linux, ok for trunk?
> 
> 2024-10-16  Jakub Jelinek  
> 
>   * c-parser.cc (c_parser_expression): Initialize next.original_type
>   to integer_type_node for the CPP_EMBED case.

OK.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: testsuite: Prepare for -std=gnu23 default

2024-10-16 Thread Jakub Jelinek
On Wed, Oct 16, 2024 at 12:21:49PM +, Joseph Myers wrote:
> On Wed, 16 Oct 2024, Jakub Jelinek wrote:
> 
> > Generally, I'd prefer dg-additional-options for tests
> > which don't already have dg-options into which one can just add the new
> > flag, or as in the above cases where a new flag is added only conditionally.
> > Just am not 100% sure if it works in lto tests...
> 
> Here is a version preferring dg-additional-options to avoid adding 
> dg-options where not present or where the number of dg-options lines would 
> have to increase because of conditionals.  OK to commit subject to testing 
> (including making sure the lto tests do get built with the given options)?

Yes, thanks.

Jakub



Re: [PATCH] Ternary operator formatting fixes

2024-10-16 Thread Richard Biener
On Wed, 16 Oct 2024, Jakub Jelinek wrote:

> Hi!
> 
> While working on PR117028 C2Y changes, I've noticed weird ternary
> operator formatting (operand1 ? operand2: operand3).
> The usual formatting is operand1 ? operand2 : operand3
> where we have around 18000+ cases of that (counting only what fits
> on one line) and
> indent -nbad -bap -nbc -bbo -bl -bli2 -bls -ncdb -nce -cp1 -cs -di2 -ndj \
>-nfc1 -nfca -hnl -i2 -ip5 -lp -pcs -psl -nsc -nsob
> documented in
> https://www.gnu.org/prep/standards/html_node/Formatting.html#Formatting
> does the same.
> Some code was even trying to save space as much as possible and used
> operand1?operand2:operand3 or
> operand1 ? operand2:operand3
> 
> Today I've grepped for such cases (the grep was '?.*[^ ]:' and I had to
> skim through various false positives with that where the : matched e.g.
> stuff inside of strings, or *.md pattern macros or :: scope) and the
> following patch is a fix for what I found.
> 
> Built on x86_64-linux, ok for trunk?

OK.

Richard.

> 2024-10-16  Jakub Jelinek  
> 
> gcc/
>   * attribs.cc (lookup_scoped_attribute_spec): ?: operator formatting
>   fixes.
>   * basic-block.h (FOR_BB_INSNS_SAFE): Likewise.
>   * cfgcleanup.cc (outgoing_edges_match): Likewise.
>   * cgraph.cc (cgraph_node::dump): Likewise.
>   * config/arc/arc.cc (gen_acc1, gen_acc2): Likewise.
>   * config/arc/arc.h (CLASS_MAX_NREGS, CONSTANT_ADDRESS_P): Likewise.
>   * config/arm/arm.cc (arm_print_operand): Likewise.
>   * config/cris/cris.md (*b): Likewise.
>   * config/darwin.cc (darwin_asm_declare_object_name,
>   darwin_emit_common): Likewise.
>   * config/darwin-driver.cc (darwin_driver_init): Likewise.
>   * config/epiphany/epiphany.md (call, sibcall, call_value,
>   sibcall_value): Likewise.
>   * config/i386/i386.cc (gen_push2): Likewise.
>   * config/i386/i386.h (ix86_cur_cost): Likewise.
>   * config/i386/openbsdelf.h (FUNCTION_PROFILER): Likewise.
>   * config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins):
>   Likewise.
>   * config/loongarch/loongarch-cpu.cc (fill_native_cpu_config):
>   Likewise.
>   * config/riscv/riscv.cc (riscv_union_memmodels): Likewise.
>   * config/riscv/zc.md (*mva01s, *mvsa01): Likewise.
>   * config/rs6000/mmintrin.h (_mm_cmpeq_pi8, _mm_cmpgt_pi8,
>   _mm_cmpeq_pi16, _mm_cmpgt_pi16, _mm_cmpeq_pi32, _mm_cmpgt_pi32):
>   Likewise.
>   * config/v850/predicates.md (pattern_is_ok_for_prologue): Likewise.
>   * config/xtensa/constraints.md (d, C, W): Likewise.
>   * coverage.cc (coverage_begin_function, build_init_ctor,
>   build_gcov_exit_decl): Likewise.
>   * df-problems.cc (df_create_unused_note): Likewise.
>   * diagnostic.cc (diagnostic_set_caret_max_width): Likewise.
>   * diagnostic-path.cc (path_summary::path_summary): Likewise.
>   * expr.cc (expand_expr_divmod): Likewise.
>   * gcov.cc (format_gcov): Likewise.
>   * gcov-dump.cc (dump_gcov_file): Likewise.
>   * genmatch.cc (main): Likewise.
>   * incpath.cc (remove_duplicates, register_include_chains): Likewise.
>   * ipa-devirt.cc (dump_odr_type): Likewise.
>   * ipa-icf.cc (sem_item_optimizer::merge_classes): Likewise.
>   * ipa-inline.cc (inline_small_functions): Likewise.
>   * ipa-polymorphic-call.cc (ipa_polymorphic_call_context::dump):
>   Likewise.
>   * ipa-sra.cc (create_parameter_descriptors): Likewise.
>   * ipa-utils.cc (find_always_executed_bbs): Likewise.
>   * predict.cc (predict_loops): Likewise.
>   * selftest.cc (read_file): Likewise.
>   * sreal.h (SREAL_SIGN, SREAL_ABS): Likewise.
>   * tree-dump.cc (dequeue_and_dump): Likewise.
>   * tree-ssa-ccp.cc (bit_value_binop): Likewise.
> gcc/c-family/
>   * c-opts.cc (c_common_init_options, c_common_handle_option,
>   c_common_finish, set_std_c89, set_std_c99, set_std_c11,
>   set_std_c17, set_std_c23, set_std_cxx98, set_std_cxx11,
>   set_std_cxx14, set_std_cxx17, set_std_cxx20, set_std_cxx23,
>   set_std_cxx26): ?: operator formatting fixes.
> gcc/cp/
>   * search.cc (lookup_member): ?: operator formatting fixes.
>   * typeck.cc (cp_build_modify_expr): Likewise.
> libcpp/
>   * expr.cc (interpret_float_suffix): ?: operator formatting fixes.
> 
> --- gcc/attribs.cc.jj 2024-10-01 09:38:57.539968487 +0200
> +++ gcc/attribs.cc2024-10-16 12:22:13.136273474 +0200
> @@ -381,7 +381,7 @@ lookup_scoped_attribute_spec (const_tree
>struct substring attr;
>scoped_attributes *attrs;
>  
> -  const char *ns_str = (ns != NULL_TREE) ? IDENTIFIER_POINTER (ns): NULL;
> +  const char *ns_str = (ns != NULL_TREE) ? IDENTIFIER_POINTER (ns) : NULL;
>  
>attrs = find_attribute_namespace (ns_str);
>  
> --- gcc/basic-block.h.jj  2024-01-03 11:51:30.094751134 +0100
> +++ gcc/basic-block.h 2024-10-16 12:21:59.863461369 +0200
> @@ -224,7 +224,7 @@ enum cfg_bb_flags
>  /* For itera

[PATCH v1] contrib/: Configure git-format-patch(1) to add To: gcc-patches@gcc.gnu.org

2024-10-16 Thread Alejandro Colomar
Just like we already do for git-send-email(1).  In some cases, patches
are prepared with git-format-patch(1), but are sent with a different
program, or some flags to git-send-email(1) may accidentally inhibit the
configuration.  By adding the TO in the email file, we make sure that
gcc-patches@ will receive the patch.

contrib/ChangeLog:

* gcc-git-customization.sh: Configure git-format-patch(1) to add
'To: gcc-patches@gcc.gnu.org'.

Signed-off-by: Alejandro Colomar 
---
 contrib/gcc-git-customization.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index 54bd35ea1aa..1cd1b4472f0 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -43,6 +43,7 @@ git config diff.md.xfuncname '^\(define.*$'
 
 # Tell git send-email where patches go.
 # ??? Maybe also set sendemail.tocmd to guess from MAINTAINERS?
+git config format.to 'gcc-patches@gcc.gnu.org'
 git config sendemail.to 'gcc-patches@gcc.gnu.org'
 
 set_user=$(git config --get "user.name")
-- 
2.45.2





[PATCH] SVE intrinsics: Add constant folding for svindex.

2024-10-16 Thread Jennifer Schmitz
This patch folds svindex with constant arguments into a vector series.
We implemented this in svindex_impl::fold using the function build_vec_series.
For example,
svuint64_t f1 ()
{
  return svindex_u64 (10, 3);
}
compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...}
in the gimple pass lower.
This optimization benefits cases where svindex is used in combination with
other gimple-level optimizations.
For example,
svuint64_t f2 ()
{
return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5);
}
has previously been compiled to
f2:
index   z0.d, #10, #3
mul z0.d, z0.d, #5
ret
Now, it is compiled to
f2:
mov x0, 50
index   z0.d, x0, #15
ret

For non-constant arguments, build_vec_series produces a VEC_SERIES_EXPR,
which is translated back at RTL level to an index instruction without codegen
changes.
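
The arithmetic behind the f2 example can be checked with a small standalone
sketch.  It models a vector series on the host with std::vector; vec_series
and mul_by are hypothetical helper names and have nothing to do with GCC's
build_vec_series beyond computing the same element values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// First n elements of the series base, base + step, base + 2 * step, ...
std::vector<std::int64_t> vec_series(std::int64_t base, std::int64_t step,
                                     std::size_t n) {
  std::vector<std::int64_t> v(n);
  for (std::size_t i = 0; i < n; ++i)
    v[i] = base + static_cast<std::int64_t>(i) * step;
  return v;
}

// Element-wise multiplication by a constant, as svmul_x does in f2.
std::vector<std::int64_t> mul_by(std::vector<std::int64_t> v, std::int64_t k) {
  for (auto &x : v)
    x *= k;
  return v;
}
```

Multiplying the series (10, 3) by 5 gives the series (50, 15), which is why
the folded f2 can use a single index instruction with base 50 and step 15.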

We added test cases checking
- the application of the transform during gimple for constant arguments,
- the interaction with another gimple-level optimization.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz 

gcc/
* config/aarch64/aarch64-sve-builtins-base.cc
(svindex_impl::fold): Add constant folding.

gcc/testsuite/
* gcc.target/aarch64/sve/index_const_fold.c: New test.
---
 .../aarch64/aarch64-sve-builtins-base.cc  | 12 +++
 .../gcc.target/aarch64/sve/index_const_fold.c | 35 +++
 2 files changed, 47 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 1c17149e1f0..f6b1657ecbb 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -1304,6 +1304,18 @@ public:
 
 class svindex_impl : public function_base
 {
+public:
+  gimple *
+  fold (gimple_folder &f) const override
+  {
+tree vec_type = TREE_TYPE (f.lhs);
+tree base = gimple_call_arg (f.call, 0);
+tree step = gimple_call_arg (f.call, 1);
+
+return gimple_build_assign (f.lhs,
+   build_vec_series (vec_type, base, step));
+  }
+
 public:
   rtx
   expand (function_expander &e) const override
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c b/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c
new file mode 100644
index 000..7abb803f58b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#include 
+#include 
+
+#define INDEX_CONST(TYPE, TY)  \
+  sv##TYPE f_##TY##_index_const () \
+  {\
+return svindex_##TY (10, 3);   \
+  }
+
+#define MULT_INDEX(TYPE, TY)   \
+  sv##TYPE f_##TY##_mult_index ()  \
+  {\
+return svmul_x (svptrue_b8 (), \
+   svindex_##TY (10, 3),   \
+   5); \
+  }
+
+#define ALL_TESTS(TYPE, TY)\
+  INDEX_CONST (TYPE, TY)   \
+  MULT_INDEX (TYPE, TY)
+
+ALL_TESTS (uint8_t, u8)
+ALL_TESTS (uint16_t, u16)
+ALL_TESTS (uint32_t, u32)
+ALL_TESTS (uint64_t, u64)
+ALL_TESTS (int8_t, s8)
+ALL_TESTS (int16_t, s16)
+ALL_TESTS (int32_t, s32)
+ALL_TESTS (int64_t, s64)
+
+/* { dg-final { scan-tree-dump-times "return \\{ 10, 13, 16, ... \\}" 8 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return \\{ 50, 65, 80, ... \\}" 8 "optimized" } } */
-- 
2.44.0





Re: [PATCH] SVE intrinsics: Add constant folding for svindex.

2024-10-16 Thread Jennifer Schmitz
I resubmitted a corrected version of this patch in
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665648.html

> On 16 Oct 2024, at 10:32, Jennifer Schmitz  wrote:
> 
> This patch folds svindex with constant arguments into a vector series.
> We implemented this in svindex_impl::fold using the function build_vec_series.
> For example,
> svuint64_t f1 ()
> {
>  return svindex_u64 (10, 3);
> }
> compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...}
> in the gimple pass lower.
> This optimization benefits cases where svindex is used in combination with
> other gimple-level optimizations.
> For example,
> svuint64_t f2 ()
> {
>return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5);
> }
> has previously been compiled to
> f2:
>index   z0.d, #10, #3
>mul z0.d, z0.d, #5
>ret
> Now, it is compiled to
> f2:
>mov x0, 50
>index   z0.d, x0, #15
>ret
> 
> For non-constant arguments, build_vec_series produces a VEC_SERIES_EXPR,
> which is translated back at RTL level to an index instruction without codegen
> changes.
> 
> We added test cases checking
> - the application of the transform during gimple for constant arguments,
> - the interaction with another gimple-level optimization.
> 
> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
> OK for mainline?
> 
> Signed-off-by: Jennifer Schmitz 
> 
> gcc/
> * config/aarch64/aarch64-sve-builtins-base.cc
> (svindex_impl::fold): Add constant folding.
> 
> gcc/testsuite/
> * gcc.target/aarch64/sve/index_const_fold.c: New test.
> ---
> .../aarch64/aarch64-sve-builtins-base.cc  | 12 +++
> .../gcc.target/aarch64/sve/index_const_fold.c | 35 +++
> 2 files changed, 47 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c
> 
> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> index 1c17149e1f0..f6b1657ecbb 100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> @@ -1304,6 +1304,18 @@ public:
> 
> class svindex_impl : public function_base
> {
> +public:
> +  gimple *
> +  fold (gimple_folder &f) const override
> +  {
> +tree vec_type = TREE_TYPE (f.lhs);
> +tree base = gimple_call_arg (f.call, 0);
> +tree step = gimple_call_arg (f.call, 1);
> +
> +return gimple_build_assign (f.lhs,
> + build_vec_series (vec_type, base, step));
> +  }
> +
> public:
>   rtx
>   expand (function_expander &e) const override
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c b/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c
> new file mode 100644
> index 000..f5e6c0f7a85
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +#include 
> +#include 
> +
> +#define INDEX_CONST(TYPE, TY) \
> +  sv##TYPE f_##TY##_index_const () \
> +  { \
> +return svindex_##TY (10, 3); \
> +  }
> +
> +#define MULT_INDEX(TYPE, TY) \
> +  sv##TYPE f_##TY##_mult_index () \
> +  { \
> +return svmul_x (svptrue_b8 (), \
> +svindex_##TY (10, 3), \
> +5); \
> +  }
> +
> +#define ALL_TESTS(TYPE, TY) \
> +  INDEX_CONST (TYPE, TY) \
> +  MULT_INDEX (TYPE, TY)
> +
> +ALL_TESTS (uint8_t, u8)
> +ALL_TESTS (uint16_t, u16)
> +ALL_TESTS (uint32_t, u32)
> +ALL_TESTS (uint64_t, u64)
> +ALL_TESTS (int8_t, s8)
> +ALL_TESTS (int16_t, s16)
> +ALL_TESTS (int32_t, s32)
> +ALL_TESTS (int64_t, s64)
> +
> +/* { dg-final { scan-tree-dump "return \\{ 10, 13, 16, ... \\}" 8 "optimized" } } */
> +/* { dg-final { scan-tree-dump "return \\{ 50, 65, 80, ... \\}" 8 "optimized" } } */
> -- 
> 2.44.0






[PATCH v8] c++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]

2024-10-16 Thread Simon Martin
Hi Jason,

On 12 Oct 2024, at 4:51, Jason Merrill wrote:

> On 10/11/24 7:02 AM, Simon Martin wrote:
>> Hi Jason,
>>
>> On 11 Oct 2024, at 0:35, Jason Merrill wrote:
>>
>>> On 10/7/24 3:35 PM, Simon Martin wrote:
 On 7 Oct 2024, at 18:58, Jason Merrill wrote:
> On 10/7/24 11:27 AM, Simon Martin wrote:
>>>
>>  /* Now give a warning for all base functions without overriders,
>> as they are hidden.  */
>>  for (tree base_fndecl : base_fndecls)
>> +  {
>> +if (!base_fndecl || overriden_base_fndecls.contains
>> (base_fndecl))
>> +  continue;
>> +tree *hider = hidden_base_fndecls.get (base_fndecl);
>> +if (hider)
>
> How about looping over hidden_base_fndecls instead of base_fndecls?
>>>
 Unfortunately it does not work because a given base method can be
 hidden by one overload and overriden by another, in which case we
 don’t want to warn (see for example AA:foo(int) in
 Woverloaded-virt7.C). So we need to take both collections into
 account.
>>>
>>> Yes, you'd still need to check overridden_base_fndecls.contains, but
>>> that doesn't seem any different iterating over hidden_base_fndecls
>>> instead of base_fndecls.
>>
>> Sure, and I guess iterating over hidden_base_fndecls is more coherent
>> with what the warning is about. Changed in the attached updated patch,
>> successfully tested on x86_64-pc-linux-gnu. OK?
>
> OK, thanks.
As you know the patch had to be reverted due to PR117114, that
highlighted a bunch of issues with comparing DECL_VINDEXes: it might
give false positives in case of multiple inheritance (the case in
PR117114), but also if there’s single inheritance but the hierarchy has
more than two levels (another issue I found while bootstrapping with
rust enabled).

The attached updated patch introduces an overrides_p function, based on
the existing check_final_overrider, and uses it when the signatures match.

It’s been successfully tested on x86_64-pc-linux-gnu, and bootstrap
works fine with --enable-languages=all (and rust properly configured, so
included here). OK for trunk?

Thanks, Simon

From f7dd5910423b4d09d06e07c9c2d29086d09edc30 Mon Sep 17 00:00:00 2001
From: Simon Martin 
Date: Tue, 15 Oct 2024 15:18:30 +0200
Subject: [PATCH] c++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]

We currently emit an incorrect -Woverloaded-virtual warning upon the
following test case

=== cut here ===
struct A {
  virtual operator int() { return 42; }
  virtual operator char() = 0;
};
struct B : public A {
  operator char() { return 'A'; }
};
=== cut here ===

The problem is that when iterating over ovl_range (fns), warn_hidden
gets confused by the conversion operator marker, concludes that
seen_non_override is true and therefore emits a warning for all
conversion operators in A that do not convert to char, even if
-Woverloaded-virtual is 1 (e.g. with -Wall, the case reported).

A second set of problems is highlighted when -Woverloaded-virtual is 2.

First, with the same test case, since base_fndecls contains all
conversion operators in A (except the one to char, that's been removed
when iterating over ovl_range (fns)), we emit a spurious warning for
the conversion operator to int, even though it's unrelated.

Second, in case there are several conversion operators with different
cv-qualifiers to the same type in A, we rightfully emit a warning,
however the note uses the location of the conversion operator marker
instead of the right one; location_of should go over conv_op_marker.

This patch fixes all these by explicitly keeping track of (1) base
methods that are overridden, as well as (2) base methods that are hidden
but not overridden (and by what), and warning about methods that are in
(2) but not (1). It also ignores non virtual base methods, per
"definition" of -Woverloaded-virtual.

Successfully tested on x86_64-pc-linux-gnu.
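
The bookkeeping described above (warn about base methods in set (2) that are
not in set (1)) can be sketched as a plain set difference.  This is a
simplified illustration and not the code in warn_hidden: methods are
identified by strings, the hiding function is not recorded, and
methods_to_warn is a hypothetical name.

```cpp
#include <cassert>
#include <set>
#include <string>

// Returns the base methods that are hidden by some derived method but
// are not overridden by any derived method.
std::set<std::string> methods_to_warn(const std::set<std::string> &hidden,
                                      const std::set<std::string> &overridden) {
  std::set<std::string> result;
  for (const auto &method : hidden)
    if (overridden.count(method) == 0)
      result.insert(method);
  return result;
}
```

Both collections are needed because a base method can be hidden by one
overload and overridden by another, in which case no warning is wanted.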

PR c++/117114
PR c++/109918

gcc/cp/ChangeLog:

* class.cc (warn_hidden): Keep track of overloaded and of hidden
base methods. Mention the actual hiding function in the warning,
not the first overload.
* cp-tree.h (overrides_p): New.
* error.cc (location_of): Skip over conv_op_marker.
* search.cc (check_final_overrider): Add parameter to control
whether diagnostics should be emitted.
(overrides_p): New.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Woverloaded-virt1.C: Check that no warning is
emitted for non virtual base methods.
* g++.dg/warn/Woverloaded-virt10.C: New test.
* g++.dg/warn/Woverloaded-virt11.C: New test.
* g++.dg/warn/Woverloaded-virt12.C: New test.
* g++.dg/warn/Woverloaded-virt13.C: New test.
* g++.dg/warn/Woverloaded-virt5.C: New test.
* g++.dg/warn/Woverloaded-virt6.C: New test.
* g++.dg/warn/Woverloaded-virt7.C: New test.
 

[PATCH] c: Add u{,l,ll,imax}abs builtins [PR117024]

2024-10-16 Thread Jakub Jelinek
Hi!

The following patch adds u{,l,ll,imax}abs builtins, which just fold
to ABSU_EXPR, similarly to how {,l,ll,imax}abs builtins fold to
ABS_EXPR.
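
As a rough model of what an absolute value with an unsigned result computes,
consider the following sketch.  It is not how GCC implements the builtin;
uabs_sketch is a hypothetical name, and the example assumes the usual
two's-complement int.

```cpp
#include <cassert>
#include <climits>

// Hypothetical model of uabs semantics; not GCC's implementation.
constexpr unsigned int uabs_sketch(int x) {
  // Convert to unsigned first and negate in unsigned arithmetic, so that
  // INT_MIN does not trigger signed overflow as -x would.
  return x < 0 ? 0u - static_cast<unsigned int>(x)
               : static_cast<unsigned int>(x);
}
```

Unlike abs, whose result for INT_MIN is not representable in int, the
unsigned result type can hold the magnitude of every int value.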

Tested on x86_64-linux, ok for trunk if it passes full bootstrap/regtest
on x86_64-linux and i686-linux?

2024-10-16  Jakub Jelinek  

PR c/117024
gcc/
* coretypes.h (enum function_class): Add function_c2y_misc
enumerator.
* builtin-types.def (BT_FN_UINTMAX_INTMAX, BT_FN_ULONG_LONG,
BT_FN_ULONGLONG_LONGLONG): New DEF_FUNCTION_TYPE_1s.
* builtins.def (DEF_C2Y_BUILTIN): Define.
(BUILT_IN_UABS, BUILT_IN_UIMAXABS, BUILT_IN_ULABS,
BUILT_IN_ULLABS): New builtins.
* builtins.cc (fold_builtin_abs): Handle also folding of u*abs
to ABSU_EXPR.
(fold_builtin_1): Handle BUILT_IN_U{,L,LL,IMAX}ABS.
gcc/lto/ChangeLog:
* lto-lang.cc (flag_isoc2y): New variable.
gcc/ada/ChangeLog:
* gcc-interface/utils.cc (flag_isoc2y): New variable.
gcc/testsuite/
* gcc.c-torture/execute/builtins/lib/abs.c (uintmax_t): New typedef.
(uabs, ulabs, ullabs, uimaxabs): New functions.
* gcc.c-torture/execute/builtins/uabs-1.c: New test.
* gcc.c-torture/execute/builtins/uabs-1.x: New file.
* gcc.c-torture/execute/builtins/uabs-1-lib.c: New file.
* gcc.c-torture/execute/builtins/uabs-2.c: New test.
* gcc.c-torture/execute/builtins/uabs-2.x: New file.
* gcc.c-torture/execute/builtins/uabs-2-lib.c: New file.
* gcc.c-torture/execute/builtins/uabs-3.c: New test.
* gcc.c-torture/execute/builtins/uabs-3.x: New file.
* gcc.c-torture/execute/builtins/uabs-3-lib.c: New file.

--- gcc/coretypes.h.jj  2024-09-24 11:31:48.672622744 +0200
+++ gcc/coretypes.h 2024-10-16 16:05:51.329632572 +0200
@@ -421,7 +421,8 @@ enum function_class {
   function_c99_math_complex,
   function_sincos,
   function_c11_misc,
-  function_c23_misc
+  function_c23_misc,
+  function_c2y_misc
 };
 
 /* Enumerate visibility settings.  This is deliberately ordered from most
--- gcc/builtin-types.def.jj2024-01-03 11:51:37.930642379 +0100
+++ gcc/builtin-types.def   2024-10-16 16:17:42.424573438 +0200
@@ -252,6 +252,7 @@ DEF_FUNCTION_TYPE_0 (BT_FN_DFLOAT128, BT
 
 DEF_FUNCTION_TYPE_1 (BT_FN_LONG_LONG, BT_LONG, BT_LONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_LONGLONG_LONGLONG, BT_LONGLONG, BT_LONGLONG)
+DEF_FUNCTION_TYPE_1 (BT_FN_UINTMAX_INTMAX, BT_UINTMAX, BT_INTMAX)
 DEF_FUNCTION_TYPE_1 (BT_FN_INTMAX_INTMAX, BT_INTMAX, BT_INTMAX)
 DEF_FUNCTION_TYPE_1 (BT_FN_FLOAT_FLOAT, BT_FLOAT, BT_FLOAT)
 DEF_FUNCTION_TYPE_1 (BT_FN_DOUBLE_DOUBLE, BT_DOUBLE, BT_DOUBLE)
@@ -396,7 +397,9 @@ DEF_FUNCTION_TYPE_1 (BT_FN_UINT_CONST_PT
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONG_PTR, BT_ULONG, BT_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONG_CONST_PTR, BT_ULONG, BT_CONST_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONG_ULONG, BT_ULONG, BT_ULONG)
+DEF_FUNCTION_TYPE_1 (BT_FN_ULONG_LONG, BT_ULONG, BT_LONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONGLONG_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG)
+DEF_FUNCTION_TYPE_1 (BT_FN_ULONGLONG_LONGLONG, BT_ULONGLONG, BT_LONGLONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT8_FLOAT, BT_INT8, BT_FLOAT)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT16_FLOAT, BT_INT16, BT_FLOAT)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT32_FLOAT, BT_UINT32, BT_FLOAT)
--- gcc/builtins.def.jj 2024-08-12 10:49:12.043616424 +0200
+++ gcc/builtins.def2024-10-16 16:34:26.228365799 +0200
@@ -164,6 +164,14 @@ along with GCC; see the file COPYING3.
   true, true, !flag_isoc23, ATTRS, \
   targetm.libc_has_function (function_c23_misc, NULL_TREE), true)
 
+/* Like DEF_LIB_BUILTIN, except that the function is only a part of
+   the standard in C2Y or above.  */
+#undef DEF_C2Y_BUILTIN
+#define DEF_C2Y_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
+  DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
+  true, true, !flag_isoc2y, ATTRS, \
+  targetm.libc_has_function (function_c2y_misc, NULL_TREE), true)
+
 /* Like DEF_C99_BUILTIN, but for complex math functions.  */
 #undef DEF_C99_COMPL_BUILTIN
 #define DEF_C99_COMPL_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
@@ -1069,6 +1077,10 @@ DEF_GCC_BUILTIN(BUILT_IN_SETJMP,
 DEF_EXT_LIB_BUILTIN(BUILT_IN_STRFMON, "strfmon", BT_FN_SSIZE_STRING_SIZE_CONST_STRING_VAR, ATTR_FORMAT_STRFMON_NOTHROW_3_4)
 DEF_LIB_BUILTIN(BUILT_IN_STRFTIME, "strftime", BT_FN_SIZE_STRING_SIZE_CONST_STRING_CONST_TM_PTR, ATTR_FORMAT_STRFTIME_NOTHROW_3_0)
 DEF_GCC_BUILTIN(BUILT_IN_TRAP, "trap", BT_FN_VOID, ATTR_NORETURN_NOTHROW_LEAF_COLD_LIST)
+DEF_C2Y_BUILTIN(BUILT_IN_UABS, "uabs", BT_FN_UINT_INT, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_C2Y_BUILTIN(BUILT_IN_UIMAXABS, "uimaxabs", BT_FN_UINTMAX_INTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_C2Y_BUILTIN(BUILT_IN_ULABS, "ulabs", BT_FN_ULONG_LONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_C2Y_BUILTIN(BUILT_IN_ULLABS, "ullabs", BT_FN_ULONGLONG_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 

[PATCH] libcpp: Fix ICE lexing invalid raw string in a deferred pragma [PR117118]

2024-10-16 Thread Lewis Hyatt
Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117118

This fixes an old regression from GCC 11. Is it OK for trunk and all
backports please? Bootstrap + regtested all languages on x86-64 Linux.
Thanks!

-Lewis

-- >8 --

The PR shows that we ICE after lexing an invalid unterminated raw string,
because lex_raw_string() pops the main buffer unexpectedly. Resolve by
handling this case the same way as for other directives.

libcpp/ChangeLog:
PR preprocessor/117118
* lex.cc (lex_raw_string): Treat an unterminated raw string the same
way for a deferred pragma as is done for other directives.

gcc/testsuite/ChangeLog:
PR preprocessor/117118
* c-c++-common/raw-string-directive-3.c: New test.
* c-c++-common/raw-string-directive-4.c: New test.
---
 libcpp/lex.cc   | 3 ++-
 gcc/testsuite/c-c++-common/raw-string-directive-3.c | 8 
 gcc/testsuite/c-c++-common/raw-string-directive-4.c | 8 
 3 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/c-c++-common/raw-string-directive-3.c
 create mode 100644 gcc/testsuite/c-c++-common/raw-string-directive-4.c

diff --git a/libcpp/lex.cc b/libcpp/lex.cc
index bb5cd394ab4..f9c5a070410 100644
--- a/libcpp/lex.cc
+++ b/libcpp/lex.cc
@@ -2655,7 +2655,8 @@ lex_raw_string (cpp_reader *pfile, cpp_token *token, const uchar *base)
{
  pos--;
  pfile->buffer->cur = pos;
- if ((pfile->state.in_directive || pfile->state.parsing_args)
+ if ((pfile->state.in_directive || pfile->state.parsing_args
+  || pfile->state.in_deferred_pragma)
  && pfile->buffer->next_line >= pfile->buffer->rlimit)
{
  cpp_error_with_line (pfile, CPP_DL_ERROR, token->src_loc, 0,
diff --git a/gcc/testsuite/c-c++-common/raw-string-directive-3.c b/gcc/testsuite/c-c++-common/raw-string-directive-3.c
new file mode 100644
index 000..fa4fa979fce
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/raw-string-directive-3.c
@@ -0,0 +1,8 @@
+/* { dg-options "-std=gnu99" { target c } } */
+/* { dg-options "-std=c++11" { target c++ } } */
+
+/* { dg-error "invalid new-line in raw string delimiter" "" { target *-*-* } .+4 } */
+/* { dg-error "unterminated raw string" "" { target *-*-* } .+3 } */
+/* { dg-error "stray 'R' in program" "" { target *-*-* } .+2 } */
+/* { dg-warning "expected a string" "" { target *-*-* } .+1 } */
+#pragma message R""
diff --git a/gcc/testsuite/c-c++-common/raw-string-directive-4.c b/gcc/testsuite/c-c++-common/raw-string-directive-4.c
new file mode 100644
index 000..935e3a1a991
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/raw-string-directive-4.c
@@ -0,0 +1,8 @@
+/* { dg-options "-std=gnu99" { target c } } */
+/* { dg-options "-std=c++11" { target c++ } } */
+
+/* { dg-error "invalid new-line in raw string delimiter" "" { target *-*-* } .+4 } */
+/* { dg-error "unterminated raw string" "" { target *-*-* } .+3 } */
+/* { dg-error "stray 'R' in program" "" { target *-*-* } .+2 } */
+/* { dg-warning "expected a string" "" { target *-*-* } .+1 } */
+_Pragma("message R\"\"")


Re: [PATCH] reassoc: Do not sort likely vectorizable ops by rank.

2024-10-16 Thread Richard Biener
On Wed, 16 Oct 2024, Robin Dapp wrote:

> > Interesting - this is bleh | bswap (..), right, so having
> > bla1 | (bleh | bla2) fails to recognize bla1 | bla2 as bswap.
> 
> Yes, exactly.
> 
> > I'd expect this kind of pattern to fail bswap detection easily
> > if you mangle it a bit.  So possibly bswap detection should learn
> > to better pick the "pieces" from a chain of IORs ...
> 
> Yeah, it also appeared brittle to me.  But I could understand that
> we want a pass that sorts by rank before such an optimization to
> make life easier.
> 
> > Consider bswap (..) | bswap (..) badly intertwined for example.
> >
> > > Or did I miss the point?
> >
> > No, but as we can see it's a "trade-off" ...
> 
> As always :)  So a way forward with this would be to not do the
> swap before vectorization and enhance bswap so we won't regress
> for those two cases?

Yes.

Richard.


Re: [PATCH] c: Add some checking asserts to named loops handling code

2024-10-16 Thread Joseph Myers
On Wed, 16 Oct 2024, Jakub Jelinek wrote:

> Hi!
> 
> Jonathan mentioned an unnamed static analyzer reported issue in
> c_finish_bc_name.
> It is actually a false positive, because the construction of the
> loop_names vector guarantees that the last element of the vector
> (if the vector is non-empty) always has either
> C_DECL_LOOP_NAME (l) or C_DECL_SWITCH_NAME (l) (or both) flags
> set, so c will be always non-NULL after the if at the start of the
> loops.
> The following patch is an attempt to help those static analyzers
> (though dunno if it actually helps), by adding a checking assert.
> 
> Tested on x86_64-linux, ok for trunk?
> 
> 2024-10-16  Jakub Jelinek  
> 
>   * c-decl.cc (c_get_loop_names): Add checking assert that
>   c is non-NULL in the loop.
>   (c_finish_bc_name): Likewise.

OK.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH 3/7] libstdc++: Inline memmove optimizations for std::copy etc. [PR115444]

2024-10-16 Thread Jonathan Wakely
On Tue, 15 Oct 2024 at 15:29, Jonathan Wakely  wrote:
>
> This is a slightly different approach to C++98 compatibility than used
> in patch 1/1 of this series for the uninitialized algos. It worked out a
> bit cleaner this way for these algos, I think.
>
> Tested x86_64-linux.
>
> -- >8 --
>
> This removes all the __copy_move class template specializations that
> decide how to optimize std::copy and std::copy_n. We can inline those
> optimizations into the algorithms, using if-constexpr (and macros for
> C++98 compatibility) and remove the code dispatching to the various
> class template specializations.
>
> Doing this means we implement the optimization directly for std::copy_n
> instead of deferring to std::copy. That avoids the unwanted consequence
> of advancing the iterator in copy_n only to take the difference later to
> get back to the length that we already had in copy_n originally (as
> described in PR 115444).
>
> With the new flattened implementations, we can also lower contiguous
> iterators to pointers in std::copy/std::copy_n/std::copy_backward, so
> that they benefit from the same memmove optimizations as pointers.
> There's a subtlety though: contiguous iterators can potentially throw
> exceptions to exit the algorithm early.  So we can only transform the
> loop to memmove if dereferencing the iterator is noexcept. We don't
> check that incrementing the iterator is noexcept because we advance the
> contiguous iterators before using memmove, so that if incrementing would
> throw, that happens first. I am writing a proposal (P3249R0) which would

Oops, got my own paper number wrong, I've fixed that locally.
The paper is online now:
https://isocpp.org/files/papers/P3349R0.html

> make this unnecessary, so I hope we can drop the nothrow requirements
> later.
>
> This change also solves PR 114817 by checking is_trivially_assignable
> before optimizing copy/copy_n etc. to memmove. It's not enough to check
> that the types are trivially copyable (a precondition for using memmove
> at all), we also need to check that the specific assignment that would
> be performed by the algorithm is also trivial. Replacing a non-trivial
> assignment with memmove would be observable, so not allowed.
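
The distinction drawn in the paragraph above can be sketched as follows. This is a minimal illustration under stated assumptions: the type `Counted` and the helper `memmove_would_be_unobservable` are hypothetical and are not taken from the patch or its tests. The type is trivially copyable, yet assigning from a non-const lvalue selects a user-provided operator, so replacing that assignment with memmove would change behaviour:

```cpp
#include <type_traits>

// Hypothetical element type.  The template below is never a copy or move
// assignment operator, so it does not affect trivial copyability, but it
// wins overload resolution when the source is a non-const lvalue.
struct Counted
{
  int value;

  template<typename T>
  Counted& operator=(T&& other)
  {
    value = other.value + 1;  // observable side effect
    return *this;
  }
};

// Hypothetical check: Src is the type of the expression being assigned,
// for example what dereferencing the input iterator yields.
template<typename Dest, typename Src>
constexpr bool memmove_would_be_unobservable()
{
  return std::is_trivially_copyable<Dest>::value
         && std::is_trivially_assignable<Dest&, Src>::value;
}
```

With this type, a copy from `const Counted*` uses the implicit, trivial copy assignment, while a copy from `Counted*` uses the template and must not be turned into memmove.
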
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/115444
> PR libstdc++/114817
> * include/bits/stl_algo.h (__copy_n): Remove generic overload
> and overload for random access iterators.
> (copy_n): Inline generic version of __copy_n here. Do not defer
> to std::copy for random access iterators.
> * include/bits/stl_algobase.h (__copy_move): Remove.
> (__nothrow_contiguous_iterator, __memcpyable_iterators): New
> concepts.
> (__assign_one, _GLIBCXX_TO_ADDR, _GLIBCXX_ADVANCE): New helpers.
> (__copy_move_a2): Inline __copy_move logic and conditional
> memmove optimization into the most generic overload.
> (__copy_n_a): Likewise.
> (__copy_move_backward): Remove.
> (__copy_move_backward_a2): Inline __copy_move_backward logic and
> memmove optimization into the most generic overload.
> * 
> testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc:
> New test.
> * 
> testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc:
> New test.
> * testsuite/25_algorithms/copy/114817.cc: New test.
> * testsuite/25_algorithms/copy/115444.cc: New test.
> * testsuite/25_algorithms/copy_n/114817.cc: New test.
> ---
>  libstdc++-v3/include/bits/stl_algo.h  |  24 +-
>  libstdc++-v3/include/bits/stl_algobase.h  | 426 +-
>  .../uninitialized_copy/114817.cc  |  39 ++
>  .../uninitialized_copy_n/114817.cc|  39 ++
>  .../testsuite/25_algorithms/copy/114817.cc|  38 ++
>  .../testsuite/25_algorithms/copy/115444.cc|  93 
>  .../testsuite/25_algorithms/copy_n/114817.cc  |  38 ++
>  7 files changed, 469 insertions(+), 228 deletions(-)
>  create mode 100644 
> libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
>  create mode 100644 
> libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc
>
> diff --git a/libstdc++-v3/include/bits/stl_algo.h 
> b/libstdc++-v3/include/bits/stl_algo.h
> index a1ef665506d..489ce7e14d2 100644
> --- a/libstdc++-v3/include/bits/stl_algo.h
> +++ b/libstdc++-v3/include/bits/stl_algo.h
> @@ -665,25 +665,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>return __result;
>  }
>
> -  template
> -_GLIBCXX20_CONSTEXPR
> -_OutputIterator
> -__copy_n(_InputIterator __first, _Size __n,
> -_OutputIterator __result, input_iterator_tag)
> -{
> -  return 

Re: PING - [PATCH][LRA][PR116550] Reuse scratch registers generated by LRA

2024-10-16 Thread Denis Chertykov
Just a reminder with CCing maintainer (Vladimir Makarov).
Denis.


Re: [PATCH RFC] build: update bootstrap req to C++14

2024-10-16 Thread Jakub Jelinek
On Wed, Oct 16, 2024 at 11:04:32AM -0400, Jason Merrill wrote:
> > Alternatively, systems (that care about Ada and D) running 4.7 could
> > build 10.5, systems running 4.8 could build 11.5.
> 
> Here's an updated patch.  I tested C++14 bootstrap again with 5.x compilers,
> and Jakub's dwarf2asm change breaks on 5.3 due to PR69995, while 5.4

The dwarf2asm and libcpp _cpp_trigraph_map changes were just optimizations,
so if we wanted, we could just guard it with additional __GCC_PREREQ (5, 4)
or similar.

> successfully bootstraps.
> 
> I also added the 9.5 recommendation.

> From 87e90d3677a6211b5bb9fc6865b987203a819108 Mon Sep 17 00:00:00 2001
> From: Jason Merrill 
> Date: Tue, 17 Sep 2024 17:38:35 -0400
> Subject: [PATCH] build: update bootstrap req to C++14
> To: gcc-patches@gcc.gnu.org
> 
> This implements my proposal to update our bootstrap requirement to C++14.
> The big benefit of the change is the greater constexpr power, but C++14 also
> added variable templates, generic lambdas, lambda init-capture, binary
> literals, and numeric literal digit separators.
> 
> C++14 was feature-complete in GCC 5, and became the default in GCC 6.  5.4.0
> bootstraps trunk correctly; trunk stage1 built with 5.3.0 breaks in
> eh_data_format_name due to PR69995.
> 
> gcc/ChangeLog:
> 
>   * doc/install.texi (Prerequisites): Update to C++14.
> 
> ChangeLog:
> 
>   * configure.ac: Update requirement to C++14.
>   * configure: Regenerate.

Ok from my side, but please give Richi and others a week to disagree before
committing.

Jakub
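
For reference, the C++14 features listed in the commit message can be shown in a few lines. This is only an illustrative sketch with hypothetical names, not code from the GCC sources:

```cpp
// Variable template.
template<typename T>
constexpr T bits_per_byte = T(8);

// Binary literal with digit separators.
constexpr unsigned mask = 0b1111'0000;

// Relaxed constexpr: local variables and loops are allowed since C++14.
constexpr unsigned popcount_loop(unsigned v)
{
  unsigned n = 0;
  while (v != 0)
    {
      n += v & 1u;
      v >>= 1;
    }
  return n;
}

// Generic lambda with an init-capture.
inline long scaled_sum(int a, long b)
{
  auto add = [scale = 2](auto x, auto y) { return scale * (x + y); };
  return add(a, b);
}
```

None of these compile in C++11 mode, which is why the bootstrap requirement has to move before they can be used in GCC itself.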



Re: [PATCH v11] ada: fix timeval timespec on 32 bits archs with 64 bits time_t [PR114065]

2024-10-16 Thread Arnaud Charlet
> The base types are unchanged, the (non-private) subtypes defined in
> System.C_Time only add range constraints preventing some obviously
> invalid values. I do not understand how this could break existing
> code, could you please give an example?

OK, that should work indeed, so I withdraw my comment.

> > It may be surprising to have the RTEMS file used by other OS. The
> > original comment should have mentioned that in the first place, but
> > the file was only used with RTEMS. With your change, the file is
> > effectively shared, so it would be best to rename it.
> 
> Could you please suggest an appropriate file name?  This may be
> obvious for you, but with my limited knowledge of GNAT internals, the
> diff between s-osprim__rtems.adb and __unix/optide.adb is not
> sufficient to guess why a separate implementation is/was required.

Would it be possible to drop it altogether and use s-osprim__posix.adb instead?
Otherwise what's the remaining difference between s-osprim__posix.adb and 
s-osprim__rtems.adb? The difference should help us find a proper name based on 
properties of the file.

> Version 12 is attached.

Can you post also a diff between version 11 and version 12? It's not practical 
to review the complete changes from scratch at this stage, the patch is too big.

Thanks!

Arno


[PATCH] c: Fix up uninitialized next.original_type use in #embed optimization

2024-10-16 Thread Jakub Jelinek
Hi!

Jonathan pointed me at a diagnostic from an unnamed static analyzer
which found that next.original_type isn't initialized for the CPP_EMBED
case when it is parsed in a comma expression, yet
  expr.original_type = next.original_type;
is done a few lines later and the expr is returned.

Fixed thusly, tested on x86_64-linux, ok for trunk?

2024-10-16  Jakub Jelinek  

* c-parser.cc (c_parser_expression): Initialize next.original_type
to integer_type_node for the CPP_EMBED case.

--- gcc/c/c-parser.cc.jj2024-10-16 10:32:27.0 +0200
+++ gcc/c/c-parser.cc   2024-10-16 14:09:44.393913829 +0200
@@ -13299,6 +13299,7 @@ c_parser_expression (c_parser *parser)
  next.value = build_int_cst (TREE_TYPE (val),
  ((const unsigned char *)
   RAW_DATA_POINTER (val))[last]);
+ next.original_type = integer_type_node;
  c_parser_consume_token (parser);
}
   else

Jakub



[PATCH v16b 0/4] c: Add __countof__ operator

2024-10-16 Thread Alejandro Colomar
Hi!

v16 changes:

-  Remove () from commit messages in function names.  [Joseph]
-  Use __countof__ (name of the actual operator) in diagnostic messages.
   [Joseph]
-  Add CC (sorry for not CCing the patches to most people in v15; I
   accidentally suppressed most of them while sending).

v16b is a resend, since I accidentally didn't send them to the mailing
list.

Have a lovely day!
Alex

Alejandro Colomar (4):
  contrib/: Add support for Cc: and Link: tags
  gcc/: Rename array_type_nelts => array_type_nelts_minus_one
  gcc/: Merge definitions of array_type_nelts_top
  c: Add __countof__ operator

 contrib/gcc-changelog/git_commit.py|   5 +-
 gcc/c-family/c-common.cc   |  26 +
 gcc/c-family/c-common.def  |   3 +
 gcc/c-family/c-common.h|   2 +
 gcc/c/c-decl.cc|  32 --
 gcc/c/c-fold.cc|   7 +-
 gcc/c/c-parser.cc  |  62 +++---
 gcc/c/c-tree.h |   4 +
 gcc/c/c-typeck.cc  | 118 ++-
 gcc/config/aarch64/aarch64.cc  |   2 +-
 gcc/config/i386/i386.cc|   2 +-
 gcc/cp/cp-tree.h   |   1 -
 gcc/cp/decl.cc |   2 +-
 gcc/cp/init.cc |   8 +-
 gcc/cp/lambda.cc   |   3 +-
 gcc/cp/tree.cc |  13 ---
 gcc/doc/extend.texi|  30 +
 gcc/expr.cc|   8 +-
 gcc/fortran/trans-array.cc |   2 +-
 gcc/fortran/trans-openmp.cc|   4 +-
 gcc/rust/backend/rust-tree.cc  |  13 ---
 gcc/rust/backend/rust-tree.h   |   2 -
 gcc/testsuite/gcc.dg/countof-compile.c | 115 +++
 gcc/testsuite/gcc.dg/countof-vla.c |  46 
 gcc/testsuite/gcc.dg/countof.c | 150 +
 gcc/tree.cc|  17 ++-
 gcc/tree.h |   3 +-
 27 files changed, 600 insertions(+), 80 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c

Range-diff:
1:  b6da2185675 = 1:  eac2d18d8a0 contrib/: Add support for Cc: and Link: tags
2:  a0fa3f139f9 ! 2:  7418a11fcd6 gcc/: Rename array_type_nelts() => 
array_type_nelts_minus_one()
@@ Metadata
 Author: Alejandro Colomar 
 
  ## Commit message ##
-gcc/: Rename array_type_nelts() => array_type_nelts_minus_one()
+gcc/: Rename array_type_nelts => array_type_nelts_minus_one
 
 The old name was misleading.
 
@@ Commit message
 * config/aarch64/aarch64.cc
 (pure_scalable_type_info::analyze_array)
 * config/i386/i386.cc (ix86_canonical_va_list_type):
-Rename array_type_nelts() => array_type_nelts_minus_one()
+Rename array_type_nelts => array_type_nelts_minus_one
 The old name was misleading.
 
 gcc/c/ChangeLog:
 
 * c-decl.cc (one_element_array_type_p, get_parm_array_spec)
 * c-fold.cc (c_fold_array_ref):
-Rename array_type_nelts() => array_type_nelts_minus_one()
+Rename array_type_nelts => array_type_nelts_minus_one
 
 gcc/cp/ChangeLog:
 
@@ Commit message
 (build_delete)
 * lambda.cc (add_capture)
 * tree.cc (array_type_nelts_top):
-Rename array_type_nelts() => array_type_nelts_minus_one()
+Rename array_type_nelts => array_type_nelts_minus_one
 
 gcc/fortran/ChangeLog:
 
@@ Commit message
 * trans-openmp.cc
 (gfc_walk_alloc_comps)
 (gfc_omp_clause_linear_ctor):
-Rename array_type_nelts() => array_type_nelts_minus_one()
+Rename array_type_nelts => array_type_nelts_minus_one
 
 gcc/rust/ChangeLog:
 
 * backend/rust-tree.cc (array_type_nelts_top):
-Rename array_type_nelts() => array_type_nelts_minus_one()
+Rename array_type_nelts => array_type_nelts_minus_one
 
 Cc: Gabriel Ravier 
 Cc: Martin Uecker 
3:  43a2e18c6a2 ! 3:  0cfae0598b3 Merge definitions of array_type_nelts_top()
@@ Metadata
 Author: Alejandro Colomar 
 
  ## Commit message ##
-Merge definitions of array_type_nelts_top()
+gcc/: Merge definitions of array_type_nelts_top
 
 There were two identical definitions, and none of them are available
 where they are needed for implementing __nelementsof__.  Merge them, 
and
4:  8a6959d2d38 ! 4:  12a30a2a6fd c: Add __countof__ operator
@@ Commit message
 (pop_maybe_used)
 (is_top_array_vla)
 (c_expr_countof_expr,

[PATCH v16b 4/4] c: Add __countof__ operator

2024-10-16 Thread Alejandro Colomar
This operator is similar to sizeof but can only be applied to an array,
and returns its number of elements.

FUTURE DIRECTIONS:

-  We should make it work with array parameters to functions,
   and somehow magically return the number of elements of the array,
   regardless of it being really a pointer.

-  Fix support for [0].
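
As a rough illustration of the intended semantics, the sketch below mimics the operator with a C++ helper. `__countof__` itself is a C operator added by this patch; `count_of` and the arrays here are hypothetical stand-ins. Like the operator, the helper accepts only arrays, and for a multidimensional array it counts the elements of the top-level array only:

```cpp
#include <cstddef>

// Hypothetical stand-in: accepts arrays only and returns the number of
// elements of the outermost dimension.  Passing a pointer fails to compile.
template<typename T, std::size_t N>
constexpr std::size_t count_of(const T (&)[N]) noexcept
{
  return N;
}

int samples[7];
int grid[3][5];

// The traditional idiom that such an operator is meant to replace.  It
// silently yields a wrong answer if `samples` were a pointer instead.
constexpr std::size_t samples_by_sizeof = sizeof samples / sizeof samples[0];
```

The advantage over the sizeof division is the diagnostic: applying the helper (or the operator) to a non-array is rejected instead of producing a meaningless number.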

gcc/ChangeLog:

* doc/extend.texi: Document __countof__ operator.

gcc/c-family/ChangeLog:

* c-common.h
* c-common.def
* c-common.cc (c_countof_type): Add __countof__ operator.

gcc/c/ChangeLog:

* c-tree.h
(c_expr_countof_expr, c_expr_countof_type)
* c-decl.cc
(start_struct, finish_struct)
(start_enum, finish_enum)
* c-parser.cc
(c_parser_sizeof_expression)
(c_parser_countof_expression)
(c_parser_sizeof_or_countof_expression)
(c_parser_unary_expression)
* c-typeck.cc
(build_external_ref)
(record_maybe_used_decl)
(pop_maybe_used)
(is_top_array_vla)
(c_expr_countof_expr, c_expr_countof_type):
Add __countof__ operator.

gcc/testsuite/ChangeLog:

* gcc.dg/countof-compile.c
* gcc.dg/countof-vla.c
* gcc.dg/countof.c: Add tests for __countof__ operator.

Link: 
Link: 
Link: 

Link: 
Link: 
Link: 
Link: 
Link: 
Suggested-by: Xavier Del Campo Romero 
Co-authored-by: Martin Uecker 
Acked-by: "James K. Lowden" 
Cc: Joseph Myers 
Cc: Gabriel Ravier 
Cc: Jakub Jelinek 
Cc: Kees Cook 
Cc: Qing Zhao 
Cc: Jens Gustedt 
Cc: David Brown 
Cc: Florian Weimer 
Cc: Andreas Schwab 
Cc: Timm Baeder 
Cc: Daniel Plakosh 
Cc: "A. Jiang" 
Cc: Eugene Zelenko 
Cc: Aaron Ballman 
Cc: Paul Koning 
Cc: Daniel Lundin 
Cc: Nikolaos Strimpas 
Cc: JeanHeyd Meneide 
Cc: Fernando Borretti 
Cc: Jonathan Protzenko 
Cc: Chris Bazley 
Cc: Ville Voutilainen 
Cc: Alex Celeste 
Cc: Jakub Łukasiewicz 
Cc: Douglas McIlroy 
Cc: Jason Merrill 
Cc: "Gustavo A. R. Silva" 
Cc: Patrizia Kaye 
Cc: Ori Bernstein 
Cc: Robert Seacord 
Cc: Marek Polacek 
Cc: Sam James 
Cc: Richard Biener 
Signed-off-by: Alejandro Colomar 
---
 gcc/c-family/c-common.cc   |  26 +
 gcc/c-family/c-common.def  |   3 +
 gcc/c-family/c-common.h|   2 +
 gcc/c/c-decl.cc|  22 +++-
 gcc/c/c-parser.cc  |  62 +++---
 gcc/c/c-tree.h |   4 +
 gcc/c/c-typeck.cc  | 118 ++-
 gcc/doc/extend.texi|  30 +
 gcc/testsuite/gcc.dg/countof-compile.c | 115 +++
 gcc/testsuite/gcc.dg/countof-vla.c |  46 
 gcc/testsuite/gcc.dg/countof.c | 150 +
 11 files changed, 554 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 7494a2dac0a..9f48fea6543 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -466,6 +466,7 @@ const struct c_common_resword c_common_reswords[] =
   { "__inline",RID_INLINE, 0 },
   { "__inline__",  RID_INLINE, 0 },
   { "__label__",   RID_LABEL,  0 },
+  { "__countof__", RID_COUNTOF,0 },
   { "__null",  RID_NULL,   0 },
   { "__real",  RID_REALPART,   0 },
   { "__real__",RID_REALPART,   0 },
@@ -4071,6 +4072,31 @@ c_alignof_expr (location_t loc, tree expr)
 
   return fold_convert_loc (loc, size_type_node, t);
 }
+
+/* Implement the countof keyword:
+   Return the number of elements of an array.  */
+
+tree
+c_countof_type (location_t loc, tree type)
+{
+  enum tree_code type_code;
+
+  type_code = TREE_CODE (type);
+  if (type_code != ARRAY_TYPE)
+{
+  error_at (loc, "invalid application of %<__countof__%> to type %qT", 
type);
+  return error_mark_node;
+}
+  if (!COMPLETE_TYPE_P (type))
+{
+  error_at (loc,
+   "invalid application of %<__countof__%> to incomplete type %qT",
+   type);
+  return error_mark_node;
+}
+
+  return array_type_nelts_top (type);
+}
 
 /* Handle C and C++ default attributes.  */
 
diff --git a/gcc/c-family/c-common.def b/gcc/c-family/c-common.def
index dc49ad09e2f..f2ae784cefe 100644
--- a/gcc/c-family/c-common.def
+++ b/gcc/c-family/c-common.def
@@ -50,6 +50,9 @@ DEFTREECODE (E

[PATCH v16b 3/4] gcc/: Merge definitions of array_type_nelts_top

2024-10-16 Thread Alejandro Colomar
There were two identical definitions, and neither of them is available
where it is needed for implementing __countof__.  Merge them, and
provide the single definition in gcc/tree.{h,cc}, where it's available
for __countof__, which will be added in the following commit.

gcc/ChangeLog:

* tree.h (array_type_nelts_top)
* tree.cc (array_type_nelts_top):
Define function (moved from gcc/cp/).

gcc/cp/ChangeLog:

* cp-tree.h (array_type_nelts_top)
* tree.cc (array_type_nelts_top):
Remove function (move to gcc/).

gcc/rust/ChangeLog:

* backend/rust-tree.h (array_type_nelts_top)
* backend/rust-tree.cc (array_type_nelts_top):
Remove function.

Signed-off-by: Alejandro Colomar 
---
 gcc/cp/cp-tree.h  |  1 -
 gcc/cp/tree.cc| 13 -
 gcc/rust/backend/rust-tree.cc | 13 -
 gcc/rust/backend/rust-tree.h  |  2 --
 gcc/tree.cc   | 13 +
 gcc/tree.h|  1 +
 6 files changed, 14 insertions(+), 29 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 94ee550bd9c..a44100a2bc4 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -8121,7 +8121,6 @@ extern tree build_exception_variant   (tree, 
tree);
 extern void fixup_deferred_exception_variants   (tree, tree);
 extern tree bind_template_template_parm(tree, tree);
 extern tree array_type_nelts_total (tree);
-extern tree array_type_nelts_top   (tree);
 extern bool array_of_unknown_bound_p   (const_tree);
 extern tree break_out_target_exprs (tree, bool = false);
 extern tree build_ctor_subob_ref   (tree, tree, tree);
diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index 3cac8ac4df1..c80ee068958 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -3076,19 +3076,6 @@ cxx_print_statistics (void)
 depth_reached);
 }
 
-/* Return, as an INTEGER_CST node, the number of elements for TYPE
-   (which is an ARRAY_TYPE).  This counts only elements of the top
-   array.  */
-
-tree
-array_type_nelts_top (tree type)
-{
-  return fold_build2_loc (input_location,
- PLUS_EXPR, sizetype,
- array_type_nelts_minus_one (type),
- size_one_node);
-}
-
 /* Return, as an INTEGER_CST node, the number of elements for TYPE
(which is an ARRAY_TYPE).  This one is a recursive count of all
ARRAY_TYPEs that are clumped together.  */
diff --git a/gcc/rust/backend/rust-tree.cc b/gcc/rust/backend/rust-tree.cc
index 8d32e5203ae..3dc6b076711 100644
--- a/gcc/rust/backend/rust-tree.cc
+++ b/gcc/rust/backend/rust-tree.cc
@@ -859,19 +859,6 @@ is_empty_class (tree type)
   return CLASSTYPE_EMPTY_P (type);
 }
 
-// forked from gcc/cp/tree.cc array_type_nelts_top
-
-/* Return, as an INTEGER_CST node, the number of elements for TYPE
-   (which is an ARRAY_TYPE).  This counts only elements of the top
-   array.  */
-
-tree
-array_type_nelts_top (tree type)
-{
-  return fold_build2_loc (input_location, PLUS_EXPR, sizetype,
- array_type_nelts_minus_one (type), size_one_node);
-}
-
 // forked from gcc/cp/tree.cc builtin_valid_in_constant_expr_p
 
 /* Test whether DECL is a builtin that may appear in a
diff --git a/gcc/rust/backend/rust-tree.h b/gcc/rust/backend/rust-tree.h
index 26c8b653ac6..e597c3ab81d 100644
--- a/gcc/rust/backend/rust-tree.h
+++ b/gcc/rust/backend/rust-tree.h
@@ -2993,8 +2993,6 @@ extern location_t rs_expr_location (const_tree);
 extern int
 is_empty_class (tree type);
 
-extern tree array_type_nelts_top (tree);
-
 extern bool
 is_really_empty_class (tree, bool);
 
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 94c6d086bd7..b40f4d31b2f 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -3732,6 +3732,19 @@ array_type_nelts_minus_one (const_tree type)
  ? max
  : fold_build2 (MINUS_EXPR, TREE_TYPE (max), max, min));
 }
+
+/* Return, as an INTEGER_CST node, the number of elements for TYPE
+   (which is an ARRAY_TYPE).  This counts only elements of the top
+   array.  */
+
+tree
+array_type_nelts_top (tree type)
+{
+  return fold_build2_loc (input_location,
+ PLUS_EXPR, sizetype,
+ array_type_nelts_minus_one (type),
+ size_one_node);
+}
 
 /* If arg is static -- a reference to an object in static storage -- then
return the object.  This is not the same as the C meaning of `static'.
diff --git a/gcc/tree.h b/gcc/tree.h
index c996821c953..f4c89f5477c 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -4930,6 +4930,7 @@ extern tree build_method_type (tree, tree);
 extern tree build_offset_type (tree, tree);
 extern tree build_complex_type (tree, bool named = false);
 extern tree array_type_nelts_minus_one (const_tree);
+extern tree array_type_nelts_top (tree);
 
 extern tree value_member (tree, tree);
 extern tree purpose_member (const_tree, tree);
-- 
2.45.2




[committed] Add libgomp.oacc-fortran/acc_on_device-1-4.f (was: Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Harmonize 'libgomp.oacc-fortran/acc_on_device-1-*'

2024-10-16 Thread Tobias Burnus

Fixed the following commit

Thomas Schwinge wrote:

> Pushed to trunk branch
> commit 9f549d216c9716e787aaa38593bc9f83195b60ae
> "Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP'
> __builtin_is_initial_device: Harmonize 'libgomp.oacc-fortran/acc_on_device-1-*'",
> see attached.

by re-adding a testcase which actually tests the builtin.

Committed as r15-4388-gee4fdda70f1080.

Tobias
commit ee4fdda70f1080bba5e49cadebc44333e19edeb4
Author: Tobias Burnus 
Date:   Wed Oct 16 16:15:40 2024 +0200

Add libgomp.oacc-fortran/acc_on_device-1-4.f

Kind of undoes r15-4315-g9f549d216c9716 by adding the original testcase back;
namely, adding acc_on_device-1-3.f as acc_on_device-1-4.f with
-fno-builtin-acc_on_device removed.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-fortran/acc_on_device-1-4.f: New test;
same as acc_on_device-1-3.f but using the builtin function.
---
 .../libgomp.oacc-fortran/acc_on_device-1-4.f   | 60 ++
 1 file changed, 60 insertions(+)

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-4.f b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-4.f
new file mode 100644
index 000..401d3a372b3
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-4.f
@@ -0,0 +1,60 @@
+! { dg-do run }
+! { dg-additional-options "-cpp" }
+
+! As acc_on_device-1-3.f, but using the acc_on_device builtin.
+
+! { dg-additional-options "-fopt-info-all-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-all-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
+  IMPLICIT NONE
+  INCLUDE "openacc_lib.h"
+
+!Host.
+
+  IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NONE)) STOP 1
+  IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_HOST)) STOP 2
+  IF (ACC_ON_DEVICE (ACC_DEVICE_NOT_HOST)) STOP 3
+  IF (ACC_ON_DEVICE (ACC_DEVICE_NVIDIA)) STOP 4
+  IF (ACC_ON_DEVICE (ACC_DEVICE_RADEON)) STOP 4
+
+
+!Host via offloading fallback mode.
+
+!$ACC PARALLEL IF(.FALSE.)
+! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } .-1 }
+!TODO Unhandled 'CONST_DECL' instances for constant arguments in 'acc_on_device' calls.
+  IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NONE)) STOP 5
+  IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_HOST)) STOP 6
+  IF (ACC_ON_DEVICE (ACC_DEVICE_NOT_HOST)) STOP 7
+  IF (ACC_ON_DEVICE (ACC_DEVICE_NVIDIA)) STOP 8
+  IF (ACC_ON_DEVICE (ACC_DEVICE_RADEON)) STOP 8
+!$ACC END PARALLEL
+
+
+#if !ACC_DEVICE_TYPE_host
+
+! Offloaded.
+
+!$ACC PARALLEL
+! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target { ! openacc_host_selected } } .-1 }
+  IF (ACC_ON_DEVICE (ACC_DEVICE_NONE)) STOP 9
+  IF (ACC_ON_DEVICE (ACC_DEVICE_HOST)) STOP 10
+  IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NOT_HOST)) STOP 11
+#if ACC_DEVICE_TYPE_nvidia
+  IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NVIDIA)) STOP 12
+#else
+  IF (ACC_ON_DEVICE (ACC_DEVICE_NVIDIA)) STOP 13
+#endif
+#if ACC_DEVICE_TYPE_radeon
+  IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_RADEON)) STOP 14
+#else
+  IF (ACC_ON_DEVICE (ACC_DEVICE_RADEON)) STOP 15
+#endif
+!$ACC END PARALLEL
+
+#endif
+
+  END


[PATCH v3] MATCH: Simplify `a rrotate (32-b) -> a lrotate b` [PR109906]

2024-10-16 Thread Eikansh Gupta
The pattern `a rrotate (32-b)` should be optimized to `a lrotate b`.
The same is also true for `a lrotate (32-b)`. It can be optimized to
`a rrotate b`.

This patch adds following patterns:
a rrotate (32-b) -> a lrotate b
a lrotate (32-b) -> a rrotate b
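
The identity behind these two patterns can be checked with a small sketch. The helpers below are hypothetical (they are not the GIMPLE rotate codes, and they mask the count so that every shift is well defined); the check covers rotate counts 1 to 31 for a 32-bit value:

```cpp
#include <cstdint>

// Rotate left by n bits; the count is taken modulo 32.
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned n)
{
  return (x << (n & 31u)) | (x >> ((32u - n) & 31u));
}

// Rotate right by n bits; the count is taken modulo 32.
constexpr std::uint32_t rotr32(std::uint32_t x, unsigned n)
{
  return (x >> (n & 31u)) | (x << ((32u - n) & 31u));
}

// Check both directions of the identity for every count in [1, 31].
constexpr bool rotate_identity_holds(std::uint32_t x)
{
  for (unsigned b = 1; b < 32; ++b)
    if (rotr32(x, 32u - b) != rotl32(x, b)
        || rotl32(x, 32u - b) != rotr32(x, b))
      return false;
  return true;
}
```

Rotating right by 32-b moves every bit the same distance as rotating left by b, so the subtraction can be dropped by switching the rotate direction.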

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/109906

gcc/ChangeLog:

* match.pd (a rrotate (32-b) -> a lrotate b): New pattern
(a lrotate (32-b) -> a rrotate b): New pattern

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr109906.c: New test.

Signed-off-by: Eikansh Gupta 
---
 gcc/match.pd |  9 ++
 gcc/testsuite/gcc.dg/tree-ssa/pr109906.c | 41 
 2 files changed, 50 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr109906.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 5ec31ef6269..078ef050351 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4861,6 +4861,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
build_int_cst (TREE_TYPE (@1),
   element_precision (type)), @1); }))
 
+/* a rrotate (32-b) -> a lrotate b */
+/* a lrotate (32-b) -> a rrotate b */
+(for rotate (lrotate rrotate)
+ orotate (rrotate lrotate)
+ (simplify
+  (rotate @0 (minus INTEGER_CST@1 @2))
+   (if (TYPE_PRECISION (TREE_TYPE (@0)) == wi::to_wide (@1))
+ (orotate @0 @2))))
+
 /* Turn (a OP c1) OP c2 into a OP (c1+c2).  */
 (for op (lrotate rrotate rshift lshift)
  (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr109906.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr109906.c
new file mode 100644
index 000..9aa015d8c65
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr109906.c
@@ -0,0 +1,41 @@
+/* PR tree-optimization/109906 */
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized-raw" } */
+/* { dg-require-effective-target int32 } */
+
+/* Implementation of rotate right operation */
+static inline
+unsigned rrotate(unsigned x, int t)
+{
+  if (t >= 32) __builtin_unreachable();
+  unsigned tl = x >> (t);
+  unsigned th = x << (32 - t);
+  return tl | th;
+}
+
+/* Here rotate left is achieved by doing rotate right by (32 - x) */
+unsigned rotateleft(unsigned t, int x)
+{
+  return rrotate (t, 32 - x);
+}
+
+/* Implementation of rotate left operation */
+static inline
+unsigned lrotate(unsigned x, int t)
+{
+  if (t >= 32) __builtin_unreachable();
+  unsigned tl = x << (t);
+  unsigned th = x >> (32 - t);
+  return tl | th;
+}
+
+/* Here rotate right is achieved by doing rotate left by (32 - x) */
+unsigned rotateright(unsigned t, int x)
+{
+  return lrotate (t, 32 - x);
+}
+
+/* Shouldn't have instruction for (32 - x). */
+/* { dg-final { scan-tree-dump-not "minus_expr" "optimized" } } */
+/* { dg-final { scan-tree-dump "rrotate_expr" "optimized" } } */
+/* { dg-final { scan-tree-dump "lrotate_expr" "optimized" } } */
-- 
2.17.1



Re: [committed] libstdc++: Fix Python deprecation warning in printers.py

2024-10-16 Thread Tom Tromey
> "Jonathan" == Jonathan Wakely  writes:

Jonathan> also unable to build GDB with Python 3? ... maybe unlikely) but nobody
Jonathan> should be stuck on 3.0 and unable to replace that with 3.12 or so.

gdb has one user (maybe more, but I doubt it) on Windows XP where the
latest available version is 3.4.

Jonathan> If we no longer care about Python 2 there are a few places in that
Jonathan> file we could clean up.

Python 2 support was dropped from gdb in December 2021.  Unfortunately
this wasn't mentioned in gdb/NEWS, but according to git this seems to
have happened in gdb 13.  So I think it's safe to fix this up.

Tom


[PATCH v2] MATCH: Simplify `(trunc)copysign ((extend)x, CST)` to `copysign (x, -1.0/1.0)` [PR112472]

2024-10-16 Thread Eikansh Gupta
This patch simplifies `(trunc)copysign ((extend)x, CST)` to
`copysign (x, -1.0/1.0)`, depending on the sign of CST.  Previously, it
was simplified to `copysign (x, CST)`.  This is valid because only the
sign of CST matters, not its value.

The patch also simplifies `(trunc)abs (extend x)` to `abs (x)`.
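
The two equivalences can be sketched at the source level as below. This is an illustration only, assuming the GCC/Clang `__builtin_*` functions are usable in constant expressions; the wrapper names are hypothetical and the sketch does not model the match.pd conditions such as signaling NaNs:

```cpp
// Before: extend to double, copysign with the constant, truncate back.
constexpr float copysign_via_double(float x, double cst)
{
  return static_cast<float>(__builtin_copysign(static_cast<double>(x), cst));
}

// After: stay in float and keep only the sign of the constant (+1.0 or -1.0).
constexpr float copysign_direct(float x, double cst)
{
  return __builtin_copysignf(x, static_cast<float>(__builtin_copysign(1.0, cst)));
}

// Before: (float) abs ((double) x).
constexpr float abs_via_double(float x)
{
  return static_cast<float>(__builtin_fabs(static_cast<double>(x)));
}

// After: abs (x) directly in float.
constexpr float abs_direct(float x)
{
  return __builtin_fabsf(x);
}
```

Because every float is exactly representable as a double, the round trip through the wider type cannot change the magnitude, which is what makes both rewrites value-preserving.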

PR tree-optimization/112472

gcc/ChangeLog:

* match.pd ((trunc)copysign ((extend)x, -CST) --> copysign (x, -1.0)): 
New pattern.
((trunc)abs (extend x) --> abs (x)): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr112472.c: New test.

Signed-off-by: Eikansh Gupta 
---
 gcc/match.pd | 20 +---
 gcc/testsuite/gcc.dg/tree-ssa/pr112472.c | 22 ++
 2 files changed, 39 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr112472.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 940292d0d49..a1668c4c9dc 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -8527,15 +8527,29 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
x,y is float value, similar for _Float16/double.  */
 (for copysigns (COPYSIGN_ALL)
  (simplify
-  (convert (copysigns (convert@2 @0) (convert @1)))
+  (convert (copysigns (convert@2 @0) (convert2? @1)))
(if (optimize
&& !HONOR_SNANS (@2)
&& types_match (type, TREE_TYPE (@0))
-   && types_match (type, TREE_TYPE (@1))
&& TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (@2))
&& direct_internal_fn_supported_p (IFN_COPYSIGN,
  type, OPTIMIZE_FOR_BOTH))
-(IFN_COPYSIGN @0 @1))))
+(if (types_match (type, TREE_TYPE (@1)))
+ (IFN_COPYSIGN @0 @1)
+ /* (trunc)copysign ((extend)x, CST) to copysign (x, -1.0/1.0) */
+ (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))
+  (IFN_COPYSIGN @0 { build_minus_one_cst (type); })
+  (IFN_COPYSIGN @0 { build_one_cst (type); }))))))
+
+/* (trunc)abs (extend x) --> abs (x)
+   x is a float value */
+(simplify
+ (convert (abs (convert@1 @0)))
+  (if (optimize
+  && !HONOR_SNANS (@1)
+  && types_match (type, TREE_TYPE (@0))
+  && TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (@1)))
+   (abs @0)))
 
 (for froms (BUILT_IN_FMAF BUILT_IN_FMA BUILT_IN_FMAL)
  tos (IFN_FMA IFN_FMA IFN_FMA)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr112472.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr112472.c
new file mode 100644
index 000..8f97278ffe8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr112472.c
@@ -0,0 +1,22 @@
+/* PR tree-optimization/112472 */
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized" } */
+
+/* Optimized to .COPYSIGN(a, -1.0e+0) */
+float f(float a)
+{
+  return (float)__builtin_copysign(a, -3.0);
+}
+
+/* This gets converted to (float) abs((double) a)
+   With the patch it is optimized to abs(a) */
+float f2(float a)
+{
+  return (float)__builtin_copysign(a, 5.0);
+}
+
+/* { dg-final { scan-tree-dump-not "= __builtin_copysign" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " double " "optimized" { target ifn_copysign } } } */
+/* { dg-final { scan-tree-dump-times ".COPYSIGN" 1 "optimized" { target ifn_copysign } } } */
+/* { dg-final { scan-tree-dump-times "-1.0e\\+0" 1 "optimized" { target ifn_copysign } } } */
+/* { dg-final { scan-tree-dump-times " ABS_EXPR " 1 "optimized" { target ifn_copysign } } } */
-- 
2.17.1
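
The two narrowing rules in this patch rest on the fact that extending a float to double and truncating it back is exact, so only the sign of the constant matters. A minimal sketch of that equivalence, checked on a handful of values, might look as follows. The function names are hypothetical and the code is ordinary C++, not the match.pd pattern itself.

```cpp
#include <cassert>
#include <cmath>

// Wide form: extend to double, copy the sign of a constant, truncate back.
// The constant -3.0 comes from the test case; only its sign is relevant.
inline float copysign_wide (float a)
{
  return (float) std::copysign ((double) a, -3.0);
}

// Narrow form: copysign directly in float with -1.0f as the sign source.
inline float copysign_narrow (float a)
{
  return std::copysign (a, -1.0f);
}

// Wide and narrow absolute value, mirroring the second rule.
inline float abs_wide (float a) { return (float) std::fabs ((double) a); }
inline float abs_narrow (float a) { return std::fabs (a); }

// Compare both forms on a few inputs, including signed zeros.
inline void check_narrowing ()
{
  const float inputs[] = { 0.0f, -0.0f, 1.5f, -2.25f, 1e30f, -1e-30f };
  for (float a : inputs)
    {
      assert (copysign_wide (a) == copysign_narrow (a));
      assert (std::signbit (copysign_wide (a))
	      == std::signbit (copysign_narrow (a)));
      assert (abs_wide (a) == abs_narrow (a));
      assert (!std::signbit (abs_wide (a)));
    }
}
```

This does not cover signaling NaNs, which the patterns exclude via the !HONOR_SNANS condition.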



Re: [PATCH v1 3/4] aarch64: improve assembly debug comments for build attributes

2024-10-16 Thread Matthieu Longo

On 2024-10-08 18:45, Richard Sandiford wrote:

Matthieu Longo  writes:

The previous implementation to emit build attributes did not support
string values (asciz) in aeabi_subsection, and was not emitting values
associated to tags in the assembly comments.

This new approach provides a more user-friendly interface relying on
typing, and improves the emitted assembly comments:
   * aeabi_attribute:
 ** Adds the interpreted value next to the tag in the assembly
 comment.
 ** Supports asciz values.
   * aeabi_subsection:
 ** Adds debug information for its parameters.
 ** Auto-detects the attribute types when declaring the subsection.

Additionally, it is also interesting to note that the code was moved
to a separate file to improve modularity and "relieve" the 1000-lines
long aarch64.cc file from a few lines. Finally, it introduces a new
namespace "aarch64::" for the AArch64 backend, which reduces the length
of function names by not prepending 'aarch64_' to each of them.

gcc/ChangeLog:

* config/aarch64/aarch64.cc
(aarch64_emit_aeabi_attribute): Delete.
(aarch64_emit_aeabi_subsection): Delete.
(aarch64_start_file): Use aeabi_subsection.
* config/aarch64/aarch64-dwarf-metadata.h: New file.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/build-attributes/build-attribute-gcs.c:
  Improve test to match debugging comments in assembly.
* gcc.target/aarch64/build-attributes/build-attribute-standard.c:
  Likewise.


Generally this looks really nice!  Just some minor comments below.
Most of them are formatting nits; sorry that we don't have automatic
formatting to take care of this stuff.


All the formatting comments are addressed in the next revision.


---
  gcc/config/aarch64/aarch64-dwarf-metadata.h   | 247 ++
  gcc/config/aarch64/aarch64.cc |  43 +--
  .../build-attributes/build-attribute-gcs.c|   4 +-
  .../build-attribute-standard.c|   4 +-
  4 files changed, 261 insertions(+), 37 deletions(-)
  create mode 100644 gcc/config/aarch64/aarch64-dwarf-metadata.h

diff --git a/gcc/config/aarch64/aarch64-dwarf-metadata.h 
b/gcc/config/aarch64/aarch64-dwarf-metadata.h
new file mode 100644
index 000..9638bc7702f
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-dwarf-metadata.h
@@ -0,0 +1,247 @@
+/* Machine description for AArch64 architecture.
+   Copyright (C) 2009-2024 Free Software Foundation, Inc.
+   Contributed by ARM Ltd.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_AARCH64_DWARF_METADATA_H
+#define GCC_AARCH64_DWARF_METADATA_H
+
+#include 
+#include 


As mentioned in the part 1 review, we need to include system headers
via system.h.  These two should be included by default, though,
so I think they can just be removed.



Fixed.


+#include 


"vec.h"



Fixed.


+
+namespace aarch64 {
+
+enum attr_val_type: uint8_t
+{
+  uleb128 = 0x0,
+  asciz = 0x1,
+};
+
+enum BA_TagFeature_t: uint8_t
+{
+  Tag_Feature_BTI = 1,
+  Tag_Feature_PAC = 2,
+  Tag_Feature_GCS = 3,
+};
+
+template 
+struct aeabi_attribute
+{
+  T_tag tag;
+  T_val value;
+};
+
+template 
+aeabi_attribute
+make_aeabi_attribute (T_tag tag, T_val val)
+{
+  return aeabi_attribute{tag, val};
+}
+
+namespace details {
+
+constexpr const char*


Formatting nit, but: space before "*", even here.  Same below.



Fixed.


+to_c_str (bool b)
+{
+  return b ? "true" : "false";
+}
+
+constexpr const char*
+to_c_str (const char *s)
+{
+  return s;
+}
+
+constexpr const char*
+to_c_str (attr_val_type t)
+{
+  const char *s = nullptr;
+  switch (t) {
+case uleb128:
+  s = "ULEB128";
+  break;
+case asciz:
+  s = "asciz";
+  break;
+  }


Formatting nit: the opening brace should be on its own line, indented like so:

   switch (t)
 {
 case uleb128:
   s = "ULEB128";
   break;
 case asciz:
   s = "asciz";
   break;
 }

However...


+  return s;


...we are unfortunately limited to C++11 constexprs, so I think this needs
to be:

   return (t == uleb128 ? "ULEB128"
   : t == asciz ? "asciz"
   : nullptr);

if we want it to be treated as a constant for all builds.
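
To illustrate the C++11 restriction: a C++11 constexpr function body is essentially limited to a single return statement, so the switch has to become a chain of conditional expressions. The sketch below is standalone and reuses the enum from the patch outside its namespace, which is an assumption made for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copy of the enum from the patch, placed at global scope for this sketch.
enum attr_val_type : std::uint8_t
{
  uleb128 = 0x0,
  asciz = 0x1,
};

// Single return statement, so this is a valid C++11 constexpr function.
constexpr const char *
to_c_str (attr_val_type t)
{
  return (t == uleb128 ? "ULEB128"
	  : t == asciz ? "asciz"
	  : nullptr);
}

// These are evaluated at compile time, which is the point of the rewrite.
static_assert (to_c_str (uleb128)[0] == 'U', "uleb128 maps to ULEB128");
static_assert (to_c_str (asciz)[0] == 'a', "asciz maps to asciz");

// Run-time check of the full strings.
inline void check_to_c_str ()
{
  assert (std::strcmp (to_c_str (uleb128), "ULEB128") == 0);
  assert (std::strcmp (to_c_str (asciz), "asciz") == 0);
}
```

With C++14 the switch-based version would also be accepted as constexpr.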



Fixed in the next revision.
FYI we might have C++14 soon :) => 
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665644.html



+}
+
+constexpr const char*
+to_c_str (BA_TagFeature_t feature

Re: [PATCH v2] [testsuite] [arm] add effective target and options for pacbti tests

2024-10-16 Thread Richard Earnshaw (lists)
On 19/04/2024 17:37, Alexandre Oliva wrote:
> Hello, Richard,
> 
> Thanks, your response was very informative.
> 
> Here's a revised patch.
> 
> arm pac and bti tests that use -march=armv8.1-m.main get an implicit
> -mthumb, that is incompatible with vxworks kernel mode.  Declaring the
> requirement for a 8.1-m.main-compatible toolchain is enough to avoid
> those fails, because the toolchain feature test fails in kernel mode,
> but taking the -march options from the standardized arch tests, after
> testing for support for the corresponding effective target, makes it
> generally safer, and enables us to drop skip directives and extraneous
> option variants.
> 
> Tested all 6 modified testcases with an x86_64-linux-gnu-x-arm-eabi
> uberbaum build.  Ok to install?

Apologies for the delay replying.  Yes, this is OK.

Thanks,

R.

> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.target/arm/bti-1.c: Require arch, use its opts, drop skip.
>   * gcc.target/arm/bti-2.c: Likewise.
>   * gcc.target/arm/acle/pacbti-m-predef-11.c: Likewise.
>   * gcc.target/arm/acle/pacbti-m-predef-12.c: Likewise.
>   * gcc.target/arm/acle/pacbti-m-predef-7.c: Likewise.
>   * g++.target/arm/pac-1.C: Likewise.  Drop +mve.
> ---
>  gcc/testsuite/g++.target/arm/pac-1.C   |5 +++--
>  .../gcc.target/arm/acle/pacbti-m-predef-11.c   |4 ++--
>  .../gcc.target/arm/acle/pacbti-m-predef-12.c   |5 +++--
>  .../gcc.target/arm/acle/pacbti-m-predef-7.c|5 +++--
>  gcc/testsuite/gcc.target/arm/bti-1.c   |5 +++--
>  gcc/testsuite/gcc.target/arm/bti-2.c   |5 +++--
>  6 files changed, 17 insertions(+), 12 deletions(-)
> 
> diff --git a/gcc/testsuite/g++.target/arm/pac-1.C 
> b/gcc/testsuite/g++.target/arm/pac-1.C
> index f671a27b048c6..ac15ae18197ca 100644
> --- a/gcc/testsuite/g++.target/arm/pac-1.C
> +++ b/gcc/testsuite/g++.target/arm/pac-1.C
> @@ -1,7 +1,8 @@
>  /* Check that GCC does .save and .cfi_offset directives with RA_AUTH_CODE 
> pseudo hard-register.  */
>  /* { dg-do compile } */
> -/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" } } */
> -/* { dg-options "-march=armv8.1-m.main+mve+pacbti 
> -mbranch-protection=pac-ret -mthumb -mfloat-abi=hard -g -O0" } */
> +/* { dg-require-effective-target arm_arch_v8_1m_main_pacbti_ok } */
> +/* { dg-add-options arm_arch_v8_1m_main_pacbti } */
> +/* { dg-additional-options "-mbranch-protection=pac-ret -mfloat-abi=hard -g 
> -O0" } */
>  
>  __attribute__((noinline)) void
>  fn1 (int a, int b, int c)
> diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c 
> b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
> index 6a5ae92c567f3..c9c40f44027d4 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
> @@ -1,6 +1,6 @@
>  /* { dg-do compile } */
> -/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" "-mfloat-abi=*" } } */
> -/* { dg-options "-march=armv8.1-m.main+fp+pacbti" } */
> +/* { dg-require-effective-target arm_arch_v8_1m_main_pacbti_ok } */
> +/* { dg-add-options arm_arch_v8_1m_main_pacbti } */
>  
>  #if (__ARM_FEATURE_BTI != 1)
>  #error "Feature test macro __ARM_FEATURE_BTI_DEFAULT should be defined to 1."
> diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c 
> b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
> index db40b17c3b030..c26051347a2cc 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
> -/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" } } */
> -/* { dg-options "-march=armv8-m.main+fp -mfloat-abi=softfp" } */
> +/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
> +/* { dg-add-options arm_arch_v8_1m_main } */
> +/* { dg-additional-options "-mfloat-abi=softfp" } */
>  
>  #if defined (__ARM_FEATURE_BTI)
>  #error "Feature test macro __ARM_FEATURE_BTI should not be defined."
> diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c 
> b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
> index 1b25907635e24..92f500c1449b3 100644
> --- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
> +++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
> -/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
> "-mcpu=*" } } */
> -/* { dg-additional-options "-march=armv8.1-m.main+pacbti+fp --save-temps 
> -mfloat-abi=hard" } */
> +/* { dg-require-effective-target arm_arch_v8_1m_main_pacbti_ok } */
> +/* { dg-add-options arm_arch_v8_1m_main_pacbti } */
> +/* { dg-additional-options "--save-temps -mfloat-abi=hard" } */
>  
>  #if defined (__ARM_FEATURE_BTI_DEFAULT)
>  #error "Feature test macro __ARM_FEATURE_BTI_DEFAULT should be undefined."
> diff --git 

Re: [PATCH v1 4/4] aarch64: encapsulate note.gnu.property emission into a class

2024-10-16 Thread Matthieu Longo

On 2024-10-08 18:51, Richard Sandiford wrote:

Matthieu Longo  writes:

gcc/ChangeLog:

 * config.gcc: Add aarch64-dwarf-metadata.o to extra_objs.
 * config/aarch64/aarch64-dwarf-metadata.h (class 
section_note_gnu_property):
 Encapsulate GNU properties code into a class.
 * config/aarch64/aarch64.cc
 (GNU_PROPERTY_AARCH64_FEATURE_1_AND): Define.
 (GNU_PROPERTY_AARCH64_FEATURE_1_BTI): Likewise.
 (GNU_PROPERTY_AARCH64_FEATURE_1_PAC): Likewise.
 (GNU_PROPERTY_AARCH64_FEATURE_1_GCS): Likewise.
 (aarch64_file_end_indicate_exec_stack): Move GNU properties code to
 aarch64-dwarf-metadata.cc
 * config/aarch64/t-aarch64: Declare target aarch64-dwarf-metadata.o
 * config/aarch64/aarch64-dwarf-metadata.cc: New file.
---
  gcc/config.gcc   |   2 +-
  gcc/config/aarch64/aarch64-dwarf-metadata.cc | 120 +++
  gcc/config/aarch64/aarch64-dwarf-metadata.h  |  19 +++
  gcc/config/aarch64/aarch64.cc|  87 +-
  gcc/config/aarch64/t-aarch64 |   7 ++
  5 files changed, 153 insertions(+), 82 deletions(-)
  create mode 100644 gcc/config/aarch64/aarch64-dwarf-metadata.cc

diff --git a/gcc/config.gcc b/gcc/config.gcc
index f09ce9f63a0..b448c2a91d1 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -351,7 +351,7 @@ aarch64*-*-*)
c_target_objs="aarch64-c.o"
cxx_target_objs="aarch64-c.o"
d_target_objs="aarch64-d.o"
-   extra_objs="aarch64-builtins.o aarch-common.o aarch64-sve-builtins.o 
aarch64-sve-builtins-shapes.o aarch64-sve-builtins-base.o aarch64-sve-builtins-sve2.o 
aarch64-sve-builtins-sme.o cortex-a57-fma-steering.o aarch64-speculation.o 
falkor-tag-collision-avoidance.o aarch-bti-insert.o aarch64-cc-fusion.o 
aarch64-early-ra.o aarch64-ldp-fusion.o"
+   extra_objs="aarch64-builtins.o aarch-common.o aarch64-dwarf-metadata.o 
aarch64-sve-builtins.o aarch64-sve-builtins-shapes.o aarch64-sve-builtins-base.o 
aarch64-sve-builtins-sve2.o aarch64-sve-builtins-sme.o cortex-a57-fma-steering.o 
aarch64-speculation.o falkor-tag-collision-avoidance.o aarch-bti-insert.o 
aarch64-cc-fusion.o aarch64-early-ra.o aarch64-ldp-fusion.o"
target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.h 
\$(srcdir)/config/aarch64/aarch64-builtins.cc 
\$(srcdir)/config/aarch64/aarch64-sve-builtins.h 
\$(srcdir)/config/aarch64/aarch64-sve-builtins.cc"
target_has_targetm_common=yes
;;
diff --git a/gcc/config/aarch64/aarch64-dwarf-metadata.cc 
b/gcc/config/aarch64/aarch64-dwarf-metadata.cc
new file mode 100644
index 000..36659862b59
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-dwarf-metadata.cc
@@ -0,0 +1,120 @@


Missing copyright & licence.



Oops :/ Fixed.


+#define INCLUDE_STRING
+#define INCLUDE_ALGORITHM
+#define INCLUDE_MEMORY
+#define INCLUDE_VECTOR
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "target.h"
+#include "rtl.h"
+#include "output.h"
+
+#include "aarch64-dwarf-metadata.h"
+
[...]
diff --git a/gcc/config/aarch64/t-aarch64 b/gcc/config/aarch64/t-aarch64
index c2a0715e9ab..194e3a4ac99 100644
--- a/gcc/config/aarch64/t-aarch64
+++ b/gcc/config/aarch64/t-aarch64
@@ -139,6 +139,13 @@ aarch-common.o: $(srcdir)/config/arm/aarch-common.cc 
$(CONFIG_H) $(SYSTEM_H) \
$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
$(srcdir)/config/arm/aarch-common.cc
  
+aarch64-dwarf-metadata.o: $(srcdir)/config/aarch64/aarch64-dwarf-metadata.cc \

+$(CONFIG_H) \
+$(SYSTEM_H) \
+$(TARGET_H)


Looks like this also needs: $(RTL_H) output.h aarch64-dwarf-metadata.h


Fixed.


The comments for patch 1 apply to this refactored version too, but otherwise
it looks good.


Fixed in both patches 1/4 and 4/4 in the next revision.


Thanks,
Richard



+   $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
+ $(srcdir)/config/aarch64/aarch64-dwarf-metadata.cc
+
  aarch64-c.o: $(srcdir)/config/aarch64/aarch64-c.cc $(CONFIG_H) $(SYSTEM_H) \
  coretypes.h $(TM_H) $(TREE_H) output.h $(C_COMMON_H) $(TARGET_H)
$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \




Linaro CI new feature: skip precommit testing

2024-10-16 Thread Christophe Lyon
Hi,

Following "popular request", we are happy to announce that users can
now request to skip Linaro CI precommit testing for some patches.

The current implementation skips testing in two cases:
1- there is [RFC] or [RFC v[0-9]] in the patch subject
2- the commit message contains a line starting with 'CI-tag: skip'

[1] avoids undesirable regression notifications when one
sends an incomplete patch to start discussing ideas.

[2] aims to help workflows where people submit patches for "master
files" and "regenerated files" as two patches to make review easier.
In such cases, the patch with only "master files" changes would
generally generate regression notifications, confusing both reviewers
and patch authors.  For instance:
- patch #1/3: introduce new code -> should pass CI
- patch #2/3: changes to "master files" -> skip CI
- patch #3/3: changes to "regenerated files" -> should pass CI


This change does NOT affect postcommit CI, where CI is always expected
to pass (otherwise regression notifications will be generated).


Technically, we use 'git am --keep' when applying the patches, so
that subject lines are untouched: this enables us to handle
standard markers such as [PATCH RFC] or [RFC PATCH] for instance.

This also means that a 'CI-tag: skip' after the usual '---' lines
will be ignored (git-am will consider it as part of the patch,
rather than the commit message).

People used to git am --scissors may find it convenient to put
the tag at the start of the commit message:

CI-tag: skip
-- >8 --
=

But it's fine to put that tag along with other tags (such as
Signed-off-by, Co-authored-by, ...)
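
For readers who want to predict whether a given mail would be skipped, the two rules can be sketched roughly as below. This is not the Linaro CI code. The function names are hypothetical, and the regular expression (any bracketed subject tag containing the word RFC) is only one interpretation of the rule stated above.

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>

// Rule 1 (assumed form): a bracketed tag in the subject contains "RFC",
// e.g. [RFC], [RFC v2], [PATCH RFC] or [RFC PATCH].
inline bool subject_requests_skip (const std::string &subject)
{
  static const std::regex rfc_tag (R"(\[[^\]]*\bRFC\b[^\]]*\])");
  return std::regex_search (subject, rfc_tag);
}

// Rule 2: a line of the commit message starts with "CI-tag: skip".
// Scanning stops at the first "---" line, since what follows is treated
// as part of the patch rather than the commit message.
inline bool message_requests_skip (const std::string &mail_body)
{
  std::istringstream in (mail_body);
  std::string line;
  while (std::getline (in, line))
    {
      if (line == "---")
	break;
      if (line.compare (0, 12, "CI-tag: skip") == 0)
	return true;
    }
  return false;
}

inline bool should_skip_precommit (const std::string &subject,
				   const std::string &mail_body)
{
  return subject_requests_skip (subject) || message_requests_skip (mail_body);
}

inline void check_skip_rules ()
{
  assert (subject_requests_skip ("[RFC] reassoc: rework ranks"));
  assert (subject_requests_skip ("[PATCH RFC] reassoc: rework ranks"));
  assert (!subject_requests_skip ("[PATCH] reassoc: RFC-style idea"));
  assert (message_requests_skip ("Fix foo\n\nCI-tag: skip\n---\n 1 file\n"));
  assert (!message_requests_skip ("Fix foo\n---\nCI-tag: skip\n"));
}
```

The last check mirrors the note above that a tag placed after the '---' line is ignored.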

We hope this will be useful / helpful.

Thanks,

The Linaro Toolchain team.


Re: [Ping, Fortran, Patch, PR80235, v1] Fix ICE when coarray from module is referenced in submodule.

2024-10-16 Thread Paul Richard Thomas
Hi Andre,

The handling of submodules is something of a kludge, especially where
module procedures are concerned, that seems to work OK. Given that, your
patch looks right and is good for mainline.

Thanks for the patch.

Paul


On Wed, 16 Oct 2024 at 08:21, Andre Vehreschild  wrote:

> Hi all,
>
> PING.
>
> Re-Regtested ok again on x86_64-pc-linux-gnu. Ok for mainline?
>
> Regards,
> Andre
>
> On Wed, 25 Sep 2024 12:29:21 +0200
> Andre Vehreschild  wrote:
>
> > Hi all and esp. Paul,
> >
> > the attached patch fixes an ICE with coarrays defined in modules and
> then used
> > in submodules. Referencing the variable relied on the curr_module being
> set in
> > the gfc_build_qualified_array routine, which it was not. I therefore
> took the
> > name from the symbol. I don't know if this is correct. Paul any idea?
> You have
> > done the submodule part in the past. I am still not sure, if the coarray
> is
> > exported into the .mod file correctly, because it gets host_assoc set,
> while I
> > would expect it to have use_assoc set. So may be this patch is only
> healing
> > the symptom and not the cause. Could you have a look, please?
> >
> > Regtests ok on x86_64-pc-linux-gnu / Fedora 39. Ok for mainline?
> >
> > Regards,
> >   Andre
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de
>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de
>


Re: [PATCH] reassoc: Do not sort likely vectorizable ops by rank.

2024-10-16 Thread Robin Dapp
> Interesting - this is bleh | bswap (..), right, so having
> bla1 | (bleh | bla2) fails to recognize bla1 | bla2 as bswap.

Yes, exactly.

> I'd expect this kind of pattern to fail bswap detection easily
> if you mangle it a bit.  So possibly bswap detection should learn
> to better pick the "pieces" from a chain of IORs ...

Yeah, it also appeared brittle to me.  But I could understand that
we want a pass that sorts by rank before such an optimization to
make life easier.

> Consider bswap (..) | bswap (..) badly intertwined for example.
>
> > Or did I miss the point?
>
> No, but as we can see it's a "trade-off" ...

As always :)  So a way forward with this would be to not do the
swap before vectorization and enhance bswap so we won't regress
for those two cases?

Luckily I didn't notice other things relying on rank-order.

-- 
Regards
 Robin



Re: [PATCH] reassoc: Do not sort likely vectorizable ops by rank.

2024-10-16 Thread Richard Biener
On Wed, 16 Oct 2024, Robin Dapp wrote:

> Hi,
> 
> this is probably rather an RFC than a patch as I'm not sure whether
> reassoc is the right place to fix it.  On top, the heuristic might
> be a bit "ad-hoc".  Maybe we can also work around it in the vectorizer?
> 
> The following function is vectorized in a very inefficient way because we
> construct vectors from scalar loads.
> 
> uint64_t
> foo (uint8_t *pix, int i_stride)
> {
>   uint32_t sum = 0, sqr = 0;
>   int x, y;
>   for (y = 0; y < 16; y++)
> {
>   for (x = 0; x < 16; x++)
>   sum += pix[x];
>   pix += i_stride;
> }
>   return sum;
> }
> 
> The reason for this is that reassoc reorders the last three operands of
> the summation sequence by rank introducing a temporary in the process
> that breaks the "homogeneity" (sum_n = sum_{n - 1} + pix[n]) of the sum.
> 
> This patch adds a function likely_vectorizable_p that checks if an
> operand vector contains only operands of the same rank except the last
> one.  In that case the sequence is likely vectorizable and the easier
> vectorization will outweigh any CSE opportunity we can expose by
> swapping operands.
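
The rank condition described in the quoted paragraph above can be sketched in isolation as a predicate over a list of ranks. This is only an illustration under stated assumptions: the name same_rank_except_last is hypothetical, the minimum length of three and the requirement that the last rank actually differs are my reading of the description, and the real function also inspects the PHI node, which is omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// True if every rank except the last is identical and the last differs.
// Fewer than three operands are rejected because the reordering under
// discussion concerns the last three operands (assumption).
inline bool same_rank_except_last (const std::vector<unsigned> &ranks)
{
  if (ranks.size () < 3)
    return false;
  for (std::size_t i = 1; i + 1 < ranks.size (); ++i)
    if (ranks[i] != ranks[0])
      return false;
  return ranks.back () != ranks[0];
}

inline void check_rank_predicate ()
{
  assert (same_rank_except_last ({ 2, 2, 2, 5 }));
  assert (!same_rank_except_last ({ 2, 3, 2, 5 }));
  assert (!same_rank_except_last ({ 2, 2, 2, 2 }));
  assert (!same_rank_except_last ({ 2, 5 }));
}
```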

Hmm, reassoc isn't supposed to apply width > 1 before vectorization;
IIRC I "fixed" that at some point, but this is what I see:

  # sum_126 = PHI 
...
  _23 = *pix_134; 
  _24 = (unsigned int) _23;
  _31 = MEM[(uint8_t *)pix_134 + 1B]; 
  _32 = (unsigned int) _31;
  _118 = _24 + _32;
  sum_33 = _118 + sum_126; 
...

I think the place you add the condition to on swap_ops_for_binary_stmt
simply lacks a !reassoc_insert_powi_p check like we have on the earlier
rewrite_expr_tree_parallel.
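
For reference, the two statement shapes under discussion can be written out in plain C++. Both compute the same value because unsigned addition wraps and is associative. The difference is only in shape: the first is a chain of identical sum = sum + pix[i] steps, while the second adds the two loads through a temporary first, as in the dump above. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Homogeneous chain: every statement has the shape sum = sum + pix[i].
inline std::uint32_t sum_chain (const std::uint8_t *pix, std::uint32_t sum)
{
  sum = sum + pix[0];
  sum = sum + pix[1];
  return sum;
}

// Reassociated shape: the loads are combined in a temporary and the
// running sum is added last.
inline std::uint32_t sum_with_temporary (const std::uint8_t *pix,
					 std::uint32_t sum)
{
  std::uint32_t tmp = (std::uint32_t) pix[0] + (std::uint32_t) pix[1];
  return tmp + sum;
}

inline void check_sum_shapes ()
{
  const std::uint8_t pix[2] = { 200, 100 };
  // Include a start value that wraps around to exercise modular addition.
  assert (sum_chain (pix, 0u) == sum_with_temporary (pix, 0u));
  assert (sum_chain (pix, 0xfffffff0u) == sum_with_temporary (pix, 0xfffffff0u));
}
```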

Richard.

> Bootstrapped and regtested on x86, regtested on rv64gcv.
> 
> Regards
>  Robin
> 
> gcc/ChangeLog:
> 
>   * tree-ssa-reassoc.cc (likely_vectorizable_p): New function.
>   (reassociate_bb): Use new function.
>   (dump_ops_vector): Change prototype.
>   (debug_ops_vector): Change prototype.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/tree-ssa/reassoc-52.c: New test.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-52.c |  27 ++
>  gcc/tree-ssa-reassoc.cc| 102 -
>  2 files changed, 124 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-52.c
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-52.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-52.c
> new file mode 100644
> index 000..b117b7519bb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-52.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O3 -fdump-tree-reassoc-details" } */
> +
> +typedef unsigned char uint8_t;
> +typedef unsigned int uint32_t;
> +typedef unsigned long uint64_t;
> +
> +uint64_t
> +foo (uint8_t *pix, int i_stride)
> +{
> +  uint32_t sum = 0, sqr = 0;
> +  int x, y;
> +  for (y = 0; y < 16; y++)
> +{
> +  for (x = 0; x < 16; x++)
> + sum += pix[x];
> +  pix += i_stride;
> +}
> +  return sum;
> +}
> +
> +/* Ensure that we only add to sum variables and don't create a temporary that
> +   does something else.  In doing so we enable a much more efficient
> +   vectorization scheme.  */
> +
> +/* { dg-final { scan-tree-dump-times "\\s+sum_\\d+\\s+=\\s+sum_\\d+\\s+\\\+" 
> 16 "reassoc1" } } */
> +/* { dg-final { scan-tree-dump-times "\\s+sum_\\d+\\s=.*\\\+\\ssum_\\d+;" 1 
> "reassoc1" } } */
> diff --git a/gcc/tree-ssa-reassoc.cc b/gcc/tree-ssa-reassoc.cc
> index 556ecdebe2d..13f8a1070b6 100644
> --- a/gcc/tree-ssa-reassoc.cc
> +++ b/gcc/tree-ssa-reassoc.cc
> @@ -6992,6 +6992,96 @@ rank_ops_for_fma (vec *ops)
>  }
>return mult_num;
>  }
> +
> +/* Check if the operand vector contains operands that all have the same rank
> +   apart from the last one.  The last one must be a PHI node which links back
> +   to the first operand.
> +   This can indicate an easily vectorizable sequence in which case we don't
> +   want to reorder the first three elements for rank reasons.
> +
> +   Consider the following example:
> +
> +for (int y = 0; y < 16; y++)
> +  {
> + for (int x = 0; x < 16; x++)
> +   {
> + sum += pix[x];
> +   }
> +   ...
> +  }
> +
> +# sum_201 = PHI 
> +# y_149 = PHI 
> +
> +_33 = *pix_214;
> +_34 = (unsigned intD.4) _33;
> +sum_35 = sum_201 + _34;
> +
> +_46 = MEM[(uint8_tD.2849 *)pix_214 + 1B];
> +_47 = (unsigned intD.4) _46;
> +sum_48 = sum_35 + _47;
> +_49 = (intD.1) _46;
> +
> +All loads of pix are of the same rank, just the sum PHI has a different
> +one in {..., _46, _33, sum_201}.
> +swap_ops_for_binary_stmt will move sum_201 before _46 and _33 so that
> +the first operation _33 OP _46 is exposed as an optimization opportunity.
> +In doing so it will trip up the vectorizer which now needs to work much
> +harder because it can't recognize the whole sequence as doing the same
> +operation.  */
> +
> +static bool
> +likely_vectori

Re: [PATCH] reassoc: Do not sort likely vectorizable ops by rank.

2024-10-16 Thread Robin Dapp
> Hmm, reassoc isn't supposed to apply width > 1 before vectorization;
> IIRC I "fixed" that at some point, but this is what I see:
>
>   # sum_126 = PHI 
> ...
>   _23 = *pix_134; 
>   _24 = (unsigned int) _23;
>   _31 = MEM[(uint8_t *)pix_134 + 1B]; 
>   _32 = (unsigned int) _31;
>   _118 = _24 + _32;
>   sum_33 = _118 + sum_126; 
> ...
>
> I think the place you add the condition to on swap_ops_for_binary_stmt
> simply lacks a !reassoc_insert_powi_p check like we have on the earlier
> rewrite_expr_tree_parallel.

An earlier version of my patch had a less strict check, i.e. wouldn't allow
the swap for more sequences.  This resulted in two regressions on x86
where we expect this exact swap to happen in order for the bswap pass
to recognize a bswap idiom.

It was something like
bleh = ...
l1 = MEM... << 8
l2 = MEM... >> 8
bla1 = l1 & ...
bla2 = l2 & ...
res1 = bla1 | bleh
res2 = bla2 | res1

reassoc would reorder res1, bla1, and bla2 so that bswap sees bla1 | bla2
and recognizes this as bswap.  When not allowing the swap before
vectorization this will fail.

Or did I miss the point?
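
A source-level sketch of the kind of sequence meant here, assuming a 16-bit value and concrete masks that are not in the original mail: both groupings of the ORs give the same result, but only the second one contains bla1 | bla2 as a single subexpression, which is the byte-swapped value. Whether GCC's bswap pass recognizes exactly this code is not claimed.

```cpp
#include <cassert>
#include <cstdint>

// Grouping as in the sequence above: bleh is ORed in between the two
// halves of the swap.
inline std::uint32_t or_chain_interleaved (std::uint16_t v, std::uint32_t bleh)
{
  std::uint32_t l1 = (std::uint32_t) v << 8;
  std::uint32_t l2 = (std::uint32_t) v >> 8;
  std::uint32_t bla1 = l1 & 0xff00u;
  std::uint32_t bla2 = l2 & 0x00ffu;
  std::uint32_t res1 = bla1 | bleh;
  std::uint32_t res2 = bla2 | res1;
  return res2;
}

// Grouping after the reorder: bla1 | bla2 is formed first and equals the
// 16-bit byte swap of v.
inline std::uint32_t or_chain_grouped (std::uint16_t v, std::uint32_t bleh)
{
  std::uint32_t bla1 = ((std::uint32_t) v << 8) & 0xff00u;
  std::uint32_t bla2 = ((std::uint32_t) v >> 8) & 0x00ffu;
  std::uint32_t swapped = bla1 | bla2;
  return swapped | bleh;
}

inline void check_or_chains ()
{
  assert (or_chain_grouped (0x1234, 0u) == 0x3412u);
  assert (or_chain_interleaved (0x1234, 0xab0000u)
	  == or_chain_grouped (0x1234, 0xab0000u));
}
```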


Re: [committed] libstdc++: Fix Python deprecation warning in printers.py

2024-10-16 Thread Tom Tromey
> "Jonathan" == Jonathan Wakely  writes:

Jonathan> Using a keyword argument for count only became possible with Python 3.1
Jonathan> so introduce a new function to do the substitution.

gdb docs are inconsistent on this, but at least one spot says that the
minimum supported Python version is 3.2.  Plus, I know we agreed that
the minimum should be 3.4, and I'll be sending a patch to this effect
soon.

Anyway, I think it would be reasonably safe to just assume 3.1 if you
want to do that.

Tom


Re: [PATCH v1 1/4] aarch64: add debug comments to feature properties in .note.gnu.property

2024-10-16 Thread Matthieu Longo

On 2024-10-08 18:39, Richard Sandiford wrote:

Sorry for the slow review.

Matthieu Longo  writes:

GNU properties are emitted to provide some information about the features
used in the generated code like PAC, BTI, or GCS. However, no debug
comments are emitted in the generated assembly even if -dA is provided.
This makes understanding the information stored in the .note.gnu.property
section more difficult than needed.

This patch adds assembly comments (if -dA is provided) next to the GNU
properties. For instance, if PAC and BTI are enabled, it will emit:
   .word  3  // GNU_PROPERTY_AARCH64_FEATURE_1_AND (BTI, PAC)

gcc/ChangeLog:

 * config/aarch64/aarch64.cc
 (aarch64_file_end_indicate_exec_stack): Emit assembly comments.

gcc/testsuite/ChangeLog:

 * gcc.target/aarch64/bti-1.c: Emit assembly comments, and update
   test assertion.
---
  gcc/config/aarch64/aarch64.cc| 41 +++-
  gcc/testsuite/gcc.target/aarch64/bti-1.c |  5 +--
  2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 4b2e7a690c6..6d9075011ec 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -98,6 +98,8 @@
  #include "ipa-fnsummary.h"
  #include "hash-map.h"
  
+#include 

+


If we do keep this, it needs to be done via a new INCLUDE_NUMERIC
in system.h, which aarch64.cc would then define before including
any files.  This avoids clashes between GCC and system headers
on some hosts.  But see below.


I removed it as I removed std::accumulate.




  /* This file should be included last.  */
  #include "target-def.h"
  
@@ -29129,8 +29131,45 @@ aarch64_file_end_indicate_exec_stack ()

 data   = feature_1_and.  */
assemble_integer (GEN_INT (GNU_PROPERTY_AARCH64_FEATURE_1_AND), 4, 32, 
1);
assemble_integer (GEN_INT (4), 4, 32, 1);
-  assemble_integer (GEN_INT (feature_1_and), 4, 32, 1);
  
+  if (!flag_debug_asm)

+   assemble_integer (GEN_INT (feature_1_and), 4, 32, 1);
+  else
+   {
+ asm_fprintf (asm_out_file, "\t.word\t%u", feature_1_and);


It's probably better to use integer_asm_op (4, 1) rather than
hard-coding .word.


Done.


+
+ auto join_s = [] (std::string s1,
+   const std::string &s2,
+   const std::string &separator = ", ") -> std::string
+ {
+   return std::move (s1)
+ .append (separator)
+ .append (s2);
+ };
+
+ auto features_to_string
+   = [&join_s] (unsigned feature_1_and) -> std::string
+ {
+   std::vector feature_bits;
+   if (feature_1_and & GNU_PROPERTY_AARCH64_FEATURE_1_BTI)
+ feature_bits.push_back ("BTI");
+   if (feature_1_and & GNU_PROPERTY_AARCH64_FEATURE_1_PAC)
+ feature_bits.push_back ("PAC");
+   if (feature_1_and & GNU_PROPERTY_AARCH64_FEATURE_1_GCS)
+ feature_bits.push_back ("GCS");
+
+   if (feature_bits.empty ())
+ return {};
+   return std::accumulate(std::next(feature_bits.cbegin()),
+  feature_bits.cend(),
+  feature_bits[0],
+  join_s);
+ };
+ auto const& s_features = features_to_string (feature_1_and);


I do like this!  But I wonder whether it would be simpler to go for
the more prosaic:

  struct flag_name { unsigned int mask; const char *name; };
  static const flag_name flags[] =
  {
{ GNU_PROPERTY_AARCH64_FEATURE_1_BTI, "BTI" },
{ GNU_PROPERTY_AARCH64_FEATURE_1_PAC, "PAC" }
  };

  const char *separator = "";
  std::string s_features;
  for (auto &flag : flags)
if (feature_1_and & flag.mask)
  {
s_features.append (separator).append (flag.name);
separator = ", ";
  }

It's slightly shorter, but also means that there's a bit less
cut-&-paste for each flag.  (In principle, the table could even
be generated from the same source as the definitions of the
GNU_PROPERTY_*s, but that's probably overkill.)
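
A standalone version of the table-driven approach, for trying it outside of GCC, might look like this. The FEATURE_1_* constants are local stand-ins for the GNU_PROPERTY_AARCH64_FEATURE_1_* macros. The BTI and PAC values are consistent with the ".word 3 ... (BTI, PAC)" example in the commit message, while the GCS value and the third table entry are assumptions added for illustration.

```cpp
#include <cassert>
#include <string>

// Local stand-ins for the GNU_PROPERTY_AARCH64_FEATURE_1_* macros.
constexpr unsigned FEATURE_1_BTI = 1u << 0;
constexpr unsigned FEATURE_1_PAC = 1u << 1;
constexpr unsigned FEATURE_1_GCS = 1u << 2;	// assumed value

struct flag_name { unsigned int mask; const char *name; };

// Build a comma-separated list of the feature names whose bit is set.
inline std::string features_to_string (unsigned feature_1_and)
{
  static const flag_name flags[] =
  {
    { FEATURE_1_BTI, "BTI" },
    { FEATURE_1_PAC, "PAC" },
    { FEATURE_1_GCS, "GCS" }
  };

  // The separator starts out empty and becomes ", " after the first hit,
  // so no leading or trailing separator is produced.
  const char *separator = "";
  std::string s_features;
  for (auto &flag : flags)
    if (feature_1_and & flag.mask)
      {
	s_features.append (separator).append (flag.name);
	separator = ", ";
      }
  return s_features;
}

inline void check_features_to_string ()
{
  assert (features_to_string (0) == "");
  assert (features_to_string (FEATURE_1_BTI | FEATURE_1_PAC) == "BTI, PAC");
}
```

Adding a feature then only requires one more row in the table.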


Replaced by your proposition.


+ asm_fprintf (asm_out_file,
+   "\t%s GNU_PROPERTY_AARCH64_FEATURE_1_AND (%s)\n",
+   ASM_COMMENT_START, s_features.c_str ());
+   }


Formatting:

  asm_fprintf (asm_out_file,
   "\t%s GNU_PROPERTY_AARCH64_FEATURE_1_AND (%s)\n",
   ASM_COMMENT_START, s_features.c_str ());

Thanks,
Richard


/* Pad the size of the note to the required alignment.  */
assemble_align (POINTER_SIZE);
  }
diff --git a/gcc/testsuite/gcc.target/aarch64/bti-1.c 
b/gcc/testsuite/gcc.target/aarch64/bti-1.c
index 5a556b08ed1..e48017abc35 100644
--- a/gcc/testsuite/gcc.target/aarch64/bti-1.c
+++ b/gcc/testsuite/gcc.target/aa

Re: [committed] libstdc++: Fix Python deprecation warning in printers.py

2024-10-16 Thread Jonathan Wakely
On Wed, 16 Oct 2024 at 16:58, Tom Tromey  wrote:
>
> > "Jonathan" == Jonathan Wakely  writes:
>
> Jonathan> Using a keyword argument for count only became possible with Python 3.1
> Jonathan> so introduce a new function to do the substitution.
>
> gdb docs are inconsistent on this, but at least one spot says that the
> minimum supported Python version is 3.2.  Plus, I know we agreed that
> the minimum should be 3.4, and I'll be sending a patch to this effect
> soon.
>
> Anyway, I think it would be reasonably safe to just assume 3.1 if you
> want to do that.

We still have a few places in the printers.py file that check for
Python 3 vs Python 2, so I did so here too.

For the Python 3 case, I'm assuming 3.1 or later, because I can
imagine some people might be stuck on Python 2 for some reason (and
also unable to build GDB with Python 3? ... maybe unlikely) but nobody
should be stuck on 3.0 and unable to replace that with 3.12 or so.

So the condition is Python 2.x, or Python 3.1+, and Python 3.0 is unsupported.

If we no longer care about Python 2 there are a few places in that
file we could clean up.



Re: [PATCH v1 2/4] aarch64: add minimal support for GCS build attributes

2024-10-16 Thread Matthieu Longo

On 2024-10-08 18:42, Richard Sandiford wrote:

Since this is an RFC, it would probably be more helpful to get review
comments about the design or the adherence to the spec.  I'll need to
look into things a bit more for that, though, so I'm afraid the below
is more implementation trivia.

Matthieu Longo  writes:

[...]
diff --git a/gcc/config.in b/gcc/config.in
index 7fcabbe5061..1309ba2b133 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -379,6 +379,12 @@
  #endif
  
  
+/* Define if your assembler supports GCS build attributes. */

+#ifndef USED_FOR_TARGET
+#undef HAVE_AS_BUILD_ATTRIBUTES_GCS
+#endif


How about making this HAVE_AS_BUILD_AEABI_ATTRIBUTE?  Or is the idea
that each feature that uses build attributes would need its own
configure check?


The binutils interface is agnostic of the tag value. binutils only
checks whether the values in one object match those in the other
object, that's all.

Checking for the support of each feature in Gas does not make sense.
As you recommended, I defined and used HAVE_AS_BUILD_AEABI_ATTRIBUTE in 
the next revision.



[...]
@@ -24697,6 +24724,20 @@ aarch64_start_file (void)
 asm_fprintf (asm_out_file, "\t.arch %s\n",
aarch64_last_printed_arch_string.c_str ());
  
+/* Emit gcs build attributes only when building a native AArch64-hosted

+   compiler.  */
+#if defined(__aarch64__)


Why is this restricted to native compilers?  We should avoid that
if at all possible.


This check comes from the initial revision 
(https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662825.html).
I am not sure why this restriction was there; it does not make sense to 
me either.

I removed it in the next revision.


+  /* Check the current assembly supports gcs build attributes, if not


"Check whether the current assembler supports..."


Fixed.


+ fallback to .note.gnu.property section.  */
+  #ifdef HAVE_AS_BUILD_ATTRIBUTES_GCS


It would be good to add:

#ifndef HAVE_AS_BUILD_ATTRIBUTES_GCS
#define HAVE_AS_BUILD_ATTRIBUTES_GCS 0
#endif

somewhere, so that we can use HAVE_AS_BUILD_ATTRIBUTES_GCS in C++
conditions as well as preprocessor conditions.


Done. I added it at the top of aarch64.cc in the next revision.
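
A minimal sketch of why the fallback definition helps, assuming configure
leaves the macro undefined when the assembler lacks support.  Once the macro
defaults to 0 it can appear in an ordinary C or C++ condition, not only in
#ifdef.  The helper name use_gcs_build_attributes_p is hypothetical and is
not part of the patch.

```c
#include <stdbool.h>

/* configure either defines this macro to 1 or leaves it undefined.
   Defaulting it to 0 makes it usable in expressions as well as in
   preprocessor conditions.  */
#ifndef HAVE_AS_BUILD_ATTRIBUTES_GCS
#define HAVE_AS_BUILD_ATTRIBUTES_GCS 0
#endif

/* Hypothetical helper (not in the patch): return true if GCS build
   attributes should be emitted instead of falling back to the
   .note.gnu.property section.  */
static bool
use_gcs_build_attributes_p (bool gcs_enabled)
{
  return HAVE_AS_BUILD_ATTRIBUTES_GCS && gcs_enabled;
}
```

With the macro at 0 the condition is a compile-time constant, so the compiler
can drop the dead branch while still type-checking it.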


+if (aarch64_gcs_enabled ())
+  {
+   aarch64_emit_aeabi_subsection (".aeabi-feature-and-bits", 1, 0);
+   aarch64_emit_aeabi_attribute ("Tag_Feature_GCS", 3, 1);
+  }
+  #endif
+#endif
+
 default_file_start ();
  }
[...]
diff --git 
a/gcc/testsuite/gcc.target/aarch64/build-attributes/build-attributes.exp 
b/gcc/testsuite/gcc.target/aarch64/build-attributes/build-attributes.exp
new file mode 100644
index 000..ea47e209227
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/build-attributes/build-attributes.exp
@@ -0,0 +1,46 @@
+# Copyright (C) 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't an AArch64 target.
+if ![istarget aarch64*-*-*] then {
+  return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# Initialize `dg'.
+dg-init
+
+proc check_effective_target_gas_has_build_attributes { } {
+return [check_no_compiler_messages gas_has_build_attributes object {
+   /* Assembly */
+   .set ATTR_TYPE_uleb128,   0
+   .set ATTR_TYPE_asciz, 1
+   .set Tag_Feature_GCS, 3
+   .aeabi_subsection .aeabi-feature-and-bits, 1, ATTR_TYPE_uleb128
+   .aeabi_attribute Tag_Feature_GCS, 1
+}]
+}


This should go in lib/target-supports.exp instead, probably as
"aarch64_gas_has...".


I renamed the function to 
check_effective_target_aarch64_gas_has_build_attributes to make it clear 
that this is for AArch64 targets, and moved the code into 
lib/target-supports.exp.



+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \
+   "" ""
+
+# All done.
+dg-finish
diff --git 
a/gcc/testsuite/gcc.target/aarch64/build-attributes/no-build-attribute-bti.c 
b/gcc/testsuite/gcc.target/aarch64/build-attributes/no-build-attribute-bti.c
new file mode 100644
index 000..a8712d949f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/build-attributes/no-build-attribute-bti.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { aarch64*-*-linux* && { ! 
gas_has_build_attributes } } } } */
+/* { dg-options "-mbranch-protection=bti -dA" } *

[PATCH] c: Add some checking asserts to named loops handling code

2024-10-16 Thread Jakub Jelinek
Hi!

Jonathan mentioned that an unnamed static analyzer reported an issue in
c_finish_bc_name.
It is actually a false positive, because the construction of the
loop_names vector guarantees that the last element of the vector
(if the vector is non-empty) always has the C_DECL_LOOP_NAME (l) or
C_DECL_SWITCH_NAME (l) flag (or both) set, so c will always be
non-NULL after the if at the start of the loops.
The following patch is an attempt to help those static analyzers
(though I don't know if it actually helps) by adding a checking assert.
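
A simplified model of the invariant described above, as a sketch only: the
struct and function names are hypothetical stand-ins for the loop_names
vector and its walk, and plain assert stands in for gcc_checking_assert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a label in the loop_names vector.  */
struct label
{
  /* Models C_DECL_LOOP_NAME (l) || C_DECL_SWITCH_NAME (l).  */
  bool names_loop_or_switch;
};

/* Walk LABELS from the last element to the first and store in OUT[i]
   the closest flagged label at an index >= i.  The assert documents
   the invariant that the last element is always flagged, so C is set
   on the first iteration and is never NULL afterwards.  */
static void
map_labels (const struct label *labels, size_t n, const struct label **out)
{
  const struct label *c = NULL;
  for (size_t i = n; i-- > 0;)
    {
      if (labels[i].names_loop_or_switch)
        c = &labels[i];
      assert (c != NULL);
      out[i] = c;
    }
}
```

If a caller ever broke the invariant by leaving the last element unflagged,
the assert would fire on the first iteration instead of letting a NULL
pointer propagate.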

Tested on x86_64-linux, ok for trunk?

2024-10-16  Jakub Jelinek  

* c-decl.cc (c_get_loop_names): Add checking assert that
c is non-NULL in the loop.
(c_finish_bc_name): Likewise.

--- gcc/c/c-decl.cc.jj  2024-10-16 10:06:06.378589144 +0200
+++ gcc/c/c-decl.cc 2024-10-16 14:30:36.253187498 +0200
@@ -13881,6 +13881,7 @@ c_get_loop_names (tree before_labels, bo
{
  if (C_DECL_LOOP_NAME (l) || C_DECL_SWITCH_NAME (l))
c = l;
+ gcc_checking_assert (c);
  loop_names_hash->put (l, c);
  if (i == first)
break;
@@ -13952,6 +13953,7 @@ c_finish_bc_name (location_t loc, tree n
  {
if (C_DECL_LOOP_NAME (l) || C_DECL_SWITCH_NAME (l))
  c = l;
+   gcc_checking_assert (c);
if (l == lab)
  {
label = c;
@@ -13970,6 +13972,7 @@ c_finish_bc_name (location_t loc, tree n
{
  if (C_DECL_LOOP_NAME (l) || C_DECL_SWITCH_NAME (l))
c = l;
+ gcc_checking_assert (c);
  if (is_break || C_DECL_LOOP_NAME (c))
candidates.safe_push (IDENTIFIER_POINTER (DECL_NAME (l)));
}

Jakub



Re: [PATCH v3 2/5] openmp: Add support for iterators in map clauses (C/C++)

2024-10-16 Thread Jakub Jelinek
On Fri, Oct 11, 2024 at 04:59:27PM +0200, Jakub Jelinek wrote:
> E.g. it would be IMHO fine if the gimplification is done in a similar way
> how we do OMP_CLAUSE_REDUCTION_{INIT,MERGE} gimplification into
> &OMP_CLAUSE_REDUCTION_GIMPLE_{INIT,MERGE}, instead of gimplifying the
> map clause expression into a sequence pushed before the target construct
> gimplify it into a gimple_seq operand of the clause, if needed with some
> placeholder in it (I'd assume placeholder would be in this case the
> iterator).  Then at omp lowering time one can just emit a loop and
> in the body of the loop just copy the gimple seq from the clause, with the
> placeholder set to the loop iterator.

To expand more on why this is essential, gimplification performs an
important part of the OpenMP handling, among others discovery of implicit
OpenMP clauses.  So, if one bypasses gimplification of some expression,
that part isn't performed, so it would need to be duplicated later.
Consider:
int bar (int, int);
void baz (int, int *);
#pragma omp declare target enter (baz)

void
foo (int x, int *p)
{
  #pragma omp parallel
  #pragma omp master
  #pragma omp target map (to, iterator (i = 0 : 4) : p[bar (x, i)])
  baz (x, p);
}

If p[bar (x, i)] isn't gimplified during gimplification, then nothing will
add the needed implicit firstprivate (x) clause on the parallel and it will
ICE.

Jakub



[PATCH] RISC-V:Auto vect for vector bf16

2024-10-16 Thread Feng Wang
This patch adds auto-vect patterns for the vector-bfloat16 extension.
As with the other vector extensions, these patterns let the auto-vectorizer
use vector BF16 instructions when vectorizing for loops.
gcc/ChangeLog:

* config/riscv/vector-bfloat16.md (extend2):
Add auto-vect pattern for vector-bfloat16.
(trunc2): Ditto.
(*widen_bf16_fma): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vfncvt-auto-vect.c: New test.
* gcc.target/riscv/rvv/autovec/vfwcvt-auto-vect.c: New test.
* gcc.target/riscv/rvv/autovec/vfwmacc-auto-vect.c: New test.

Signed-off-by: Feng Wang 
---
 gcc/config/riscv/vector-bfloat16.md   | 144 --
 .../riscv/rvv/autovec/vfncvt-auto-vect.c  |  19 +++
 .../riscv/rvv/autovec/vfwcvt-auto-vect.c  |  19 +++
 .../riscv/rvv/autovec/vfwmacc-auto-vect.c |  14 ++
 4 files changed, 182 insertions(+), 14 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vfncvt-auto-vect.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vfwcvt-auto-vect.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vfwmacc-auto-vect.c

diff --git a/gcc/config/riscv/vector-bfloat16.md 
b/gcc/config/riscv/vector-bfloat16.md
index 562aa8ee5ed..e6482a83356 100644
--- a/gcc/config/riscv/vector-bfloat16.md
+++ b/gcc/config/riscv/vector-bfloat16.md
@@ -25,8 +25,24 @@
   (RVVMF2SF "TARGET_VECTOR_ELEN_BF_16 && TARGET_VECTOR_ELEN_FP_32 && 
TARGET_MIN_VLEN > 32")
 ])
 
-(define_mode_attr V_FP32TOBF16_TRUNC [
+(define_mode_iterator VSF [
+  (RVVM8SF "TARGET_VECTOR_ELEN_FP_32") (RVVM4SF "TARGET_VECTOR_ELEN_FP_32") 
(RVVM2SF "TARGET_VECTOR_ELEN_FP_32")
+  (RVVM1SF "TARGET_VECTOR_ELEN_FP_32") (RVVMF2SF "TARGET_VECTOR_ELEN_FP_32 && 
TARGET_MIN_VLEN > 32")
+])
+
+(define_mode_iterator VDF [
+  (RVVM8DF "TARGET_VECTOR_ELEN_FP_64") (RVVM4DF "TARGET_VECTOR_ELEN_FP_64")
+  (RVVM2DF "TARGET_VECTOR_ELEN_FP_64") (RVVM1DF "TARGET_VECTOR_ELEN_FP_64")
+])
+
+(define_mode_attr V_FPWIDETOBF16_TRUNC [
   (RVVM8SF "RVVM4BF") (RVVM4SF "RVVM2BF") (RVVM2SF "RVVM1BF") (RVVM1SF 
"RVVMF2BF") (RVVMF2SF "RVVMF4BF")
+  (RVVM8DF "RVVM2BF") (RVVM4DF "RVVM1BF") (RVVM2DF "RVVMF2BF") (RVVM1DF 
"RVVMF4BF")
+])
+
+(define_mode_attr v_fpwidetobf16_trunc [
+  (RVVM8SF "rvvm4bf") (RVVM4SF "rvvm2bf") (RVVM2SF "rvvm1bf") (RVVM1SF 
"rvvmf2bf") (RVVMF2SF "rvvmf4bf")
+  (RVVM8DF "rvvm2bf") (RVVM4DF "rvvm1bf") (RVVM2DF "rvvmf2bf") (RVVM1DF 
"rvvmf4bf")
 ])
 
 (define_mode_attr VF32_SUBEL [
@@ -35,8 +51,8 @@
 ;; Zvfbfmin extension
 
 (define_insn "@pred_trunc_to_bf16"
-  [(set (match_operand: 0 "register_operand"   "=vd, vd, 
vr, vr,  &vr,  &vr")
- (if_then_else:
+  [(set (match_operand: 0 "register_operand"   "=vd, vd, 
vr, vr,  &vr,  &vr")
+ (if_then_else:
(unspec:
  [(match_operand: 1 "vector_mask_operand"  " vm, 
vm,Wc1,Wc1,vmWc1,vmWc1")
   (match_operand 4 "vector_length_operand" " rK, rK, 
rK, rK,   rK,   rK")
@@ -47,13 +63,13 @@
   (reg:SI VL_REGNUM)
   (reg:SI VTYPE_REGNUM)
   (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
-   (float_truncate:
+   (float_truncate:
   (match_operand:VWEXTF_ZVFBF 3 "register_operand"  "  0,  0,  
0,  0,   vr,   vr"))
-   (match_operand: 2 "vector_merge_operand" " vu,  0, 
vu,  0,   vu,0")))]
+   (match_operand: 2 "vector_merge_operand" " vu,  
0, vu,  0,   vu,0")))]
   "TARGET_ZVFBFMIN"
   "vfncvtbf16.f.f.w\t%0,%3%p1"
   [(set_attr "type" "vfncvtbf16")
-   (set_attr "mode" "")
+   (set_attr "mode" "")
(set (attr "frm_mode")
(symbol_ref "riscv_vector::get_frm_mode (operands[8])"))])
 
@@ -69,12 +85,12 @@
  (reg:SI VL_REGNUM)
  (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
   (float_extend:VWEXTF_ZVFBF
- (match_operand: 3 "register_operand" "   vr,   
vr"))
+ (match_operand: 3 "register_operand" "   vr,   
vr"))
   (match_operand:VWEXTF_ZVFBF 2 "vector_merge_operand""   vu,
0")))]
   "TARGET_ZVFBFMIN"
   "vfwcvtbf16.f.f.v\t%0,%3%p1"
   [(set_attr "type" "vfwcvtbf16")
-   (set_attr "mode" "")])
+   (set_attr "mode" "")])
 
 
 (define_insn "@pred_widen_bf16_mul_"
@@ -93,15 +109,15 @@
   (plus:VWEXTF_ZVFBF
 (mult:VWEXTF_ZVFBF
   (float_extend:VWEXTF_ZVFBF
-(match_operand: 3 "register_operand" "   vr"))
+(match_operand: 3 "register_operand" "   
vr"))
   (float_extend:VWEXTF_ZVFBF
-(match_operand: 4 "register_operand" "   vr")))
+(match_operand: 4 "register_operand" "   
vr")))
 (match_operand:VWEXTF_ZVFBF 2 "register_operand" "0"))
   (match_dup 2)))]
   "TARGET_ZVFBFWMA"
   "vfwmaccbf16.vv\t%0,%3,%4%p1"
   [(set_attr "type" "vfwmaccbf16")
-   (set_attr "mode" "")
+   (set_attr "mode" "")
(set (attr "frm_mode")
(symbol_ref "riscv_vector::get_frm_mode (operands[9])"))])
 
@@ -121,15 +137,115 @@
   (plus:

[PATCH v16b 1/4] contrib/: Add support for Cc: and Link: tags

2024-10-16 Thread Alejandro Colomar
contrib/ChangeLog:

* gcc-changelog/git_commit.py (GitCommit):
Add support for 'Cc: ' and 'Link: ' tags.

Cc: Jason Merrill 
Signed-off-by: Alejandro Colomar 
---
 contrib/gcc-changelog/git_commit.py | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/contrib/gcc-changelog/git_commit.py 
b/contrib/gcc-changelog/git_commit.py
index 87ecb9e1a17..64fb986b74c 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -182,7 +182,8 @@ CO_AUTHORED_BY_PREFIX = 'co-authored-by: '
 
 REVIEW_PREFIXES = ('reviewed-by: ', 'reviewed-on: ', 'signed-off-by: ',
'acked-by: ', 'tested-by: ', 'reported-by: ',
-   'suggested-by: ')
+   'suggested-by: ', 'cc: ')
+LINK_PREFIXES = ('link: ')
 DATE_FORMAT = '%Y-%m-%d'
 
 
@@ -524,6 +525,8 @@ class GitCommit:
 continue
 elif lowered_line.startswith(REVIEW_PREFIXES):
 continue
+elif lowered_line.startswith(LINK_PREFIXES):
+continue
 else:
 m = cherry_pick_regex.search(line)
 if m:
-- 
2.45.2





[PATCH 1/3] PR 117048: simplify-rtx: Simplify (X << C1) [+,^] (X >> C2) into ROTATE

2024-10-16 Thread Kyrylo Tkachov
Hi all,

The motivating testcase for this is in AArch64 intrinsics:

uint64x2_t G2(uint64x2_t a, uint64x2_t b) {
uint64x2_t c = veorq_u64(a, b);
return veorq_u64(vaddq_u64(c, c), vshrq_n_u64(c, 63));
}

which I was hoping to fold to a single XAR (a ROTATE+XOR instruction) but
GCC was failing to detect the rotate operation for two reasons:
1) The combination of the two arms of the expression is done under XOR rather
than IOR that simplify-rtx currently supports.
2) The ASHIFT operation is actually a (PLUS X X) operation and thus is not
detected as the LHS of the two arms we require.

The patch fixes both issues.  The analysis of the two arms of the rotation
expression is factored out into a common helper simplify_rotate_op which is
then used in the PLUS, XOR, IOR cases in simplify_binary_operation_1.
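
A scalar sketch of the identity behind the transformation, modelling one
64-bit lane of the G2 testcase.  The function names are illustrative only.
Since c + c equals c << 1 modulo 2^64, and the bits of c << 1 and c >> 63
never overlap, combining them with PLUS, XOR or IOR gives the same result,
which is a rotate left by 1.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate X left by N bits, for 0 < N < 64.  */
static uint64_t
rotl64 (uint64_t x, unsigned n)
{
  assert (n > 0 && n < 64);
  return (x << n) | (x >> (64 - n));
}

/* One 64-bit lane of the G2 testcase: (c + c) ^ (c >> 63),
   where c = a ^ b.  */
static uint64_t
g2_lane (uint64_t a, uint64_t b)
{
  uint64_t c = a ^ b;
  return (c + c) ^ (c >> 63);
}
```

For any inputs, g2_lane (a, b) equals rotl64 (a ^ b, 1), which is the
ROTATE-of-XOR shape that XAR implements.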

The check-assembly testcase for this is added in the following patch because
it needs some extra AArch64 backend work, but I've added self-tests in this
patch to validate the transformation.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for mainline?
Thanks,
Kyrill

Signed-off-by: Kyrylo Tkachov 

PR target/117048
* simplify-rtx.cc (extract_ashift_operands_p): Define.
(simplify_rotate_op): Likewise.
(simplify_context::simplify_binary_operation_1): Use the above in
the PLUS, IOR, XOR cases.
(test_vector_rotate): Define.
(test_vector_ops): Use the above.



0001-PR-117048-simplify-rtx-Simplify-X-C1-X-C2-into-ROTAT.patch


[PATCH 2/3] PR 117048: aarch64: Add define_insn_and_split for vector ROTATE

2024-10-16 Thread Kyrylo Tkachov
Hi all,

The ultimate goal in this PR is to match the XAR pattern that is represented
as a (ROTATE (XOR X Y) VCST) from the ACLE intrinsics code in the testcase.
The first blocker for this was the missing recognition of ROTATE in
simplify-rtx, which is fixed in the previous patch.
The next problem is that once the ROTATE has been matched from the shifts
and orr/xor/plus, it will try to match it in an insn before trying to combine
the XOR into it.  But as we don't have a backend pattern for a vector ROTATE
this recog fails and combine does not try the followup XOR+ROTATE combination
which would have succeeded.

This patch solves that by introducing a sort of "scaffolding" pattern for
vector ROTATE, which allows it to be combined into the XAR.
If it fails to be combined into anything the splitter will break it back
down into the SHL+USRA sequence that it would have emitted.
By having this splitter we can special-case some rotate amounts in the future
to emit more specialised instructions e.g. from the REV* family.
This can be done if the ROTATE is not combined into something else.

This optimisation is done in the next patch in the series.

Bootstrapped and tested on aarch64-none-linux-gnu.

I'm interested in feedback on the approach.  Note that this patch needs
patch [1/3] applied before it has any effect.

Signed-off-by: Kyrylo Tkachov 

gcc/

PR target/117048
* config/aarch64/aarch64-simd.md (*aarch64_simd_rotate_imm):
New define_insn_and_split.

gcc/testsuite/

PR target/117048
* gcc.target/aarch64/simd/pr117048.c: New test.



0002-PR-117048-aarch64-Add-define_insn_and_split-for-vect.patch


[PATCH 3/3] aarch64: Optimize vector rotates into REV* instructions where possible

2024-10-16 Thread Kyrylo Tkachov
Hi all,

Some vector rotate operations can be implemented in a single instruction
rather than using the fallback SHL+USRA sequence.
In particular, when the rotate amount is half the bitwidth of the element,
we can use a REV64, REV32 or REV16 instruction.
This patch adds this transformation in the recently added splitter for vector
rotates.  I've also received requests to optimise vector rotates by any amount
that is a multiple of 8 into a TBL i.e. a vector permute, because the permute
constant can be hoisted outside of hot paths and TBL instructions have high
throughput on modern cores.  It is an interesting idea, but as it's not
strictly fewer instructions I have not implemented it here, but it's something
to consider.
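
A scalar sketch of why a rotate by half the element width can become a REV*
instruction: rotating a 64-bit value by 32 simply swaps its two 32-bit
halves, which is what REV64 does for 32-bit elements within each 64-bit
doubleword.  The function names are illustrative and this models a single
lane only.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate X left by N bits, for 0 < N < 64.  */
static uint64_t
rotl64 (uint64_t x, unsigned n)
{
  assert (n > 0 && n < 64);
  return (x << n) | (x >> (64 - n));
}

/* Swap the two 32-bit halves of X.  */
static uint64_t
swap_halves64 (uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) (x >> 32);
  return ((uint64_t) lo << 32) | hi;
}
```

The same reasoning applies to 32-bit elements rotated by 16 and 16-bit
elements rotated by 8, with the halves being 16-bit and 8-bit respectively.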

I'm also adding an expander for the rotl<mode>3 standard name.
In some cases the vector rotate is detected very early on (even before GIMPLE?)
For example when using GNU vector extensions:
uint64x2_t G1 (uint64x2_t r) {
return (r >> 32) | (r << 32);
}
This gets optimised into a r>>32 fairly early on.  Because we do not have an
expander for such vector rotates the expand pass synthesises it with RTL
operations that end up generating the SHL+USRA sequence.  It seems wasteful
to expand it to multiple RTL ops only to then try to combine them back into
a ROTATE during combine.  Better to emit a simple ROTATE-by-vector-constant
RTX to give the early RTL passes a chance to optimise it or combine it into
something.

Bootstrapped and tested on aarch64-none-linux-gnu.
As with patch [2/3] interested in feedback on the approach.

Thanks,
Kyrill

Signed-off-by: Kyrylo Tkachov 

gcc/

* config/aarch64/aarch64-protos.h (aarch64_emit_opt_vec_rotate):
Declare prototype.
* config/aarch64/aarch64.cc (aarch64_emit_opt_vec_rotate): Implement.
* config/aarch64/aarch64-simd.md (*aarch64_simd_rotate_imm):
Call the above.
(rotl3): New define_expand.

gcc/testsuite/

* gcc.target/aarch64/simd/pr117048_2.c: New test.



0003-aarch64-Optimize-vector-rotates-into-REV-instruction.patch


RE: [PATCH v1] Internal-fn: Add new IFN mask_len_strided_load/store

2024-10-16 Thread Li, Pan2
It has been quite a while since the last discussion.
I revisited this material recently and gave it a try in the RISC-V backend.

void foo (int * __restrict a, int * __restrict b, int stride, int n)
{
  for (int i = 0; i < n; i++)
    a[i*stride] = b[i*stride] + 100;
}

For VEC_SERIES_EXPR + MASK_LEN_GATHER_LOAD, expand produces RTL similar to the
dump below.  There are 8 insns after expand, which is more than try_combine
can handle (at most 4 insns), if my understanding is correct.

Thus, is there any other approach besides adding a new IFN? If we do need to
add a new IFN, can we leverage match.pd to match the
MASK_LEN_GATHER_LOAD (base, VEC_SERIES_EXPR, ...) pattern and then emit the
new IFN, as is done for the saturating ALU operations?
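
A scalar sketch of the equivalence such a match would rely on: a gather load
whose offsets come from a VEC_SERIES_EXPR <0, stride> reads exactly the
elements a strided load would.  The function names are illustrative, and the
mask, length and scale operands of the real IFNs are left out.

```c
#include <stddef.h>

/* Scalar model of VEC_SERIES_EXPR <START, STEP>:
   OUT[i] = START + i * STEP.  */
static void
vec_series (ptrdiff_t *out, ptrdiff_t start, ptrdiff_t step, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = start + (ptrdiff_t) i * step;
}

/* Scalar model of a gather load: DST[i] = BASE[OFFSETS[i]].  */
static void
gather_load (int *dst, const int *base, const ptrdiff_t *offsets, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = base[offsets[i]];
}

/* Scalar model of a strided load: DST[i] = BASE[i * STRIDE].  */
static void
strided_load (int *dst, const int *base, ptrdiff_t stride, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = base[(ptrdiff_t) i * stride];
}
```

In this model, gather_load with offsets from vec_series (0, stride) fills
dst with the same values as strided_load, so the offset vector never needs
to be materialised.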

Thanks a lot.

 316   │ ;; _58 = VEC_SERIES_EXPR <0, _57>;
 317   │
 318   │ (insn 17 16 18 (set (reg:DI 156 [ _56 ])
 319   │ (ashiftrt:DI (reg:DI 141 [ _54 ])
 320   │ (const_int 2 [0x2]))) -1
 321   │  (expr_list:REG_EQUAL (div:DI (reg:DI 141 [ _54 ])
 322   │ (const_int 4 [0x4]))
 323   │ (nil)))
 324   │
 325   │ (insn 18 17 19 (set (reg:DI 158)
 326   │ (unspec:DI [
 327   │ (const_int 32 [0x20])
 328   │ ] UNSPEC_VLMAX)) -1
 329   │  (nil))
 330   │
 331   │ (insn 19 18 20 (set (reg:RVVM1SI 157)
 332   │ (if_then_else:RVVM1SI (unspec:RVVMF32BI [
 333   │ (const_vector:RVVMF32BI repeat [
 334   │ (const_int 1 [0x1])
 335   │ ])
 336   │ (reg:DI 158)
 337   │ (const_int 2 [0x2]) repeated x2
 338   │ (const_int 1 [0x1])
 339   │ (reg:SI 66 vl)
 340   │ (reg:SI 67 vtype)
 341   │ ] UNSPEC_VPREDICATE)
 342   │ (vec_series:RVVM1SI (const_int 0 [0])
 343   │ (const_int 1 [0x1]))
 344   │ (unspec:RVVM1SI [
 345   │ (reg:DI 0 zero)
 346   │ ] UNSPEC_VUNDEF))) -1
 347   │  (nil))
 348   │
 349   │ (insn 20 19 21 (set (reg:DI 160)
 350   │ (unspec:DI [
 351   │ (const_int 32 [0x20])
 352   │ ] UNSPEC_VLMAX)) -1
 353   │  (nil))
 354   │
 355   │ (insn 21 20 22 (set (reg:RVVM1SI 159)
 356   │ (if_then_else:RVVM1SI (unspec:RVVMF32BI [
 357   │ (const_vector:RVVMF32BI repeat [
 358   │ (const_int 1 [0x1])
 359   │ ])
 360   │ (reg:DI 160)
 361   │ (const_int 2 [0x2]) repeated x2
 362   │ (const_int 1 [0x1])
 363   │ (reg:SI 66 vl)
 364   │ (reg:SI 67 vtype)
 365   │ ] UNSPEC_VPREDICATE)
 366   │ (mult:RVVM1SI (vec_duplicate:RVVM1SI (subreg:SI (reg:DI 
156 [ _56 ]) 0))
 367   │ (reg:RVVM1SI 157))
 368   │ (unspec:RVVM1SI [
 369   │ (reg:DI 0 zero)
 370   │ ] UNSPEC_VUNDEF))) -1
 371   │  (nil))
 ...
 403   │ ;; vect__5.16_61 = .MASK_LEN_GATHER_LOAD (vectp_b.14_59, _58, 4, { 0, 
... }, { -1, ... }, _73, 0);
 404   │
 405   │ (insn 27 26 28 (set (reg:RVVM2DI 161)
 406   │ (sign_extend:RVVM2DI (reg:RVVM1SI 145 [ _58 ]))) 
"strided_ld-st.c":4:22 -1
 407   │  (nil))
 408   │
 409   │ (insn 28 27 29 (set (reg:RVVM2DI 162)
 410   │ (ashift:RVVM2DI (reg:RVVM2DI 161)
 411   │ (const_int 2 [0x2]))) "strided_ld-st.c":4:22 -1
 412   │  (nil))
 413   │
 414   │ (insn 29 28 0 (set (reg:RVVM1SI 146 [ vect__5.16 ])
 415   │ (if_then_else:RVVM1SI (unspec:RVVMF32BI [
 416   │ (const_vector:RVVMF32BI repeat [
 417   │ (const_int 1 [0x1])
 418   │ ])
 419   │ (reg:DI 149 [ _73 ])
 420   │ (const_int 2 [0x2]) repeated x2
 421   │ (const_int 0 [0])
 422   │ (reg:SI 66 vl)
 423   │ (reg:SI 67 vtype)
 424   │ ] UNSPEC_VPREDICATE)
 425   │ (unspec:RVVM1SI [
 426   │ (reg/v/f:DI 151 [ b ])
 427   │ (mem:BLK (scratch) [0  A8])
 428   │ (reg:RVVM2DI 162)
 429   │ ] UNSPEC_UNORDERED)
 430   │ (unspec:RVVM1SI [
 431   │ (reg:DI 0 zero)
 432   │ ] UNSPEC_VUNDEF))) "strided_ld-st.c":4:22 -1
 433   │  (nil))

Pan


-Original Message-
From: Li, Pan2  
Sent: Wednesday, June 5, 2024 3:50 PM
To: Richard Biener ; Richard Sandiford 

Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
tamar.christ...@arm.com
Subject: RE: [PATCH v1] Internal-fn: Add new IFN mask_len_strided_load/store

Looks not easy to get the original context/history, only catch some shadow 

Re: [Ping, Fortran, Patch, PR80235, v1] Fix ICE when coarray from module is referenced in submodule.

2024-10-16 Thread Andre Vehreschild
Hi Paul,

thanks for the review. Committed as: gcc-15-4405-ge32fff675c3

Thanks again,
Andre

On Wed, 16 Oct 2024 14:41:09 +0100
Paul Richard Thomas  wrote:

> Hi Andre,
>
> The handling of submodules is something of a kludge, especially where
> module procedures are concerned, that seems to work OK. Given that, your
> patch looks right and is good for mainline.
>
> Thanks for the patch.
>
> Paul
>
>
> On Wed, 16 Oct 2024 at 08:21, Andre Vehreschild  wrote:
>
> > Hi all,
> >
> > PING.
> >
> > Re-Regtested ok again on x86_64-pc-linux-gnu. Ok for mainline?
> >
> > Regards,
> > Andre
> >
> > On Wed, 25 Sep 2024 12:29:21 +0200
> > Andre Vehreschild  wrote:
> >
> > > Hi all and esp. Paul,
> > >
> > > the attached patch fixes an ICE with coarrays defined in modules and
> > then used
> > > in submodules. Referencing the variable relied on the curr_module being
> > set in
> > > the gfc_build_qualified_array routine, which it was not. I therefore
> > took the
> > > name from the symbol. I don't know if this is correct. Paul any idea?
> > You have
> > > done the submodule part in the past. I am still not sure, if the coarray
> > is
> > > exported into the .mod file correctly, because it gets host_assoc set,
> > while I
> > > would expect it to have use_assoc set. So may be this patch is only
> > healing
> > > the symptom and not the cause. Could you have a look, please?
> > >
> > > Regtests ok on x86_64-pc-linux-gnu / Fedora 39. Ok for mainline?
> > >
> > > Regards,
> > >   Andre
> > > --
> > > Andre Vehreschild * Email: vehre ad gmx dot de
> >
> >
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de
> >


--
Andre Vehreschild * Email: vehre ad gmx dot de


Re: [PATCH] SVE intrinsics: Add fold_active_lanes_to method to refactor svmul and svdiv.

2024-10-16 Thread Jennifer Schmitz


> On 16 Oct 2024, at 21:16, Richard Sandiford  wrote:
> 
> 
> Jennifer Schmitz  writes:
>> As suggested in
>> https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663275.html,
>> this patch adds the method gimple_folder::fold_active_lanes_to (tree X).
>> This method folds active lanes to X and sets inactive lanes according to
>> the predication, returning a new gimple statement. That makes folding of
>> SVE intrinsics easier and reduces code duplication in the
>> svxxx_impl::fold implementations.
>> Using this new method, svdiv_impl::fold and svmul_impl::fold were refactored.
>> Additionally, the method was used for two optimizations:
>> 1) Fold svdiv to the dividend, if the divisor is all ones and
>> 2) for svmul, if one of the operands is all ones, fold to the other operand.
>> Both optimizations were previously applied to _x and _m predication on
>> the RTL level, but not for _z, where svdiv/svmul were still being used.
>> For both optimization, codegen was improved by this patch, for example by
>> skipping sel instructions with all-same operands and replacing sel
>> instructions by mov instructions.
>> 
>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
>> OK for mainline?
>> 
>> Signed-off-by: Jennifer Schmitz 
>> 
>> gcc/
>>  * config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
>>  Refactor using fold_active_lanes_to and fold to dividend, if the
>>  divisor is all ones.
>>  (svmul_impl::fold): Refactor using fold_active_lanes_to and fold
>>  to the other operand, if one of the operands is all ones.
>>  * config/aarch64/aarch64-sve-builtins.h: Declare
>>  gimple_folder::fold_active_lanes_to (tree).
>>  * config/aarch64/aarch64-sve-builtins.cc
>>  (gimple_folder::fold_active_lanes_to): Add new method to fold
>>  active lanes to given argument and set inactive lanes
>>  according to the predication.
>> 
>> gcc/testsuite/
>>  * gcc.target/aarch64/sve/acle/asm/div_s32.c: Adjust expected outcome.
>>  * gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise.
>>  * gcc.target/aarch64/sve/fold_div_zero.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/asm/mul_s16.c: New test.
>>  * gcc.target/aarch64/sve/acle/asm/mul_s32.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/asm/mul_s64.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/asm/mul_s8.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/asm/mul_u16.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/asm/mul_u32.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/asm/mul_u64.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/asm/mul_u8.c: Likewise.
>>  * gcc.target/aarch64/sve/mul_const_run.c: Likewise.
> 
> Thanks, this looks great.  Just one comment on the tests:
> 
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/div_s32.c 
>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/div_s32.c
>> index d5a23bf0726..521f8bb4758 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/div_s32.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/div_s32.c
>> @@ -57,7 +57,6 @@ TEST_UNIFORM_ZX (div_w0_s32_m_untied, svint32_t, int32_t,
>> 
>> /*
>> ** div_1_s32_m_tied1:
>> -**   sel z0\.s, p0, z0\.s, z0\.s
>> **   ret
>> */
>> TEST_UNIFORM_Z (div_1_s32_m_tied1, svint32_t,
>> @@ -66,7 +65,7 @@ TEST_UNIFORM_Z (div_1_s32_m_tied1, svint32_t,
>> 
>> /*
>> ** div_1_s32_m_untied:
>> -**   sel z0\.s, p0, z1\.s, z1\.s
>> +**   mov z0\.d, z1\.d
>> **   ret
>> */
>> TEST_UNIFORM_Z (div_1_s32_m_untied, svint32_t,
>> @@ -217,9 +216,8 @@ TEST_UNIFORM_ZX (div_w0_s32_z_untied, svint32_t, int32_t,
>> 
>> /*
>> ** div_1_s32_z_tied1:
>> -**   mov (z[0-9]+\.s), #1
>> -**   movprfx z0\.s, p0/z, z0\.s
>> -**   sdiv z0\.s, p0/m, z0\.s, \1
>> +**   mov (z[0-9]+)\.b, #0
>> +**   sel z0\.s, p0, z0\.s, \1\.s
>> **   ret
>> */
>> TEST_UNIFORM_Z (div_1_s32_z_tied1, svint32_t,
> 
> Tamar will soon push a patch to change how we generate zeros.
> Part of that will involve rewriting existing patterns to be more
> forgiving about the exact instruction that is used to zero a register.
> 
> The new preferred way of matching zeros is:
> 
> **  movi?   [vdz]([0-9]+)\.(?:[0-9]*[bhsd])?, #?0
> 
> (yeah, it's a bit of a mouthful).  Could you change all the tests
> to use that?  The regexp only captures the register number, so uses
> of \1 etc. will need to become z\1.
> 
> OK with that change.  But would you mind waiting until Tamar pushes
> his patch ("AArch64: use movi d0, #0 to clear SVE registers instead
> of mov z0.d, #0"), just to make sure that the tests work with that?
> 
Thanks for the review. Sure, I can make the changes, wait for Tamar’s patch, 
and re-validate after rebasing.
One question about the regexp pattern:
The “\.” is outside the second c

Re: [PATCH 2/7] libstdc++: Make __normal_iterator constexpr, always_inline, nodiscard

2024-10-16 Thread Jonathan Wakely
On Thu, 17 Oct 2024, 03:33 Patrick Palka,  wrote:

> On Tue, 15 Oct 2024, Jonathan Wakely wrote:
>
> > Tested x86_64-linux.
> >
> > -- >8 --
> >
> > The __gnu_cxx::__normal_iterator type we use for std::vector::iterator
> > is not specified by the standard, it's an implementation detail. This
> > means it's not constrained by the rule that forbids strengthening
> > constexpr. We can make it meet the constexpr iterator requirements for
> > older standards, not only when it's required to be for C++20.
> >
> > For the non-const member functions they can't be constexpr in C++11, so
> > use _GLIBCXX14_CONSTEXPR for those. For all constructors, const members
> > and non-member operator overloads, use _GLIBCXX_CONSTEXPR or just
> > constexpr.
> >
> > We can also liberally add [[nodiscard]] and [[gnu::always_inline]]
> > attributes to those functions.
> >
> > Also change some internal helpers for std::move_iterator which can be
> > unconditionally constexpr and marked nodiscard.
> >
> > libstdc++-v3/ChangeLog:
> >
> >   * include/bits/stl_iterator.h (__normal_iterator): Make all
> >   members and overloaded operators constexpr before C++20.
> >   (__niter_base, __niter_wrap, __to_address): Add nodiscard
> >   and always_inline attributes.
> >   (__make_move_if_noexcept_iterator, __miter_base): Add nodiscard
> >   and make unconditionally constexpr.
> > ---
> >  libstdc++-v3/include/bits/stl_iterator.h | 125 ++-
> >  1 file changed, 76 insertions(+), 49 deletions(-)
> >
> > diff --git a/libstdc++-v3/include/bits/stl_iterator.h
> b/libstdc++-v3/include/bits/stl_iterator.h
> > index 85b9861..3cc10a160bd 100644
> > --- a/libstdc++-v3/include/bits/stl_iterator.h
> > +++ b/libstdc++-v3/include/bits/stl_iterator.h
> > @@ -656,7 +656,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >
> >template
> >  _GLIBCXX20_CONSTEXPR
> > -auto
> > +inline auto
> >  __niter_base(reverse_iterator<_Iterator> __it)
> >  -> decltype(__make_reverse_iterator(__niter_base(__it.base())))
> >  { return __make_reverse_iterator(__niter_base(__it.base())); }
> > @@ -668,7 +668,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >
> >template
> >  _GLIBCXX20_CONSTEXPR
> > -auto
> > +inline auto
>
> These 'inline' additions aren't mentioned in the ChangeLog it seems.


Good point.

Is
> the intent of these inlines solely as a compiler hint?
>

Yes, but they do a little more work than the really trivial ones, and are
less important, less frequently used, so I didn't think always_inline was
justified.


> >  __miter_base(reverse_iterator<_Iterator> __it)
> >  -> decltype(__make_reverse_iterator(__miter_base(__it.base())))
> >  { return __make_reverse_iterator(__miter_base(__it.base())); }
> > @@ -1060,23 +1060,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >using iterator_concept = std::__detail::__iter_concept<_Iterator>;
> >  #endif
> >
> > -  _GLIBCXX_CONSTEXPR __normal_iterator() _GLIBCXX_NOEXCEPT
> > -  : _M_current(_Iterator()) { }
> > +  __attribute__((__always_inline__))
> > +  _GLIBCXX_CONSTEXPR
> > +  __normal_iterator() _GLIBCXX_NOEXCEPT
> > +  : _M_current() { }
> >
> > -  explicit _GLIBCXX20_CONSTEXPR
> > +  __attribute__((__always_inline__))
> > +  explicit _GLIBCXX_CONSTEXPR
> >__normal_iterator(const _Iterator& __i) _GLIBCXX_NOEXCEPT
> >: _M_current(__i) { }
> >
> >// Allow iterator to const_iterator conversion
> >  #if __cplusplus >= 201103L
> >template>
> > - _GLIBCXX20_CONSTEXPR
> > + [[__gnu__::__always_inline__]]
> > + constexpr
> >   __normal_iterator(const __normal_iterator<_Iter, _Container>& __i)
> >   noexcept
> >  #else
> >// N.B. _Container::pointer is not actually in container requirements,
> >// but is present in std::vector and std::basic_string.
> >template
> > + __attribute__((__always_inline__))
> >  __normal_iterator(const __normal_iterator<_Iter,
> > typename __enable_if<
> >  (std::__are_same<_Iter, typename _Container::pointer>::__value),
> > @@ -1085,17 +1090,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >  : _M_current(__i.base()) { }
> >
> >// Forward iterator requirements
> > -  _GLIBCXX20_CONSTEXPR
> > +
> > +  __attribute__((__always_inline__)) _GLIBCXX_NODISCARD
> > +  _GLIBCXX_CONSTEXPR
> >reference
> >operator*() const _GLIBCXX_NOEXCEPT
> >{ return *_M_current; }
> >
> > -  _GLIBCXX20_CONSTEXPR
> > +  __attribute__((__always_inline__)) _GLIBCXX_NODISCARD
> > +  _GLIBCXX_CONSTEXPR
> >pointer
> >operator->() const _GLIBCXX_NOEXCEPT
> >{ return _M_current; }
> >
> > -  _GLIBCXX20_CONSTEXPR
> > +  __attribute__((__always_inline__))
> > +  _GLIBCXX14_CONSTEXPR
> >__normal_iterator&
> >operator++() _GLIBCXX_NOEXCEPT
> >{
> > @@ -1103,13 +1112,16 @@ _

Re: SVE intrinsics: Fold constant operands for svlsl.

2024-10-16 Thread Soumya AR
Hi Richard,

Thanks for the feedback. I’ve updated the patch with the suggested change.
Ok for mainline?

Best,
Soumya
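
For readers following the thread, the out-of-range behaviour requested in the quoted review below can be sketched as a scalar model of a single 8-bit lane. This is only an illustration: fold_lsl_lane is a hypothetical helper, not the actual change to aarch64_const_binop, and the lane width of 8 is an assumption chosen to match the "shift by 7 on an 8-bit integer" case.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one 8-bit lane; not a GCC function.
// A shift by the lane width or more yields 0 rather than being undefined.
constexpr std::uint8_t
fold_lsl_lane(std::uint8_t value, std::uint64_t shift)
{
  return shift >= 8 ? std::uint8_t(0)
                    : static_cast<std::uint8_t>(value << shift);
}
```

In-range shifts simply discard the bits that leave the lane, so the largest valid shift (7 here) still produces a non-zero result for an odd input.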

> On 14 Oct 2024, at 6:40 PM, Richard Sandiford wrote:
>
> Soumya AR  writes:
>> This patch implements constant folding for svlsl. Test cases have been added
>> to check for the following cases:
>>
>> Zero, merge, and don't care predication.
>> Shift by 0.
>> Shift by register width.
>> Overflow shift on signed and unsigned integers.
>> Shift on a negative integer.
>> Maximum possible shift, eg. shift by 7 on an 8-bit integer.
>>
>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
>> OK for mainline?
>>
>> Signed-off-by: Soumya AR 
>>
>> gcc/ChangeLog:
>>
>>  * config/aarch64/aarch64-sve-builtins-base.cc (svlsl_impl::fold):
>>  Try constant folding.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.target/aarch64/sve/const_fold_lsl_1.c: New test.
>>
>> From 0cf5223e51623dcdbc47a06cbd17d927c74094e2 Mon Sep 17 00:00:00 2001
>> From: Soumya AR 
>> Date: Tue, 24 Sep 2024 09:09:32 +0530
>> Subject: [PATCH] SVE intrinsics: Fold constant operands for svlsl.
>>
>> This patch implements constant folding for svlsl. Test cases have been added
>> to check for the following cases:
>>
>> Zero, merge, and don't care predication.
>> Shift by 0.
>> Shift by register width.
>> Overflow shift on signed and unsigned integers.
>> Shift on a negative integer.
>> Maximum possible shift, eg. shift by 7 on an 8-bit integer.
>>
>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
>> OK for mainline?
>>
>> Signed-off-by: Soumya AR 
>>
>> gcc/ChangeLog:
>>
>>  * config/aarch64/aarch64-sve-builtins-base.cc (svlsl_impl::fold):
>>  Try constant folding.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.target/aarch64/sve/const_fold_lsl_1.c: New test.
>> ---
>> .../aarch64/aarch64-sve-builtins-base.cc  |  15 +-
>> .../gcc.target/aarch64/sve/const_fold_lsl_1.c | 133 ++
>> 2 files changed, 147 insertions(+), 1 deletion(-)
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/const_fold_lsl_1.c
>>
>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> index afce52a7e8d..be5d6eae525 100644
>> --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> @@ -1893,6 +1893,19 @@ public:
>>   }
>> };
>>
>> +class svlsl_impl : public rtx_code_function
>> +{
>> +public:
>> +  CONSTEXPR svlsl_impl ()
>> +: rtx_code_function (ASHIFT, ASHIFT) {}
>> +
>> +  gimple *
>> +  fold (gimple_folder &f) const override
>> +  {
>> +return f.fold_const_binary (LSHIFT_EXPR);
>> +  }
>> +};
>> +
>
> Sorry for the slow review.  I think we should also make aarch64_const_binop
> return 0 for LSHIFT_EXPR when the shift is out of range, to match the
> behaviour of the underlying instruction.
>
> It looks good otherwise.
>
> Thanks,
> Richard
>
>> class svmad_impl : public function_base
>> {
>> public:
>> @@ -3199,7 +3212,7 @@ FUNCTION (svldnf1uh, svldxf1_extend_impl, (TYPE_SUFFIX_u16, UNSPEC_LDNF1))
>> FUNCTION (svldnf1uw, svldxf1_extend_impl, (TYPE_SUFFIX_u32, UNSPEC_LDNF1))
>> FUNCTION (svldnt1, svldnt1_impl,)
>> FUNCTION (svlen, svlen_impl,)
>> -FUNCTION (svlsl, rtx_code_function, (ASHIFT, ASHIFT))
>> +FUNCTION (svlsl, svlsl_impl,)
>> FUNCTION (svlsl_wide, shift_wide, (ASHIFT, UNSPEC_ASHIFT_WIDE))
>> FUNCTION (svlsr, rtx_code_function, (LSHIFTRT, LSHIFTRT))
>> FUNCTION (svlsr_wide, shift_wide, (LSHIFTRT, UNSPEC_LSHIFTRT_WIDE))
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/const_fold_lsl_1.c b/gcc/testsuite/gcc.target/aarch64/sve/const_fold_lsl_1.c
>> new file mode 100644
>> index 000..4299dbd850e
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/const_fold_lsl_1.c
>> @@ -0,0 +1,133 @@
>> +/* { dg-final { check-function-bodies "**" "" } } */
>> +/* { dg-options "-O2" } */
>> +
>> +#include "arm_sve.h"
>> +
>> +/*
>> +** s64_x:
>> +**   mov z[0-9]+\.d, #20
>> +**   ret
>> +*/
>> +svint64_t s64_x (svbool_t pg) {
>> +return svlsl_n_s64_x (pg, svdup_s64 (5), 2);
>> +}
>> +
>> +/*
>> +** s64_x_vect:
>> +**   mov z[0-9]+\.d, #20
>> +**   ret
>> +*/
>> +svint64_t s64_x_vect (svbool_t pg) {
>> +return svlsl_s64_x (pg, svdup_s64 (5), svdup_u64 (2));
>> +}
>> +
>> +/*
>> +** s64_z:
>> +**   mov z[0-9]+\.d, p[0-7]/z, #20
>> +**   ret
>> +*/
>> +svint64_t s64_z (svbool_t pg) {
>> +return svlsl_n_s64_z (pg, svdup_s64 (5), 2);
>> +}
>> +
>> +/*
>> +** s64_z_vect:
>> +**   mov z[0-9]+\.d, p[0-7]/z, #20
>> +**   ret
>> +*/
>> +svint64_t s64_z_vect (svbool_t pg) {
>> +return svlsl_s64_z (pg, svdup_s64 (5), svdup_u64 (2));
>> +}
>> +
>> +/*
>> +** s64_m_ptrue:
>> +**   mov z[0-9]+\.d, #20
>> +**   ret
>> +*/
>> +svint64_t s64_m_ptrue () {
>> +return svlsl_n_s64_m (svptrue_b64 (), svdup_s64 (5), 2);
>> +

Re: [PATCH v4] libstdc++: implement concatenation of strings and string_views

2024-10-16 Thread François Dumont
As a side note, you should provide your patches as .txt files so that any 
email client can render them without going through an editor.


And regarding the patch, I wonder what the std::move on the returned 
value is for?


Like this one:

+    {
+  return std::move(__lhs.append(__rhs));
+    }

As it's a C&P, the question might not be for you, Giuseppe.

François
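
For context on the question above, here is a minimal sketch of what the std::move changes. It assumes the quoted overload takes its left operand by rvalue reference; concat is a hypothetical helper, not the libstdc++ operator. Since append() returns an lvalue reference, a plain `return lhs.append(rhs);` would copy-construct the result, whereas wrapping it in std::move selects the move constructor.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper mirroring the shape of the quoted overload.
// append() returns std::string&, an lvalue, so without std::move the
// return statement would copy-construct the returned string.
inline std::string
concat(std::string&& lhs, const char* rhs)
{ return std::move(lhs.append(rhs)); }
```

The returned value is the same either way; the difference is only whether the buffer of the left operand is reused or duplicated.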


On 13/10/2024 19:59, Giuseppe D'Angelo wrote:

Hello,

On 09/10/2024 22:39, Patrick Palka wrote:

+#if __glibcxx_string_view >= 202403L
+  // const string & + string_view
+  template
+    [[nodiscard]]
+    constexpr inline basic_string<_CharT, _Traits, _Alloc>


Redundant 'inline's


+    operator+(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
+   type_identity_t> __rhs)
+    {
+  typedef basic_string<_CharT, _Traits, _Alloc> _Str;


These typedefs might as well be usings instead

Besides that LGTM!


Thank you for the review, updated patch attached to fix both of these.
(Just for the record, these had been C&P from the corresponding 
operator+ overloads that deal with const char *.)


Thanks,


[PATCH] [AVX512] Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw"

2024-10-16 Thread liuhongt
r12-6103-g1a7ce8570997eb combines vpcmpuw + zero_extend to vpcmpuw
with the pre_reload splitter, but the splitter transforms the
zero_extend into a subreg, which makes reload think the upper part is
garbage. That is not correct.

The patch changes the zero_extend define_insn_and_split to a
define_insn so that the zero_extend is kept.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push to trunk.

gcc/ChangeLog:

PR target/117159
* config/i386/sse.md
(*_cmp3_zero_extend):
Change from define_insn_and_split to define_insn.
(*_cmp3_zero_extend):
Ditto.
(*_ucmp3_zero_extend):
Ditto.
(*_ucmp3_zero_extend):
Ditto.
(*_cmp3_zero_extend_2):
Split to the zero_extend pattern.
(*_cmp3_zero_extend_2):
Ditto.
(*_ucmp3_zero_extend_2):
Ditto.
(*_ucmp3_zero_extend_2):
Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr117159.c: New test.
* gcc.target/i386/avx512bw-pr103750-1.c: Remove xfail.
* gcc.target/i386/avx512bw-pr103750-2.c: Remove xfail.
---
 gcc/config/i386/sse.md| 186 +++---
 .../gcc.target/i386/avx512bw-pr103750-1.c |   3 +-
 .../gcc.target/i386/avx512bw-pr103750-2.c |   3 +-
 gcc/testsuite/gcc.target/i386/pr117159.c  |  42 
 4 files changed, 113 insertions(+), 121 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr117159.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index a45b50ad732..06c2c9d7a5e 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -4298,32 +4298,19 @@ (define_insn 
"_cmp3"
 
 ;; Since vpcmpd implicitly clear the upper bits of dest, transform
 ;; vpcmpd + zero_extend to vpcmpd since the instruction
-(define_insn_and_split 
"*_cmp3_zero_extend"
-  [(set (match_operand:SWI248x 0 "register_operand")
+(define_insn "*_cmp3_zero_extend"
+  [(set (match_operand:SWI248x 0 "register_operand" "=k")
(zero_extend:SWI248x
  (unspec:
-   [(match_operand:V48H_AVX512VL 1 "nonimmediate_operand")
-(match_operand:V48H_AVX512VL 2 "nonimmediate_operand")
-(match_operand:SI 3 "const_0_to_7_operand")]
+   [(match_operand:V48H_AVX512VL 1 "nonimmediate_operand" "v")
+(match_operand:V48H_AVX512VL 2 "nonimmediate_operand" "vm")
+(match_operand:SI 3 "const_0_to_7_operand" "n")]
UNSPEC_PCMP)))]
   "TARGET_AVX512F
&& (!VALID_MASK_AVX512BW_MODE (mode) || TARGET_AVX512BW)
-   && ix86_pre_reload_split ()
&& (GET_MODE_NUNITS (mode)
   < GET_MODE_PRECISION (mode))"
-  "#"
-  "&& 1"
-  [(set (match_dup 0)
-   (unspec:
- [(match_dup 1)
-  (match_dup 2)
-  (match_dup 3)]
- UNSPEC_PCMP))]
-{
-  operands[1] = force_reg (mode, operands[1]);
-  operands[0] = lowpart_subreg (mode,
-operands[0], mode);
-}
+  "vcmp\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "ssecmp")
(set_attr "length_immediate" "1")
(set_attr "prefix" "evex")
@@ -4351,21 +4338,19 @@ (define_insn_and_split 
"*_cmp3_zero_extend
- [(match_dup 1)
-  (match_dup 2)
-  (match_dup 3)]
- UNSPEC_PCMP))
-   (set (match_dup 4) (match_dup 0))]
+(zero_extend:SWI248x
+ (unspec:
+   [(match_dup 1)
+(match_dup 2)
+(match_dup 3)]
+   UNSPEC_PCMP)))
+   (set (match_dup 4) (match_dup 5))]
 {
-  operands[1] = force_reg (mode, operands[1]);
-  operands[0] = lowpart_subreg (mode,
+  operands[5] = lowpart_subreg (mode,
operands[0], mode);
-}
-  [(set_attr "type" "ssecmp")
-   (set_attr "length_immediate" "1")
-   (set_attr "prefix" "evex")
-   (set_attr "mode" "")])
+  SUBREG_PROMOTED_VAR_P (operands[5]) = 1;
+  SUBREG_PROMOTED_SET (operands[5], 1);
+})
 
 (define_insn_and_split "*_cmp3"
   [(set (match_operand: 0 "register_operand")
@@ -4400,31 +4385,18 @@ (define_insn 
"_cmp3"
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
-(define_insn_and_split 
"*_cmp3_zero_extend"
-  [(set (match_operand:SWI248x 0 "register_operand")
+(define_insn "*_cmp3_zero_extend"
+  [(set (match_operand:SWI248x 0 "register_operand" "=k")
(zero_extend:SWI248x
  (unspec:
-   [(match_operand:VI12_AVX512VL 1 "nonimmediate_operand")
-(match_operand:VI12_AVX512VL 2 "nonimmediate_operand")
-(match_operand:SI 3 "const_0_to_7_operand")]
+   [(match_operand:VI12_AVX512VL 1 "nonimmediate_operand" "v")
+(match_operand:VI12_AVX512VL 2 "nonimmediate_operand" "vm")
+(match_operand:SI 3 "const_0_to_7_operand" "n")]
UNSPEC_PCMP)))]
   "TARGET_AVX512BW
-  && ix86_pre_reload_split ()
-  && (GET_MODE_NUNITS (mode)
-  < GET_MODE_PRECISION (mode))"
-  "#"
-  "&& 1"
-  [(set (match_dup 0)
-   (unspec:
- [(match_dup 1)
-  (match_dup 2)
-  (ma

[PATCH 2/2] Only do switch bit test clustering when multiple labels point to same bb

2024-10-16 Thread Andi Kleen
From: Andi Kleen 

The bit cluster code generation strategy is only beneficial when
multiple case labels point to the same code. Do a quick check whether
that is the case before trying to cluster.

This fixes the switch part of PR117091, where all case labels are unique.
However, it doesn't address the performance problems for non-unique
cases.
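
As an illustration of why shared targets matter, here is a minimal sketch (both function names are hypothetical, and the second function is a hand-written equivalent rather than GCC's actual output). When several labels lead to the same block, one mask test can stand in for the whole group; when every label has its own target, there is nothing for a mask to share.

```cpp
#include <cassert>

// Several case labels share one target block, so a single mask test can
// replace the chain of comparisons.
inline bool
is_vowel_switch(char c)
{
  switch (c)
    {
    case 'a': case 'e': case 'i': case 'o': case 'u':
      return true;
    default:
      return false;
    }
}

// Hand-written equivalent of a bit test: one range check plus one mask
// lookup. Bit N of the mask stands for the character 'a' + N.
inline bool
is_vowel_bittest(char c)
{
  const unsigned mask = (1u << ('a' - 'a')) | (1u << ('e' - 'a'))
                        | (1u << ('i' - 'a')) | (1u << ('o' - 'a'))
                        | (1u << ('u' - 'a'));
  unsigned idx = static_cast<unsigned>(c) - static_cast<unsigned>('a');
  return idx < 26 && ((mask >> idx) & 1u);
}
```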

gcc/ChangeLog:

PR middle-end/117091
* gimple-if-to-switch.cc (if_chain::is_beneficial): Update
find_bit_test call.
* tree-switch-conversion.cc (bit_test_cluster::find_bit_tests):
Get max_c argument and bail out early if all case labels are
unique.
(switch_decision_tree::compute_cases_per_edge): Record number of
targets per label and return.
(switch_decision_tree::analyze_switch_statement): ... pass to
find_bit_tests.
* tree-switch-conversion.h: Update prototypes.
---
 gcc/gimple-if-to-switch.cc|  2 +-
 gcc/tree-switch-conversion.cc | 23 ---
 gcc/tree-switch-conversion.h  |  5 +++--
 3 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc
index 96ce1c380a59..4151d1bb520e 100644
--- a/gcc/gimple-if-to-switch.cc
+++ b/gcc/gimple-if-to-switch.cc
@@ -254,7 +254,7 @@ if_chain::is_beneficial ()
   else
 output.release ();
 
-  output = bit_test_cluster::find_bit_tests (filtered_clusters);
+  output = bit_test_cluster::find_bit_tests (filtered_clusters, 2);
   r = output.length () < filtered_clusters.length ();
   if (r)
 dump_clusters (&output, "BT can be built");
diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc
index 00426d46..bb7b8cf215a3 100644
--- a/gcc/tree-switch-conversion.cc
+++ b/gcc/tree-switch-conversion.cc
@@ -1772,12 +1772,13 @@ jump_table_cluster::is_beneficial (const vec 
&,
 }
 
 /* Find bit tests of given CLUSTERS, where all members of the vector
-   are of type simple_cluster.  New clusters are returned.  */
+   are of type simple_cluster. max_c is the max number of cases per label.
+   New clusters are returned.  */
 
 vec
-bit_test_cluster::find_bit_tests (vec &clusters)
+bit_test_cluster::find_bit_tests (vec &clusters, int max_c)
 {
-  if (!is_enabled ())
+  if (!is_enabled () || max_c == 1)
 return clusters.copy ();
 
   unsigned l = clusters.length ();
@@ -2206,18 +2207,26 @@ bit_test_cluster::hoist_edge_and_branch_if_true 
(gimple_stmt_iterator *gsip,
 }
 
 /* Compute the number of case labels that correspond to each outgoing edge of
-   switch statement.  Record this information in the aux field of the edge.  */
+   switch statement.  Record this information in the aux field of the edge.
+   Return the approx max number of cases per edge.  */
 
-void
+int
 switch_decision_tree::compute_cases_per_edge ()
 {
+  int max_c = 0;
   reset_out_edges_aux (m_switch);
   int ncases = gimple_switch_num_labels (m_switch);
   for (int i = ncases - 1; i >= 1; --i)
 {
   edge case_edge = gimple_switch_edge (cfun, m_switch, i);
   case_edge->aux = (void *) ((intptr_t) (case_edge->aux) + 1);
+  /* For a range case add one extra. That's enough for the bit
+cluster heuristic.  */
+  if ((intptr_t)case_edge->aux > max_c)
+   max_c = (intptr_t)case_edge->aux +
+   !!CASE_HIGH (gimple_switch_label (m_switch, i));
 }
+  return max_c;
 }
 
 /* Analyze switch statement and return true when the statement is expanded
@@ -2235,7 +2244,7 @@ switch_decision_tree::analyze_switch_statement ()
   m_case_bbs.reserve (l);
   m_case_bbs.quick_push (default_bb);
 
-  compute_cases_per_edge ();
+  int max_c = compute_cases_per_edge ();
 
   for (unsigned i = 1; i < l; i++)
 {
@@ -2256,7 +2265,7 @@ switch_decision_tree::analyze_switch_statement ()
   reset_out_edges_aux (m_switch);
 
   /* Find bit-test clusters.  */
-  vec output = bit_test_cluster::find_bit_tests (clusters);
+  vec output = bit_test_cluster::find_bit_tests (clusters, max_c);
 
   /* Find jump table clusters.  */
   vec output2;
diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h
index fbfd7ff7b3ff..15f919f24f9f 100644
--- a/gcc/tree-switch-conversion.h
+++ b/gcc/tree-switch-conversion.h
@@ -399,7 +399,7 @@ public:
 
   /* Find bit tests of given CLUSTERS, where all members of the vector
  are of type simple_cluster.  New clusters are returned.  */
-  static vec find_bit_tests (vec &clusters);
+  static vec find_bit_tests (vec &clusters, int max_c);
 
   /* Return true when RANGE of case values with UNIQ labels
  can build a bit test.  */
@@ -577,8 +577,9 @@ public:
   bool try_switch_expansion (vec &clusters);
   /* Compute the number of case labels that correspond to each outgoing edge of
  switch statement.  Record this information in the aux field of the edge.
+ Returns max number of cases per edge.
  */
-  void compute_cases_per_edge ();
+  int compute_cases_per_edge ();
 
   /* Before switch transformation, recor

[PATCH 1/2] Disable -fbit-tests and -fjump-tables at -O0

2024-10-16 Thread Andi Kleen
From: Andi Kleen 

gcc/ChangeLog:

* common.opt: Enable -fbit-tests and -fjump-tables only at -O1.
* tree-switch-conversion.h (jump_table_cluster::is_enabled):
  Ditto.
---
 gcc/common.opt   | 4 ++--
 gcc/tree-switch-conversion.h | 5 +++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 12b25ff486de..4af7a94fea42 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2189,11 +2189,11 @@ Common Var(flag_ivopts) Init(1) Optimization
 Optimize induction variables on trees.
 
 fjump-tables
-Common Var(flag_jump_tables) Init(1) Optimization
+Common Var(flag_jump_tables) Init(-1) Optimization
 Use jump tables for sufficiently large switch statements.
 
 fbit-tests
-Common Var(flag_bit_tests) Init(1) Optimization
+Common Var(flag_bit_tests) Init(-1) Optimization
 Use bit tests for sufficiently large switch statements.
 
 fkeep-inline-functions
diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h
index 6468995eb316..fbfd7ff7b3ff 100644
--- a/gcc/tree-switch-conversion.h
+++ b/gcc/tree-switch-conversion.h
@@ -442,7 +442,7 @@ public:
   /* Return whether bit test expansion is allowed.  */
   static inline bool is_enabled (void)
   {
-return flag_bit_tests;
+return flag_bit_tests >= 0 ? flag_bit_tests : (optimize >= 1);
   }
 
   /* True when the jump table handles an entire switch statement.  */
@@ -524,7 +524,8 @@ bool jump_table_cluster::is_enabled (void)
  over-ruled us, we really have no choice.  */
   if (!targetm.have_casesi () && !targetm.have_tablejump ())
 return false;
-  if (!flag_jump_tables)
+  int flag = flag_jump_tables >= 0 ? flag_jump_tables : (optimize >= 1);
+  if (!flag)
 return false;
 #ifndef ASM_OUTPUT_ADDR_DIFF_ELT
   if (flag_pic)
-- 
2.46.2
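
The tri-state idiom used in the diff above can be sketched in isolation. resolve_flag is a hypothetical helper, not a GCC function; it assumes, as the patch does, that -1 means the option was not given on the command line and that 0 and 1 are explicit user choices.

```cpp
#include <cassert>

// Hypothetical stand-in for the is_enabled() logic in the patch:
// -1 means "not set by the user", 0 and 1 are explicit choices.
constexpr bool
resolve_flag(int flag_value, int optimize_level)
{
  return flag_value >= 0 ? flag_value != 0 : optimize_level >= 1;
}
```

An explicit -fbit-tests or -fno-bit-tests therefore wins at any optimization level, and only the unset case falls back to the optimization level.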



Re: [PATCH v1] contrib/: Configure git-format-patch(1) to add To: gcc-patches@gcc.gnu.org

2024-10-16 Thread Alejandro Colomar
On Wed, Oct 16, 2024 at 03:41:00PM GMT, Eric Gallager wrote:
> On Wed, Oct 16, 2024 at 8:55 AM Alejandro Colomar  wrote:
> >
> > Just like we already do for git-send-email(1).  In some cases, patches
> > are prepared with git-format-patch(1), but are sent with a different
> > program, or some flags to git-send-email(1) may accidentally inhibit the
> > configuration.  By adding the TO in the email file, we make sure that
> > gcc-patches@ will receive the patch.
> >
> > contrib/ChangeLog:
> >
> > * gcc-git-customization.sh: Configure git-format-patch(1) to add
> > 'To: gcc-patches@gcc.gnu.org'.
> >
> > Signed-off-by: Alejandro Colomar 
> > ---
> >  contrib/gcc-git-customization.sh | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/contrib/gcc-git-customization.sh 
> > b/contrib/gcc-git-customization.sh
> > index 54bd35ea1aa..1cd1b4472f0 100755
> > --- a/contrib/gcc-git-customization.sh
> > +++ b/contrib/gcc-git-customization.sh
> > @@ -43,6 +43,7 @@ git config diff.md.xfuncname '^\(define.*$'
> >
> >  # Tell git send-email where patches go.
> 
> I'd update this comment, too, while you're at it.

Ok, will do.  I was wondering if it was worth touching the line.  :)

Have a lovely night!
Alex

> 
> >  # ??? Maybe also set sendemail.tocmd to guess from MAINTAINERS?
> > +git config format.to 'gcc-patches@gcc.gnu.org'
> >  git config sendemail.to 'gcc-patches@gcc.gnu.org'
> >
> >  set_user=$(git config --get "user.name")
> > --
> > 2.45.2
> >



Re: [PATCH][LRA][PR116550] Reuse scratch registers generated by LRA

2024-10-16 Thread Vladimir Makarov



On 10/10/24 14:32, Denis Chertykov wrote:


The patch is very simple.
On x86_64, it bootstraps+regtests fine.
Ok for trunk?

Sorry for the delay in answering. I missed your patch, and pinging it 
was the right thing to do.


Thanks for the detailed explanation of the problem, which makes it easy 
for me to approve your patch.


I don't expect the patch to create problems for other targets, but 
predicting the behavior of LRA patches can be very tricky.  So please 
keep an eye out for possible issues on the other targets for a couple 
of days.


The patch is ok to commit to the trunk.  Thank you for the patch, Denis.




PR target/116550
gcc/
* lra-constraints.cc (get_reload_reg): Reuse scratch registers
generated by LRA.


diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index fdcc07764a2..1f63113f321 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -680,7 +680,8 @@ get_reload_reg (enum op_type type, machine_mode mode, rtx original,
  used by reload instructions.  */
   if (REG_P (original)
   && (int) REGNO (original) >= new_regno_start
-  && INSN_UID (curr_insn) >= new_insn_uid_start
+  && (INSN_UID (curr_insn) >= new_insn_uid_start
+  || ira_former_scratch_p (REGNO (original)))
   && in_class_p (original, rclass, &new_class, true))
 {
   unsigned int regno = REGNO (original);





Re: [PATCH] c, libcpp, v2: Partially implement C2Y N3353 paper [PR117028]

2024-10-16 Thread Joseph Myers
On Wed, 16 Oct 2024, Jakub Jelinek wrote:

> On Tue, Oct 15, 2024 at 05:40:58PM +, Joseph Myers wrote:
> > > --- gcc/testsuite/gcc.dg/cpp/c23-delimited-escape-seq-1.c.jj  
> > > 2024-10-14 17:58:54.436815339 +0200
> > > +++ gcc/testsuite/gcc.dg/cpp/c23-delimited-escape-seq-1.c 2024-10-14 
> > > 17:59:05.032666716 +0200
> > > @@ -0,0 +1,87 @@
> > > +/* P2290R3 - Delimited escape sequences */
> > 
> > I don't think the comments on this and other C tests should reference a 
> > C++ paper.
> 
> Ok, changed.
> 
> > I think there should also be tests using digit separators with the 0o / 0O 
> > prefixes (both valid cases, and testing the error for having the digit 
> > separator immediately after 0o / 0O).
> 
> Done.  Also added test for 0b1'01 because we only had test for invalid 0b'0
> and 0B'1.
> 
> Tested on x86_64-linux and i686-linux, ok for trunk?

This version is OK.

-- 
Joseph S. Myers
josmy...@redhat.com



[PATCH][v5] RISC-V: add option -m(no-)autovec-segment

2024-10-16 Thread Edwin Lu
From: Greg McGary 

Add option -m(no-)autovec-segment to allow or prevent the autovectorizer
from emitting vector segment load/store instructions. This is useful for
performance experiments.
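
As a rough illustration of the kind of loop affected, here is a minimal sketch with hypothetical names (point and deinterleave are not taken from the new tests). Reading both fields of an interleaved struct is an access pattern that the vectorizer may implement with a segment load (for example vlseg2e32.v); whether it actually does so depends on the target options and the cost model.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical interleaved layout: x and y alternate in memory.
struct point { int x; int y; };

// Splitting the fields into separate arrays is a candidate for a
// two-field segment load when segment instructions are allowed.
inline void
deinterleave(const point* in, int* xs, int* ys, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    {
      xs[i] = in[i].x;
      ys[i] = in[i].y;
    }
}
```

With -mno-autovec-segment the same loop would have to be vectorized some other way, which is what makes the option useful for comparing the two strategies.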

gcc/ChangeLog:
* config/riscv/autovec.md (vec_mask_len_load_lanes,
vec_mask_len_store_lanes):
  Predicate with TARGET_VECTOR_AUTOVEC_SEGMENT
* gcc/config/riscv/riscv-opts.h (TARGET_VECTOR_AUTOVEC_SEGMENT): New macro.
* gcc/config/riscv/riscv.opt (-m(no-)autovec-segment): New option.
* testsuite/gcc.target/riscv/rvv/autovec/struct/*_noseg*.c,
testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: New tests.
---
Relying on CI for testing. Please wait for testing to complete before
committing.

v5 changelog:
Remove vsetivli scan tests, as they may be flaky: the number of
vsetivli instructions found can change depending on the configuration.
---
 gcc/config/riscv/autovec.md   |  4 +-
 gcc/config/riscv/riscv-opts.h |  5 ++
 gcc/config/riscv/riscv.opt|  4 ++
 .../gcc.target/riscv/rvv/autovec/no-segment.c | 61 +++
 .../autovec/struct/mask_struct_load_noseg-1.c |  6 ++
 .../autovec/struct/mask_struct_load_noseg-2.c |  6 ++
 .../autovec/struct/mask_struct_load_noseg-3.c |  6 ++
 .../autovec/struct/mask_struct_load_noseg-4.c |  6 ++
 .../autovec/struct/mask_struct_load_noseg-5.c |  6 ++
 .../autovec/struct/mask_struct_load_noseg-6.c |  6 ++
 .../autovec/struct/mask_struct_load_noseg-7.c |  6 ++
 .../struct/mask_struct_load_noseg_run-1.c |  4 ++
 .../struct/mask_struct_load_noseg_run-2.c |  4 ++
 .../struct/mask_struct_load_noseg_run-3.c |  4 ++
 .../struct/mask_struct_load_noseg_run-4.c |  4 ++
 .../struct/mask_struct_load_noseg_run-5.c |  4 ++
 .../struct/mask_struct_load_noseg_run-6.c |  4 ++
 .../struct/mask_struct_load_noseg_run-7.c |  4 ++
 .../struct/mask_struct_store_noseg-1.c|  6 ++
 .../struct/mask_struct_store_noseg-2.c|  6 ++
 .../struct/mask_struct_store_noseg-3.c|  6 ++
 .../struct/mask_struct_store_noseg-4.c|  6 ++
 .../struct/mask_struct_store_noseg-5.c|  6 ++
 .../struct/mask_struct_store_noseg-6.c|  6 ++
 .../struct/mask_struct_store_noseg-7.c|  6 ++
 .../struct/mask_struct_store_noseg_run-1.c|  4 ++
 .../struct/mask_struct_store_noseg_run-2.c|  4 ++
 .../struct/mask_struct_store_noseg_run-3.c|  4 ++
 .../struct/mask_struct_store_noseg_run-4.c|  4 ++
 .../struct/mask_struct_store_noseg_run-5.c|  4 ++
 .../struct/mask_struct_store_noseg_run-6.c|  4 ++
 .../struct/mask_struct_store_noseg_run-7.c|  4 ++
 .../rvv/autovec/struct/struct_vect_noseg-1.c  |  7 +++
 .../rvv/autovec/struct/struct_vect_noseg-10.c |  6 ++
 .../rvv/autovec/struct/struct_vect_noseg-11.c |  6 ++
 .../rvv/autovec/struct/struct_vect_noseg-12.c |  6 ++
 .../rvv/autovec/struct/struct_vect_noseg-13.c |  6 ++
 .../rvv/autovec/struct/struct_vect_noseg-14.c |  6 ++
 .../rvv/autovec/struct/struct_vect_noseg-15.c |  6 ++
 .../rvv/autovec/struct/struct_vect_noseg-16.c |  6 ++
 .../rvv/autovec/struct/struct_vect_noseg-17.c |  6 ++
 .../rvv/autovec/struct/struct_vect_noseg-18.c |  6 ++
 .../rvv/autovec/struct/struct_vect_noseg-2.c  |  7 +++
 .../rvv/autovec/struct/struct_vect_noseg-3.c  |  7 +++
 .../rvv/autovec/struct/struct_vect_noseg-4.c  |  7 +++
 .../rvv/autovec/struct/struct_vect_noseg-5.c  |  7 +++
 .../rvv/autovec/struct/struct_vect_noseg-6.c  |  6 ++
 .../rvv/autovec/struct/struct_vect_noseg-7.c  |  6 ++
 .../rvv/autovec/struct/struct_vect_noseg-8.c  |  6 ++
 .../rvv/autovec/struct/struct_vect_noseg-9.c  |  6 ++
 .../autovec/struct/struct_vect_noseg_run-1.c  |  4 ++
 .../autovec/struct/struct_vect_noseg_run-10.c |  4 ++
 .../autovec/struct/struct_vect_noseg_run-11.c |  4 ++
 .../autovec/struct/struct_vect_noseg_run-12.c |  4 ++
 .../autovec/struct/struct_vect_noseg_run-13.c |  4 ++
 .../autovec/struct/struct_vect_noseg_run-14.c |  4 ++
 .../autovec/struct/struct_vect_noseg_run-15.c |  4 ++
 .../autovec/struct/struct_vect_noseg_run-16.c |  4 ++
 .../autovec/struct/struct_vect_noseg_run-17.c |  4 ++
 .../autovec/struct/struct_vect_noseg_run-18.c |  4 ++
 .../autovec/struct/struct_vect_noseg_run-2.c  |  4 ++
 .../autovec/struct/struct_vect_noseg_run-3.c  |  4 ++
 .../autovec/struct/struct_vect_noseg_run-4.c  |  4 ++
 .../autovec/struct/struct_vect_noseg_run-5.c  |  4 ++
 .../autovec/struct/struct_vect_noseg_run-6.c  |  4 ++
 .../autovec/struct/struct_vect_noseg_run-7.c  |  4 ++
 .../autovec/struct/struct_vect_noseg_run-8.c  |  4 ++
 .../autovec/struct/struct_vect_noseg_run-9.c  |  4 ++
 68 files changed, 397 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/no-segment.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-2.c
 create mode 100644 
gcc/testsuite/gcc

Re: [PATCH 2/7] libstdc++: Make __normal_iterator constexpr, always_inline, nodiscard

2024-10-16 Thread Patrick Palka
On Tue, 15 Oct 2024, Jonathan Wakely wrote:

> Tested x86_64-linux.
> 
> -- >8 --
> 
> The __gnu_cxx::__normal_iterator type we use for std::vector::iterator
> is not specified by the standard, it's an implementation detail. This
> means it's not constrained by the rule that forbids strengthening
> constexpr. We can make it meet the constexpr iterator requirements for
> older standards, not only when it's required to be for C++20.
> 
> For the non-const member functions they can't be constexpr in C++11, so
> use _GLIBCXX14_CONSTEXPR for those. For all constructors, const members
> and non-member operator overloads, use _GLIBCXX_CONSTEXPR or just
> constexpr.
> 
> We can also liberally add [[nodiscard]] and [[gnu::always_inline]]
> attributes to those functions.
> 
> Also change some internal helpers for std::move_iterator which can be
> unconditionally constexpr and marked nodiscard.
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/bits/stl_iterator.h (__normal_iterator): Make all
>   members and overloaded operators constexpr before C++20.
>   (__niter_base, __niter_wrap, __to_address): Add nodiscard
>   and always_inline attributes.
>   (__make_move_if_noexcept_iterator, __miter_base): Add nodiscard
>   and make unconditionally constexpr.
> ---
>  libstdc++-v3/include/bits/stl_iterator.h | 125 ++-
>  1 file changed, 76 insertions(+), 49 deletions(-)
> 
> diff --git a/libstdc++-v3/include/bits/stl_iterator.h b/libstdc++-v3/include/bits/stl_iterator.h
> index 85b9861..3cc10a160bd 100644
> --- a/libstdc++-v3/include/bits/stl_iterator.h
> +++ b/libstdc++-v3/include/bits/stl_iterator.h
> @@ -656,7 +656,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  
>template<typename _Iterator>
>  _GLIBCXX20_CONSTEXPR
> -auto
> +inline auto
>  __niter_base(reverse_iterator<_Iterator> __it)
>  -> decltype(__make_reverse_iterator(__niter_base(__it.base())))
>  { return __make_reverse_iterator(__niter_base(__it.base())); }
> @@ -668,7 +668,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  
>template<typename _Iterator>
>  _GLIBCXX20_CONSTEXPR
> -auto
> +inline auto

These 'inline' additions aren't mentioned in the ChangeLog it seems.  Is
the intent of these inlines solely as a compiler hint?

>  __miter_base(reverse_iterator<_Iterator> __it)
>  -> decltype(__make_reverse_iterator(__miter_base(__it.base())))
>  { return __make_reverse_iterator(__miter_base(__it.base())); }
> @@ -1060,23 +1060,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>using iterator_concept = std::__detail::__iter_concept<_Iterator>;
>  #endif
>  
> -  _GLIBCXX_CONSTEXPR __normal_iterator() _GLIBCXX_NOEXCEPT
> -  : _M_current(_Iterator()) { }
> +  __attribute__((__always_inline__))
> +  _GLIBCXX_CONSTEXPR
> +  __normal_iterator() _GLIBCXX_NOEXCEPT
> +  : _M_current() { }
>  
> -  explicit _GLIBCXX20_CONSTEXPR
> +  __attribute__((__always_inline__))
> +  explicit _GLIBCXX_CONSTEXPR
>__normal_iterator(const _Iterator& __i) _GLIBCXX_NOEXCEPT
>: _M_current(__i) { }
>  
>// Allow iterator to const_iterator conversion
>  #if __cplusplus >= 201103L
>template>
> - _GLIBCXX20_CONSTEXPR
> + [[__gnu__::__always_inline__]]
> + constexpr
>   __normal_iterator(const __normal_iterator<_Iter, _Container>& __i)
>   noexcept
>  #else
>// N.B. _Container::pointer is not actually in container requirements,
>// but is present in std::vector and std::basic_string.
>template
> + __attribute__((__always_inline__))
>  __normal_iterator(const __normal_iterator<_Iter,
> typename __enable_if<
>  (std::__are_same<_Iter, typename _Container::pointer>::__value),
> @@ -1085,17 +1090,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  : _M_current(__i.base()) { }
>  
>// Forward iterator requirements
> -  _GLIBCXX20_CONSTEXPR
> +
> +  __attribute__((__always_inline__)) _GLIBCXX_NODISCARD
> +  _GLIBCXX_CONSTEXPR
>reference
>operator*() const _GLIBCXX_NOEXCEPT
>{ return *_M_current; }
>  
> -  _GLIBCXX20_CONSTEXPR
> +  __attribute__((__always_inline__)) _GLIBCXX_NODISCARD
> +  _GLIBCXX_CONSTEXPR
>pointer
>operator->() const _GLIBCXX_NOEXCEPT
>{ return _M_current; }
>  
> -  _GLIBCXX20_CONSTEXPR
> +  __attribute__((__always_inline__))
> +  _GLIBCXX14_CONSTEXPR
>__normal_iterator&
>operator++() _GLIBCXX_NOEXCEPT
>{
> @@ -1103,13 +1112,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>   return *this;
>}
>  
> -  _GLIBCXX20_CONSTEXPR
> +  __attribute__((__always_inline__))
> +  _GLIBCXX14_CONSTEXPR
>__normal_iterator
>operator++(int) _GLIBCXX_NOEXCEPT
>{ return __normal_iterator(_M_current++); }
>  
>// Bidirectional iterator requirements
> -  _GLIBCXX20_CONSTEXPR
> +
> +  __attribute__((__always_inline__))
> +  _GLIBCXX

[PATCHv2 1/2] cfgexpand: Handle scope conflicts better [PR111422]

2024-10-16 Thread Andrew Pinski
After fixing loop-im to do the correct overflow rewriting
for pointer types too, we end up with code like:
```
  _9 = (unsigned long) &g;
  _84 = _9 + 18446744073709551615;
  _11 = _42 + _84;
  _44 = (signed char *) _11;
...
  *_44 = 10;
  g ={v} {CLOBBER(eos)};
...
  n[0] = &f;
  *_44 = 8;
  g ={v} {CLOBBER(eos)};
```
Which was not being recognized by the scope conflicts code.
This was because it only handled a one-level walk back rather than multiple ones.
This fixes it by using a work_list to avoid huge recursion and a visited bitmap
to avoid going into an infinite loop when dealing with loops.
Adds a cache for the addr_exprs that are associated with each SSA name. I
found that a 2-element cache was a decent trade-off between size and speed.
Most SSA names will have only one address associated with them, but there are
times (phis) where 2 or more will be there. But 2 is the common case for most
if statements.
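
As a rough illustration of the work_list plus visited approach described
above (not the code from the patch), here is a minimal, self-contained
sketch.  The graph representation and the names collect_addrs, defs and
is_addr are hypothetical stand-ins for SSA names, their defining statements
and ADDR_EXPRs:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical def graph: defs[n] lists the nodes that node n is computed
// from.  is_addr[n] marks nodes that stand in for ADDR_EXPRs.
// Returns every address node reachable by walking definitions backwards
// from START.
std::vector<std::size_t>
collect_addrs (const std::vector<std::vector<std::size_t>> &defs,
               const std::vector<bool> &is_addr, std::size_t start)
{
  std::vector<std::size_t> found;
  // The visited set keeps cycles (e.g. through phis) from looping forever.
  std::vector<bool> visited (defs.size (), false);
  // An explicit work list replaces recursion, so deep chains cannot
  // exhaust the stack.
  std::vector<std::size_t> work_list{start};

  while (!work_list.empty ())
    {
      std::size_t n = work_list.back ();
      work_list.pop_back ();
      if (visited[n])
        continue;
      visited[n] = true;
      if (is_addr[n])
        found.push_back (n);
      for (std::size_t op : defs[n])
        work_list.push_back (op);
    }
  return found;
}
```

The walk terminates even when node 0 and node 1 refer to each other, because
each node is expanded at most once.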

gcc/ChangeLog:

PR middle-end/111422
* cfgexpand.cc: Define INCLUDE_STRING if ADDR_WALKER_STATS
is defined.
(class addr_ssa_walker): New class.
(add_scope_conflicts_2): Rename to ...
(addr_ssa_walker::operator()): This and rewrite to be a full walk
of all operands and their uses and use a cache.
(add_scope_conflicts_1): Add walker new argument for the addr cache.
Just walk the phi result since that will include all addr_exprs.
Change call to add_scope_conflicts_2 to walker.
(add_scope_conflicts): Add walker variable and update call to
add_scope_conflicts_1.

Signed-off-by: Andrew Pinski 
---
 gcc/cfgexpand.cc | 207 ---
 1 file changed, 176 insertions(+), 31 deletions(-)

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 6c1096363af..74f4cfc0f22 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -17,6 +17,9 @@ You should have received a copy of the GNU General Public 
License
 along with GCC; see the file COPYING3.  If not see
 .  */
 
+#ifdef ADDR_WALKER_STATS
+#define INCLUDE_STRING
+#endif
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -571,35 +574,175 @@ visit_conflict (gimple *, tree op, tree, void *data)
   return false;
 }
 
-/* Helper function for add_scope_conflicts_1.  For USE on
-   a stmt, if it is a SSA_NAME and in its SSA_NAME_DEF_STMT is known to be
-   based on some ADDR_EXPR, invoke VISIT on that ADDR_EXPR.  */
+namespace {
 
-static inline void
-add_scope_conflicts_2 (tree use, bitmap work,
-  walk_stmt_load_store_addr_fn visit)
+class addr_ssa_walker
+{
+private:
+  struct addr_cache
+  {
+  private:
+unsigned elems = 0;
+static constexpr unsigned maxelements = 2;
+bool visited = false;
+tree cached[maxelements] = {};
+  public:
+/* Returns true if the cache is valid. */
+operator bool ()
+{
+  return visited && elems <= maxelements;
+}
+/* Mark as visited. The cache might be invalidated
+   by adding too many elements though. */
+void visit () { visited = true; }
+/* Iterator over the cached values. */
+tree *begin () { return &cached[0]; }
+tree *end ()
+{
+  /* If there was too many elements, then there are
+nothing to vist in the cache. */
+  if (elems > maxelements)
+   return &cached[0];
+  return &cached[elems];
+}
+/* Add ADDR_EXPR to the cache if it is not there already. */
+void add (tree addr)
+{
+  if (elems > maxelements)
+   {
+ statistics_counter_event (cfun, "addr_walker already overflow", 1);
+ return;
+   }
+  /* Skip if the cache already contains the addr_expr. */
+  for(tree other : *this)
+   if (operand_equal_p (other, addr))
+ return;
+  elems++;
+  /* Record that the cache overflowed. */
+  if (elems > maxelements)
+   {
+ statistics_counter_event (cfun, "addr_walker overflow", 1);
+ return;
+   }
+  cached[elems - 1] = addr;
+}
+  };
+public:
+  addr_ssa_walker () : cache (new addr_cache[num_ssa_names]{}) { }
+  ~addr_ssa_walker (){ delete[] cache; }
+
+  /* Walk the name and its defining statement,
+ call func with for addr_expr's. */
+  template
+  void operator ()(tree name, T func);
+
+private:
+
+  /* Cannot create a copy. */
+  addr_ssa_walker (const addr_ssa_walker &) = delete;
+  addr_ssa_walker (addr_ssa_walker &&) = delete;
+  /* Return the cache entries for a SSA NAME. */
+  addr_cache &operator[] (tree name)
+  {
+return cache[SSA_NAME_VERSION (name)];
+  }
+
+  addr_cache *cache;
+};
+
+/* Walk backwards on the defining statements of NAME
+   and call FUNC on the addr_expr. Use the cache for
+   the SSA name if possible.  */
+
+template
+void
+addr_ssa_walker::operator() (tree name, T func)
 {
-  if (TREE_CODE (use) == SSA_NAME
-  && (POINTER_TYPE_P (TREE_TYPE (use))
- || INTEGRAL_TYPE_P (TREE_TYPE (use
+  gcc_asser

[PATCH 2/2] cfgexpand: Handle integral vector types and constructors for scope conflicts [PR105769]

2024-10-16 Thread Andrew Pinski
This is an expansion of the last patch to also track pointers via vector types
and the constructors that are used with vector types.
In this case we had:
```
_15 = (long unsigned int) &bias;
_10 = (long unsigned int) &cov_jn;
_12 = {_10, _15};
...

MEM[(struct vec *)&cov_jn] ={v} {CLOBBER(bob)};
bias ={v} {CLOBBER(bob)};
MEM[(struct function *)&D.6156] ={v} {CLOBBER(bob)};

...
MEM  [(void *)&D.6172 + 32B] = _12;
MEM[(struct function *)&D.6157] ={v} {CLOBBER(bob)};
```

Tracking the pointers through vector types, so that the pointed-to variables
are considered alive at the point where the vector is stored, fixes the bug:
the variable is now seen as alive at the same time as the other variable.

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/105769

gcc/ChangeLog:

* cfgexpand.cc (addr_ssa_walker::operator()): For constructors
walk over the elements.

gcc/testsuite/ChangeLog:

* g++.dg/torture/pr105769-1.C: New test.
---
 gcc/cfgexpand.cc  | 15 -
 gcc/testsuite/g++.dg/torture/pr105769-1.C | 67 +++
 2 files changed, 80 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr105769-1.C

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 74f4cfc0f22..970ee0b85f7 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -691,7 +691,7 @@ addr_ssa_walker::operator() (tree name, T func)
 
   /* Only Pointers and integral types are used to track addresses.  */
   if (!POINTER_TYPE_P (TREE_TYPE (use))
- && !INTEGRAL_TYPE_P (TREE_TYPE (use)))
+ && !ANY_INTEGRAL_TYPE_P (TREE_TYPE (use)))
continue;
 
   /* Check the cache, if there is a hit use it.  */
@@ -718,10 +718,21 @@ addr_ssa_walker::operator() (tree name, T func)
 if there was too many names associated with the cache. */
   work_list.safe_push (work);
 
+  /* CONSTRUCTOR here is always a vector initialization,
+walk each element too. */
+  if (gimple_assign_single_p (g)
+ && TREE_CODE (gimple_assign_rhs1 (g)) == CONSTRUCTOR)
+   {
+ tree ctr = gimple_assign_rhs1 (g);
+ unsigned i;
+ tree elm;
+ FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (ctr), i, elm)
+   work_list.safe_push (std::make_pair (elm, use));
+   }
   /* For assign statements, add each operand to the work list.
 Note operand 0 is the same as the use here so there is nothing
 to be done.  */
-  if (gassign *a = dyn_cast  (g))
+  else if (gassign *a = dyn_cast  (g))
{
  for (unsigned i = 1; i < gimple_num_ops (g); i++)
work_list.safe_push (std::make_pair (gimple_op (a, i), use));
diff --git a/gcc/testsuite/g++.dg/torture/pr105769-1.C 
b/gcc/testsuite/g++.dg/torture/pr105769-1.C
new file mode 100644
index 000..3fe973656b8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr105769-1.C
@@ -0,0 +1,67 @@
+// { dg-do run }
+
+// PR tree-optimization/105769
+
+// The partitioning code would incorrectly have bias
+// and a temporary in the same partitioning because
+// it was thought bias was not alive when those were alive
+// do to vectorization of a store of pointers (that included bias).
+
+#include 
+
+template
+struct vec {
+  T dat[n];
+  vec() {}
+  explicit vec(const T& x) { for(size_t i = 0; i < n; i++) dat[i] = x; }
+  T& operator [](size_t i) { return dat[i]; }
+  const T& operator [](size_t i) const { return dat[i]; }
+};
+
+template
+using mat = vec>;
+template
+using sq_mat = mat;
+using map_t = std::function;
+template
+using est_t = std::function;
+template using est2_t = std::function;
+map_t id_map() { return [](size_t j) -> size_t { return j; }; }
+
+template
+est2_t jacknife(const est_t> est, sq_mat& cov, vec& bias) {
+  return [est, &cov, &bias](map_t map) -> void
+  {
+bias = est(map);
+for(size_t i = 0; i < n; i++)
+{
+  bias[i].print();
+}
+  };
+}
+
+template
+void print_cov_ratio() {
+  sq_mat<2, T> cov_jn;
+  vec<2, T> bias;
+  jacknife<2, T>([](map_t map) -> vec<2, T> { vec<2, T> retv; retv[0] = 1; 
retv[1] = 1; return retv; }, cov_jn, bias)(id_map());
+}
+struct ab {
+  long long unsigned a;
+  short unsigned b;
+  double operator()() { return a; }
+  ab& operator=(double rhs) { a = rhs; return *this; }
+ void print();
+};
+
+void
+ab::print()
+{
+
+}
+
+int main() {
+  print_cov_ratio();
+  return 0;
+}
+
-- 
2.34.1



Re: [PATCH 1/2] libstdc++: Implement C++23 <flat_map> (P0429R9)

2024-10-16 Thread Patrick Palka
On Mon, 30 Sep 2024, Patrick Palka wrote:

> This implements the C++23 container adaptors std::flat_map and
> std::flat_multimap from P0429R9.  The implementation is shared
> as much as possible between the two adaptors via a common base
> class that's parameterized according to key uniqueness.
> 
> The main known issues are:
> 
>   * the range insert() overload exceeds its complexity requirements
> since an idiomatic efficient implementation needs a non-buggy
> ranges::inplace_merge
>   * exception safety is likely incomplete/buggy
>   * unimplemented from_range_t constructors and insert_range function
>   * the main workhorse function _M_try_emplace is probably buggy
> wrt its handling of the hint parameter and could be simplified
>   * more extensive testcases are a WIP
> 
> The iterator type is encoded as a {pointer, index} pair instead of an
> {iterator, iterator} pair.  I'm not sure which encoding is preferable?
> It seems the latter would allow for better debuggability when the
> underlying iterators are debug iterators.

Here's v2 which adds somewhat more tests and uses the std:: algos
instead of ranges:: algos where possible, along with some other
very minor cleanups.
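
For readers unfamiliar with the {pointer, index} iterator encoding mentioned
in the quoted description, the following is a minimal sketch of the idea
only.  It is not taken from the patch: flat_pairs and its members are
hypothetical names, and the real adaptor is considerably more involved.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container that keeps keys and values in parallel vectors.
struct flat_pairs
{
  std::vector<int> keys;
  std::vector<std::string> values;

  // The iterator stores a pointer to the container plus an index,
  // rather than one iterator into each underlying vector.
  struct iterator
  {
    flat_pairs *cont;
    std::size_t index;

    // Dereferencing builds a pair of references on the fly.
    std::pair<const int &, std::string &> operator* () const
    { return {cont->keys[index], cont->values[index]}; }

    iterator &operator++ () { ++index; return *this; }

    bool operator== (const iterator &o) const
    { return cont == o.cont && index == o.index; }
    bool operator!= (const iterator &o) const
    { return !(*this == o); }
  };

  iterator begin () { return {this, 0}; }
  iterator end () { return {this, keys.size ()}; }
};
```

With this encoding a single index addresses both vectors, at the cost of
going through the container pointer on every dereference.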

-- >8 --

libstdc++-v3/ChangeLog:

* include/Makefile.am: Add new header <flat_map>.
* include/Makefile.in: Regenerate.
* include/bits/stl_function.h (__transparent_comparator): Define.
* include/bits/utility.h (sorted_unique_t): Define for C++23.
(sorted_unique): Likewise.
(sorted_equivalent_t): Likewise.
(sorted_equivalent): Likewise.
* include/bits/version.def (flat_map): Define.
* include/bits/version.h: Regenerate.
* include/std/flat_map: New file.
* testsuite/23_containers/flat_map/1.cc: New test.
* testsuite/23_containers/flat_multimap/1.cc: New test.
---
 libstdc++-v3/include/Makefile.am  |1 +
 libstdc++-v3/include/Makefile.in  |1 +
 libstdc++-v3/include/bits/stl_function.h  |6 +
 libstdc++-v3/include/bits/utility.h   |8 +
 libstdc++-v3/include/bits/version.def |8 +
 libstdc++-v3/include/bits/version.h   |   10 +
 libstdc++-v3/include/std/flat_map | 1475 +
 .../testsuite/23_containers/flat_map/1.cc |  123 ++
 .../23_containers/flat_multimap/1.cc  |  106 ++
 9 files changed, 1738 insertions(+)
 create mode 100644 libstdc++-v3/include/std/flat_map
 create mode 100644 libstdc++-v3/testsuite/23_containers/flat_map/1.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/flat_multimap/1.cc

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 422a0f4bd0a..632bbafa63e 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -70,6 +70,7 @@ std_headers = \
${std_srcdir}/deque \
${std_srcdir}/execution \
${std_srcdir}/filesystem \
+   ${std_srcdir}/flat_map \
${std_srcdir}/format \
${std_srcdir}/forward_list \
${std_srcdir}/fstream \
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index 9fd4ab4848c..1ac963c4415 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -426,6 +426,7 @@ std_freestanding = \
 @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/deque \
 @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/execution \
 @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/filesystem \
+@GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/flat_map \
 @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/format \
 @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/forward_list \
 @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/fstream \
diff --git a/libstdc++-v3/include/bits/stl_function.h 
b/libstdc++-v3/include/bits/stl_function.h
index c9123ccecae..c579ba9f47b 100644
--- a/libstdc++-v3/include/bits/stl_function.h
+++ b/libstdc++-v3/include/bits/stl_function.h
@@ -1426,6 +1426,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 using __has_is_transparent_t
   = typename __has_is_transparent<_Func, _SfinaeType>::type;
+
+#if __cpp_concepts
+  template
+concept __transparent_comparator
+  = requires { typename _Func::is_transparent; };
+#endif
 #endif
 
 _GLIBCXX_END_NAMESPACE_VERSION
diff --git a/libstdc++-v3/include/bits/utility.h 
b/libstdc++-v3/include/bits/utility.h
index 4a6c16dc2e0..9e10ce2cb1c 100644
--- a/libstdc++-v3/include/bits/utility.h
+++ b/libstdc++-v3/include/bits/utility.h
@@ -308,6 +308,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   _GLIBCXX17_INLINE constexpr _Swallow_assign ignore{};
 
+#if __glibcxx_flat_map || __glibcxx_flat_set // >= C++23
+  struct sorted_unique_t { explicit sorted_unique_t() = default; };
+  inline constexpr sorted_unique_t sorted_unique{};
+
+  struct sorted_equivalent_t { explicit sorted_equivalent_t() = default; };
+  inline constexpr sorted_equivalent_t sorted_equivalent{};
+#endif
+
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
diff --git a/libstdc++-v3/include/bits

Re: [PATCH 2/2] libstdc++: Implement C++23 <flat_set> (P1222R4)

2024-10-16 Thread Patrick Palka
On Mon, 30 Sep 2024, Patrick Palka wrote:

> This implements the C++23 container adaptors std::flat_set and
> std::flat_multiset from P1222R4.  The implementation is essentially
> a simpler and pared down version of std::flat_map.
> 
> The main known issues are:
> 
>   * exception safety is likely incomplete/buggy
>   * unimplemented from_range_t constructors and insert_range function
>   * the main workhorse function _M_try_emplace is probably buggy
> wrt its handling of the hint parameter and could be simplified
>   * more extensive testcases are a WIP

Here's v2 which adds somewhat more tests and uses std:: algos instead
of ranges:: algos where possible, along with some other very minor
cleanups.

-- >8 --

libstdc++-v3/ChangeLog:

* include/Makefile.am: Add new header <flat_set>.
* include/Makefile.in: Regenerate.
* include/bits/version.def (__cpp_flat_set): Define.
* include/bits/version.h: Regenerate.
* include/std/flat_set: New file.
* testsuite/23_containers/flat_multiset/1.cc: New test.
* testsuite/23_containers/flat_set/1.cc: New test.
---
 libstdc++-v3/include/Makefile.am  |   1 +
 libstdc++-v3/include/Makefile.in  |   1 +
 libstdc++-v3/include/bits/version.def |   8 +
 libstdc++-v3/include/bits/version.h   |  10 +
 libstdc++-v3/include/std/flat_set | 972 ++
 .../23_containers/flat_multiset/1.cc  | 102 ++
 .../testsuite/23_containers/flat_set/1.cc | 111 ++
 7 files changed, 1205 insertions(+)
 create mode 100644 libstdc++-v3/include/std/flat_set
 create mode 100644 libstdc++-v3/testsuite/23_containers/flat_multiset/1.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/flat_set/1.cc

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 632bbafa63e..e49cdb23c55 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -71,6 +71,7 @@ std_headers = \
${std_srcdir}/execution \
${std_srcdir}/filesystem \
${std_srcdir}/flat_map \
+   ${std_srcdir}/flat_set \
${std_srcdir}/format \
${std_srcdir}/forward_list \
${std_srcdir}/fstream \
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index 1ac963c4415..8e6ee44cc0e 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -427,6 +427,7 @@ std_freestanding = \
 @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/execution \
 @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/filesystem \
 @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/flat_map \
+@GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/flat_set \
 @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/format \
 @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/forward_list \
 @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/fstream \
diff --git a/libstdc++-v3/include/bits/version.def 
b/libstdc++-v3/include/bits/version.def
index 6f1958ac705..7d12cdf07f1 100644
--- a/libstdc++-v3/include/bits/version.def
+++ b/libstdc++-v3/include/bits/version.def
@@ -1666,6 +1666,14 @@ ftms = {
   };
 };
 
+ftms = {
+  name = flat_set;
+  values = {
+v = 202207;
+cxxmin = 23;
+  };
+};
+
 ftms = {
   name = formatters;
   values = {
diff --git a/libstdc++-v3/include/bits/version.h 
b/libstdc++-v3/include/bits/version.h
index ecead3edb26..76a88a17351 100644
--- a/libstdc++-v3/include/bits/version.h
+++ b/libstdc++-v3/include/bits/version.h
@@ -1840,6 +1840,16 @@
 #endif /* !defined(__cpp_lib_flat_map) && defined(__glibcxx_want_flat_map) */
 #undef __glibcxx_want_flat_map
 
+#if !defined(__cpp_lib_flat_set)
+# if (__cplusplus >= 202100L)
+#  define __glibcxx_flat_set 202207L
+#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_flat_set)
+#   define __cpp_lib_flat_set 202207L
+#  endif
+# endif
+#endif /* !defined(__cpp_lib_flat_set) && defined(__glibcxx_want_flat_set) */
+#undef __glibcxx_want_flat_set
+
 #if !defined(__cpp_lib_formatters)
 # if (__cplusplus >= 202100L) && _GLIBCXX_HOSTED
 #  define __glibcxx_formatters 202302L
diff --git a/libstdc++-v3/include/std/flat_set 
b/libstdc++-v3/include/std/flat_set
new file mode 100644
index 000..53bda6e03a3
--- /dev/null
+++ b/libstdc++-v3/include/std/flat_set
@@ -0,0 +1,972 @@
+//  -*- C++ -*-
+
+// Copyright The GNU Toolchain Authors.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as publis

Re: [PATCH] c++: Fix crash during NRV optimization with invalid input [PR117099]

2024-10-16 Thread Sam James
Simon Martin  writes:

> We ICE upon the following invalid code because we end up calling
> finalize_nrv_r with a RETURN_EXPR with no operand.
> 
> === cut here ===
> struct X {
>   ~X();
> };
> X test(bool b) {
>   {
> X x;
> return x;
>   }
>   if (!(b)) return;
> }
> === cut here ===
> 
> This patch fixes this by simply returning error_mark_node when detecting
> a void return in a function returning non-void.
> 
> Successfully tested on x86_64-pc-linux-gnu.
> 
>   PR c++/117099
> 
> gcc/cp/ChangeLog:
> 
>   * typeck.cc (check_return_expr): Return error_mark_node upon
>   void return for function returning non-void.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/parse/crash77.C: New test.
> 
> ---
>  gcc/cp/typeck.cc |  1 +
>  gcc/testsuite/g++.dg/parse/crash77.C | 14 ++
>  2 files changed, 15 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/parse/crash77.C
> 
> diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
> index 71d879abef1..22a6ec9a185 100644
> --- a/gcc/cp/typeck.cc
> +++ b/gcc/cp/typeck.cc
> @@ -11238,6 +11238,7 @@ check_return_expr (tree retval, bool *no_warning, 
> bool *dangling)
>RETURN_EXPR to avoid control reaches end of non-void function
>warnings in tree-cfg.cc.  */
>*no_warning = true;
> +  return error_mark_node;
>  }
>/* Check for a return statement with a value in a function that
>   isn't supposed to return a value.  */
> diff --git a/gcc/testsuite/g++.dg/parse/crash77.C 
> b/gcc/testsuite/g++.dg/parse/crash77.C
> new file mode 100644
> index 000..d3f0ae6a877
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/parse/crash77.C
> @@ -0,0 +1,14 @@
> +// PR c++/117099
> +// { dg-compile }

This should be dg-do compile.

> +
> +struct X {
> +  ~X();
> +};
> +
> +X test(bool b) {
> +  {
> +X x;
> +return x;
> +  } 
> +  if (!(b)) return; // { dg-error "return-statement with no value" }
> +}
> -- 
> 2.44.0
> 

BTW, the line-endings on this seem a bit odd. Did you use git-send-email?


[PATCH] c: Fix up speed up compilation of large char array initializers when not using #embed [PR117177]

2024-10-16 Thread Jakub Jelinek
Hi!

Apparently my
c: Speed up compilation of large char array initializers when not using #embed
patch broke building glibc.

The issue is that when using CPP_EMBED, we are guaranteed by the
preprocessor that there is CPP_NUMBER CPP_COMMA before it and
CPP_COMMA CPP_NUMBER after it (or CPP_COMMA CPP_EMBED), so RAW_DATA_CST
never ends up at the end of arrays of unknown length.
Now, the c_parser_initval optimization attempted to preserve that property
rather than changing everything that e.g. infers the number of array elements
from the initializer etc. to deal with RAW_DATA_CST at the end, but
it didn't take into account the possibility that there could be
CPP_COMMA followed by CPP_CLOSE_BRACE (where the CPP_COMMA is redundant).

As we are already peeking at 4 tokens in that code, peeking more would
require using raw tokens and that seems to be expensive doing it for
every pair of tokens due to vec_free done when we are out of raw tokens.

So, the following patch instead determines the case where we want
another INTEGER_CST element after it after consuming the tokens, and just
arranges for another process_init_element.
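
The input shape at issue can be sketched as follows.  This is a hypothetical
example, not the init-4.c test from the patch: BYTES8 and blob are made-up
names, and whether this exact input goes through the optimized path is an
assumption.  It shows a char array of unknown length with at least 64
elements whose initializer ends in a redundant comma before the closing
brace:

```cpp
#include <cstddef>

// Eight byte values, each followed by a comma, so that repeating the macro
// leaves a redundant comma right before the closing brace.
#define BYTES8 1, 2, 3, 4, 5, 6, 7, 8,

// Array of unknown length: the element count is inferred from the
// initializer, 9 * 8 = 72 elements here.
static const unsigned char blob[] = {
  BYTES8 BYTES8 BYTES8 BYTES8 BYTES8 BYTES8 BYTES8 BYTES8 BYTES8
};

// The trailing comma must not add or drop an element.
static_assert (sizeof (blob) == 72, "trailing comma adds no element");
```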

Ok for trunk if this passes bootstrap/regtest?

2024-10-16  Jakub Jelinek  

PR c/117177
gcc/c/
* c-parser.cc (c_parser_initval): Instead of doing
orig_len == INT_MAX checks before consuming tokens to set
last = 1, check it after consuming it and if not followed
by CPP_COMMA CPP_NUMBER, call process_init_element once
more with the last CPP_NUMBER.
gcc/testsuite/
* c-c++-common/init-4.c: New test.

--- gcc/c/c-parser.cc.jj2024-10-16 17:45:16.482325343 +0200
+++ gcc/c/c-parser.cc   2024-10-16 22:57:33.083698527 +0200
@@ -6529,6 +6529,7 @@ c_parser_initval (c_parser *parser, stru
unsigned int i;
gcc_checking_assert (len >= 64);
location_t last_loc = UNKNOWN_LOCATION;
+   location_t prev_loc = UNKNOWN_LOCATION;
for (i = 0; i < 64; ++i)
  {
c_token *tok = c_parser_peek_nth_token_raw (parser, 1 + 2 * i);
@@ -6544,6 +6545,7 @@ c_parser_initval (c_parser *parser, stru
buf1[i] = (char) tree_to_uhwi (tok->value);
if (i == 0)
  loc = tok->location;
+   prev_loc = last_loc;
last_loc = tok->location;
  }
if (i < 64)
@@ -6567,6 +6569,7 @@ c_parser_initval (c_parser *parser, stru
unsigned int max_len = 131072 - offsetof (struct tree_string, str) - 1;
unsigned int orig_len = len;
unsigned int off = 0, last = 0;
+   unsigned char lastc = 0;
if (!wi::neg_p (wi::to_wide (val)) && wi::to_widest (val) <= UCHAR_MAX)
  off = 1;
len = MIN (len, max_len - off);
@@ -6596,20 +6599,25 @@ c_parser_initval (c_parser *parser, stru
if (tok2->type != CPP_COMMA && tok2->type != CPP_CLOSE_BRACE)
  break;
buf2[i + off] = (char) tree_to_uhwi (tok->value);
-   /* If orig_len is INT_MAX, this can be flexible array member and
-  in that case we need to ensure another element which
-  for CPP_EMBED is normally guaranteed after it.  Include
-  that byte in the RAW_DATA_OWNER though, so it can be optimized
-  later.  */
-   if (tok2->type == CPP_CLOSE_BRACE && orig_len == INT_MAX)
- {
-   last = 1;
-   break;
- }
+   prev_loc = last_loc;
last_loc = tok->location;
c_parser_consume_token (parser);
c_parser_consume_token (parser);
  }
+   /* If orig_len is INT_MAX, this can be flexible array member and
+  in that case we need to ensure another element which
+  for CPP_EMBED is normally guaranteed after it.  Include
+  that byte in the RAW_DATA_OWNER though, so it can be optimized
+  later.  */
+   if (orig_len == INT_MAX
+   && (!c_parser_next_token_is (parser, CPP_COMMA)
+   || c_parser_peek_2nd_token (parser)->type != CPP_NUMBER))
+ {
+   --i;
+   last = 1;
+   std::swap (prev_loc, last_loc);
+   lastc = (unsigned char) buf2[i + off];
+ }
val = make_node (RAW_DATA_CST);
TREE_TYPE (val) = integer_type_node;
RAW_DATA_LENGTH (val) = i;
@@ -6625,6 +6633,13 @@ c_parser_initval (c_parser *parser, stru
init.original_type = integer_type_node;
init.m_decimal = 0;
process_init_element (loc, init, false, braced_init_obstack);
+   if (last)
+ {
+   init.value = build_int_cst (integer_type_node, lastc);
+   init.original_code = INTEGER_CST;
+   set_c_expr_source_range (&init, prev_loc, prev_loc);
+   process_init_element (prev_loc, init, false, braced_init_obstack);
+ }
   }
 }
 
--- gcc/testsuite/c-c++-common/init-4.c.jj  2024-10-16 22:56:05.535934184 
+0200
+++ gcc/testsuite/c-c++-common/init-4.c 202

Re: [PATCH v2 1/4] tree-object-size: use size_for_offset in more cases

2024-10-16 Thread Siddhesh Poyarekar

On 2024-09-27 06:31, Jakub Jelinek wrote:

On Fri, Sep 20, 2024 at 12:40:26PM -0400, Siddhesh Poyarekar wrote:

When wholesize != size, there is a reasonable opportunity for static
object sizes also to be computed using size_for_offset, so use that.

gcc/ChangeLog:

* tree-object-size.cc (plus_stmt_object_size): Call
SIZE_FOR_OFFSET for some negative offset cases.
* testsuite/gcc.dg/builtin-object-size-3.c (test9): Adjust test.
* testsuite/gcc.dg/builtin-object-size-4.c (test8): Likewise.
---
  gcc/testsuite/gcc.dg/builtin-object-size-3.c | 6 +++---
  gcc/testsuite/gcc.dg/builtin-object-size-4.c | 6 +++---
  gcc/tree-object-size.cc  | 2 +-
  3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-3.c 
b/gcc/testsuite/gcc.dg/builtin-object-size-3.c
index 3f58da3d500..ec2c62c9640 100644
--- a/gcc/testsuite/gcc.dg/builtin-object-size-3.c
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-3.c
@@ -574,7 +574,7 @@ test9 (unsigned cond)
if (__builtin_object_size (&p[-4], 2) != (cond ? 6 : 10))
  FAIL ();
  #else
-  if (__builtin_object_size (&p[-4], 2) != 0)
+  if (__builtin_object_size (&p[-4], 2) != 6)
  FAIL ();
  #endif
  
@@ -585,7 +585,7 @@ test9 (unsigned cond)

if (__builtin_object_size (p, 2) != ((cond ? 2 : 6) + cond))
  FAIL ();
  #else
-  if (__builtin_object_size (p, 2) != 0)
+  if (__builtin_object_size (p, 2) != 2)
  FAIL ();
  #endif
  
@@ -598,7 +598,7 @@ test9 (unsigned cond)

!= sizeof (y) - __builtin_offsetof (struct A, c) - 8 + cond)
  FAIL ();
  #else
-  if (__builtin_object_size (p, 2) != 0)
+  if (__builtin_object_size (p, 2) != sizeof (y) - __builtin_offsetof (struct 
A, c) - 8)
  FAIL ();
  #endif
  }
diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-4.c 
b/gcc/testsuite/gcc.dg/builtin-object-size-4.c
index b3eb36efb74..7bcd24c4150 100644
--- a/gcc/testsuite/gcc.dg/builtin-object-size-4.c
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-4.c
@@ -482,7 +482,7 @@ test8 (unsigned cond)
if (__builtin_object_size (&p[-4], 3) != (cond ? 6 : 10))
  FAIL ();
  #else
-  if (__builtin_object_size (&p[-4], 3) != 0)
+  if (__builtin_object_size (&p[-4], 3) != 6)
  FAIL ();
  #endif
  
@@ -493,7 +493,7 @@ test8 (unsigned cond)

if (__builtin_object_size (p, 3) != ((cond ? 2 : 6) + cond))
  FAIL ();
  #else
-  if (__builtin_object_size (p, 3) != 0)
+  if (__builtin_object_size (p, 3) != 2)
  FAIL ();
  #endif
  
@@ -505,7 +505,7 @@ test8 (unsigned cond)

if (__builtin_object_size (p, 3) != sizeof (y.c) - 8 + cond)
  FAIL ();
  #else
-  if (__builtin_object_size (p, 3) != 0)
+  if (__builtin_object_size (p, 3) != sizeof (y.c) - 8)
  FAIL ();
  #endif
  }


The testcase changes look reasonable to me.


diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
index 6544730e153..f8fae0cbc82 100644
--- a/gcc/tree-object-size.cc
+++ b/gcc/tree-object-size.cc
@@ -1527,7 +1527,7 @@ plus_stmt_object_size (struct object_size_info *osi, tree 
var, gimple *stmt)
if (size_unknown_p (bytes, 0))
;
else if ((object_size_type & OST_DYNAMIC)
-  || compare_tree_int (op1, offset_limit) <= 0)
+  || bytes != wholesize || compare_tree_int (op1, offset_limit) <= 
0)
bytes = size_for_offset (bytes, op1, wholesize);
/* In the static case, with a negative offset, the best estimate for
 minimum size is size_unknown but for maximum size, the wholesize is a


The coding conventions say that in cases like this where the whole condition
doesn't fit on a single line, each ||/&& operand should be on a separate
line.
So, the patch should be adding || bytes != wholesize on a separate line.

That said, there is a pre-existing problem: the direct tree comparisons
(bytes != wholesize here, && wholesize != sz in size_for_offset (note,
again, it should be on a separate line), maybe others).

We do INTEGER_CST caching, either using small array for small values or
hash table for larger ones, so INTEGER_CSTs with the same value of the
same type should be pointer equal unless they are TREE_OVERFLOW or similar,
but for anything else, unless you guarantee that in the "same" case
you assign the same tree to size/wholesize rather than say
perform size_binop twice, I'd expect instead comparisons with
operand_equal_p or something similar.
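
The distinction between pointer equality and structural equality can be
sketched outside of GCC as follows.  This is only an analogy under stated
assumptions: node, intern and structurally_equal are hypothetical names, with
intern loosely playing the role of INTEGER_CST caching and
structurally_equal the role of an operand_equal_p-style comparison.

```cpp
#include <map>
#include <memory>

struct node
{
  long value;
};

// Hypothetical interning table: the same value always yields the same
// object, so pointer comparison is enough for interned nodes.
const node *
intern (long v)
{
  static std::map<long, std::unique_ptr<node>> table;
  std::unique_ptr<node> &slot = table[v];
  if (!slot)
    slot = std::make_unique<node> (node{v});
  return slot.get ();
}

// Structural comparison: needed whenever two nodes may have been built
// separately, since they then differ as pointers even if equal in value.
bool
structurally_equal (const node *a, const node *b)
{
  return a->value == b->value;
}
```

Two nodes built independently with the same value compare unequal as
pointers, which is the situation the paragraph above warns about for
anything that is not guaranteed to be shared.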

Though, because this patch is solely for the __builtin_object_size case
and the sizes in that case should be solely INTEGER_CSTs, I guess this patch
is ok with the formatting nit fix (and ideally the one in size_for_offset
too).


Thanks, we basically pass the same object to sz and wholesize when 
there's no separate subobject, so the pointer equivalence should be 
sufficient.  However, I reckon there could be a theoretical situation where a 
subobject size expression folds into being the same as a whole size, but 
I'm not sure if that'll actually happen in practice.  I could

Re: [PATCH v16b 2/4] gcc/: Rename array_type_nelts => array_type_nelts_minus_one

2024-10-16 Thread Alejandro Colomar
Hi Joseph,

On Wed, Oct 16, 2024 at 05:21:39PM GMT, Joseph Myers wrote:
> On Wed, 16 Oct 2024, Alejandro Colomar wrote:
> 
> > The old name was misleading.
> > 
> > While at it, also rename some temporary variables that are used with
> > this function, for consistency.
> 
> This patch is OK and should be committed (assuming it has passed bootstrap 
> and regression testing).

Thanks!

I did bootstrap + regression testing in v15.  In v16 I got lazy and
since the changes were small, I only checked that the new tests all
pass.

Since it looks like it's close to merging, I'll run bootstrap +
regression testing again.

Have a lovely night!
Alex


Re: [PATCH] c: Fix up speed up compilation of large char array initializers when not using #embed [PR117177]

2024-10-16 Thread Joseph Myers
On Wed, 16 Oct 2024, Jakub Jelinek wrote:

> Hi!
> 
> Apparently my
> c: Speed up compilation of large char array initializers when not using #embed
> patch broke building glibc.
> 
> The issue is that when using CPP_EMBED, we are guaranteed by the
> preprocessor that there is CPP_NUMBER CPP_COMMA before it and
> CPP_COMMA CPP_NUMBER after it (or CPP_COMMA CPP_EMBED), so RAW_DATA_CST
> never ends up at the end of arrays of unknown length.
> Now, the c_parser_initval optimization attempted to preserve that property
> rather than changing everything that e.g. infers the number of array elements
> from the initializer etc. to deal with RAW_DATA_CST at the end, but
> it didn't take into account the possibility that there could be
> CPP_COMMA followed by CPP_CLOSE_BRACE (where the CPP_COMMA is redundant).
> 
> As we are already peeking at 4 tokens in that code, peeking more would
> require using raw tokens and that seems to be expensive doing it for
> every pair of tokens due to vec_free done when we are out of raw tokens.
> 
> So, the following patch instead determines the case where we want
> another INTEGER_CST element after it after consuming the tokens, and just
> arranges for another process_init_element.
> 
> Ok for trunk if this passes bootstrap/regtest?

OK.

-- 
Joseph S. Myers
josmy...@redhat.com


