date:20241219

[PATCH 0/2] RISC-V: Add intrinsics support and testcases for SiFive Xsfvcp extension.

2024-12-19 Thread shiyulong

From: yulong 

This patch adds support for the SiFive vendor extension Xsfvfnrclipxfqf[1]
to GCC, providing support for FP32-to-int8 Ranged Clip
instructions.

[1] 
https://www.sifive.com/document-file/sifive-vector-coprocessor-interface-vcix-software

Co-Authored by: Jiawei Chen 
Co-Authored by: Shihua Liao 
Co-Authored by: Yixuan Chen 

Liao Shihua (1):
  RISC-V: Add intrinsic testcases for SiFive Xsfvcp extensions.

yulong (1):
  RISC-V: Add intrinsics support for SiFive Xsfvcp extensions.

 gcc/config/riscv/constraints.md                | 10 +
 gcc/config/riscv/generic-vector-ooo.md         |    4 +
 gcc/config/riscv/genrvv-type-indexer.cc        |    9 +
 .../riscv/riscv-vector-builtins-shapes.cc      |   48 +
 .../riscv/riscv-vector-builtins-shapes.h       |    2 +
 .../riscv/riscv-vector-builtins-types.def      |   40 +
 gcc/config/riscv/riscv-vector-builtins.cc      |  362 ++-
 gcc/config/riscv/riscv-vector-builtins.def     |   30 +-
 gcc/config/riscv/riscv-vector-builtins.h       |    8 +
 gcc/config/riscv/riscv.md                      |    5 +-
 .../riscv/sifive-vector-builtins-bases.cc      |   78 +
 .../riscv/sifive-vector-builtins-bases.h       |    3 +
 .../riscv/sifive-vector-builtins-functions.def |   45 +
 gcc/config/riscv/sifive-vector.md              |  569
 gcc/config/riscv/vector-iterators.md           |   48 +
 gcc/config/riscv/vector.md                     |    3 +-
 .../gcc.target/riscv/rvv/xsfvector/sf_vc_f.c   | 1286
 .../gcc.target/riscv/rvv/xsfvector/sf_vc_i.c   | 2682 +
 .../gcc.target/riscv/rvv/xsfvector/sf_vc_v.c   | 1954
 .../gcc.target/riscv/rvv/xsfvector/sf_vc_x.c   | 2679
 20 files changed, 9858 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xsfvector/sf_vc_f.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xsfvector/sf_vc_i.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xsfvector/sf_vc_v.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xsfvector/sf_vc_x.c

-- 
2.34.1

[PATCH 1/2] RISC-V: Add intrinsics support for SiFive Xsfvcp extensions.

2024-12-19 Thread shiyulong

From: yulong 

This commit adds intrinsics support for Xsfvcp extension.

Co-Authored by: Jiawei Chen 
Co-Authored by: Shihua Liao 
Co-Authored by: Yixuan Chen 

gcc/ChangeLog:

* config/riscv/constraints.md (Ou01): New constraint.
(Ou02): Ditto.
* config/riscv/generic-vector-ooo.md (vec_sf_vcp): New reservation.
* config/riscv/genrvv-type-indexer.cc (main): New type.
* config/riscv/riscv-vector-builtins-shapes.cc (struct sf_vcix_se_def): 
New function.
(struct sf_vcix_def): Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins-types.def (DEF_RVV_X2_U_OPS): New 
type.
(DEF_RVV_X2_WU_OPS): Ditto.
(vuint8mf8_t): Ditto.
(vuint8mf4_t): Ditto.
(vuint8mf2_t): Ditto.
(vuint8m1_t): Ditto.
(vuint8m2_t): Ditto.
(vuint8m4_t): Ditto.
(vuint16mf4_t): Ditto.
(vuint16mf2_t): Ditto.
(vuint16m1_t): Ditto.
(vuint16m2_t): Ditto.
(vuint16m4_t): Ditto.
(vuint32mf2_t): Ditto.
(vuint32m1_t): Ditto.
(vuint32m2_t): Ditto.
(vuint32m4_t): Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_X2_U_OPS): New 
builtins def.
(DEF_RVV_X2_WU_OPS): Ditto.
(rvv_arg_type_info::get_scalar_float_type): Ditto.
(function_instance::modifies_global_state_p): Ditto.
* config/riscv/riscv-vector-builtins.def (v_x): New base type.
(i): Ditto.
(v_i): Ditto.
(xv): Ditto.
(iv): Ditto.
(fv): Ditto.
(vvv): Ditto.
(xvv): Ditto.
(ivv): Ditto.
(fvv): Ditto.
(vvw): Ditto.
(xvw): Ditto.
(ivw): Ditto.
(fvw): Ditto.
(v_vv): Ditto.
(v_xv): Ditto.
(v_iv): Ditto.
(v_fv): Ditto.
(v_vvv): Ditto.
(v_xvv): Ditto.
(v_ivv): Ditto.
(v_fvv): Ditto.
(v_vvw): Ditto.
(v_xvw): Ditto.
(v_ivw): Ditto.
(v_fvw): Ditto.
(x2_vector): Ditto.
(scalar_float): Ditto.
* config/riscv/riscv-vector-builtins.h (enum required_ext): New 
extension.
(required_ext_to_isa_name): Ditto.
(required_extensions_specified): Ditto.
(struct rvv_arg_type_info): Ditto.
(struct function_group_info): Ditto.
* config/riscv/riscv.md: New attr.
* config/riscv/sifive-vector-builtins-bases.cc (class sf_vc): New 
function.
(BASE): New base_name.
* config/riscv/sifive-vector-builtins-bases.h: New function_base.
* config/riscv/sifive-vector-builtins-functions.def 
(REQUIRED_EXTENSIONS): New intrinsics def.
(sf_vc): Ditto.
* config/riscv/sifive-vector.md (@sf_vc_x): New RTL mode.
(@sf_vc_v_x): Ditto.
(@sf_vc_i): Ditto.
(@sf_vc_v_i): Ditto.
(@sf_vc_vv): Ditto.
(@sf_vc_v_vv): Ditto.
(@sf_vc_xv): Ditto.
(@sf_vc_v_xv): Ditto.
(@sf_vc_iv): Ditto.
(@sf_vc_v_iv): Ditto.
(@sf_vc_fv): Ditto.
(@sf_vc_v_fv): Ditto.
(@sf_vc_vvv): Ditto.
(@sf_vc_v_vvv): Ditto.
(@sf_vc_xvv): Ditto.
(@sf_vc_v_xvv): Ditto.
(@sf_vc_ivv): Ditto.
(@sf_vc_v_ivv): Ditto.
(@sf_vc_fvv): Ditto.
(@sf_vc_v_fvv): Ditto.
(@sf_vc_vvw): Ditto.
(@sf_vc_v_vvw): Ditto.
(@sf_vc_xvw): Ditto.
(@sf_vc_v_xvw): Ditto.
(@sf_vc_ivw): Ditto.
(@sf_vc_v_ivw): Ditto.
(@sf_vc_fvw): Ditto.
(@sf_vc_v_fvw): Ditto.
* config/riscv/vector-iterators.md: New iterator.
* config/riscv/vector.md: New vtype.

---
 gcc/config/riscv/constraints.md                |  10 +
 gcc/config/riscv/generic-vector-ooo.md         |   4 +
 gcc/config/riscv/genrvv-type-indexer.cc        |   9 +
 .../riscv/riscv-vector-builtins-shapes.cc      |  48 ++
 .../riscv/riscv-vector-builtins-shapes.h       |   2 +
 .../riscv/riscv-vector-builtins-types.def      |  40 ++
 gcc/config/riscv/riscv-vector-builtins.cc      | 362 ++-
 gcc/config/riscv/riscv-vector-builtins.def     |  30 +-
 gcc/config/riscv/riscv-vector-builtins.h       |   8 +
 gcc/config/riscv/riscv.md                      |   5 +-
 .../riscv/sifive-vector-builtins-bases.cc      |  78 +++
 .../riscv/sifive-vector-builtins-bases.h       |   3 +
 .../riscv/sifive-vector-builtins-functions.def |  45 ++
 gcc/config/riscv/sifive-vector.md              | 569 ++
 gcc/config/riscv/vector-iterators.md           |  48 ++
 gcc/config/riscv/vector.md                     |   3 +-
 16 files changed, 1257 insertions(+), 7 deletions(-)

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index ebb71000d12..15bd6370b9a 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -305,3 +305,13 @@
   "Shifting immediate for SIMD shufflei3."
   (and (match_code "const_int")
(matc

Re: Ping^3: [PATCH] warn-access: ignore template parameters when matching operator new/delete [PR109224]

2024-12-19 Thread Jason Merrill

On Thu, Sep 19, 2024 at 6:41 PM Arsen Arsenović  wrote:
>
> Arsen Arsenović  writes:
>
> > Gentle ping again.  Full patch:
> > https://inbox.sourceware.org/gcc-patches/86y14ptvdi@aarsen.me/

For some reason it seems Thunderbird didn't want me to see this patch.  OK.

> And again.  To clarify, the above is a v2 of sorts (it has the comment
> fixed, I just didn't update the subject).
>
> TIA, have a lovely day.
> --
> Arsen Arsenović

[patch 1/2] Add new target hook to assemble a variable

2024-12-19 Thread Georg-Johann Lay


This patch adds a new target hook that allows the backend to asm output
a variable definition in its own way.  This hook is needed because
varasm.cc imposes a very restrictive layout for all variable definitions
which will be basically ELF style (on ELF targets at least).  To date,
there is no way for a backend to output a variable definition in a
different way.

This hook is required by the avr backend when it outputs definitions
for variables defined with the "io", "io_low" or "address" attribute
that don't follow ELF style.  These attributes are basically symbol
definitions of the form

   .global var_io
   var_io = 32

with some additional assertions.
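
As a rough illustration of the contract described above, here is a minimal
standalone sketch.  It does not use GCC's internal types: IoVarInfo,
format_symbol_definition and asm_out_variable_sketch are hypothetical names
made up for this example, and the real avr implementation lives in patch 2/2.
The only parts taken from the patch are the output shape (.global plus a
symbol assignment) and the convention that the hook returns true when it has
emitted the complete definition and false to request the default output.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for what a backend would read off the decl's
// attributes; GCC itself would inspect a `tree decl` instead.
struct IoVarInfo {
  bool has_fixed_address;  // decl carries an "io" / "io_low" / "address" style attribute
  unsigned address;        // absolute address taken from that attribute
  bool is_public;          // whether a .global directive is wanted
};

// Build the non-ELF-style symbol definition:
//     .global var_io
//     var_io = 32
std::string format_symbol_definition(const IoVarInfo &var, const std::string &name) {
  std::string out;
  if (var.is_public)
    out += "\t.global " + name + "\n";
  out += "\t" + name + " = " + std::to_string(var.address) + "\n";
  return out;
}

// Same return convention as the proposed hook: true means "the complete
// definition has been written", false means "use the default output path".
bool asm_out_variable_sketch(std::FILE *stream, const IoVarInfo &var, const char *name) {
  if (!var.has_fixed_address)
    return false;
  std::fputs(format_symbol_definition(var, name).c_str(), stream);
  return true;
}
```

Returning false for every ordinary variable keeps the generic varasm.cc path
in charge, so only the attribute-carrying symbols take the custom route.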

The avr part that uses TARGET_ASM_VARIABLE is patch part 2/2.

The patch passes bootstrap and tests with no new regressions.

Ok for trunk?

Johann

--

This patch adds a new target hook that allows the backend to asm output
a variable definition in its own way.  This hook is needed because
varasm.cc imposes a very restrictive layout for all variable definitions
which will be basically ELF style (on ELF targets at least).  To date,
there is no way for a backend to output a variable definition in a
different way.
   This hook is required by the avr backend when it outputs definitions
for variables defined with the "io", "io_low" or "address" attribute that
don't follow ELF style.  These attributes are basically symbol definitions
of the form

   .global var_io
   var_io = 32

with some additional assertions.

gcc/
* target.def (TARGET_ASM_OUT) : Add new DEFHOOK.
* targhooks.cc (default_asm_out_variable): New function.
* targhooks.h (default_asm_out_variable): New prototype.
* doc/tm.texi.in (TARGET_ASM_VARIABLE): Place hook documentation.
* doc/tm.texi: Rebuild.
* varasm.cc (assemble_variable): Call targetm.asm_out.variable
in order to allow the backend to output a variable definition
in its own style.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index d7170f45206..12c09c88e6c 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -8522,6 +8522,24 @@ The default implementation of this hook will use the
 when the relevant string is @code{NULL}.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_ASM_VARIABLE (FILE *@var{stream}, tree @var{decl}, const char *@var{name})
+This hook outputs the assembly code for a @var{decl} that satisfies
+@code{VAR_P} for an object with assembler name @var{name}.
+@code{DECL_RTL (@var{decl})} is of the form @code{(mem (symbol_ref))}.
+Returns @code{true} when the output has been performed and
+@code{false}, otherwise.
+
+It is unlikely that you'll ever need to implement this hook.
+The middle-end knows how to output object definitions in terms of hook
+macros and functions like @code{GLOBAL_ASM_OP},
+@code{TARGET_ASM_INTEGER}, etc.
+
+When the output is performed by means of this hook, then the complete object
+definition has to be emitted, including directives like @code{.global},
+@code{.size}, @code{.type}, @code{.section}, the object label and the
+object content.
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_ASM_DECL_END (void)
 Define this hook if the target assembler requires a special marker to
 terminate an initialized variable declaration.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index f6657f9df1d..2bfeb6ca310 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -5256,6 +5256,8 @@ It must not be modified by command-line option processing.
 
 @hook TARGET_ASM_INTEGER
 
+@hook TARGET_ASM_VARIABLE
+
 @hook TARGET_ASM_DECL_END
 
 @hook TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA
diff --git a/gcc/target.def b/gcc/target.def
index 8cf29c57eae..e90b4bad134 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -93,6 +93,26 @@ when the relevant string is @code{NULL}.",
  bool, (rtx x, unsigned int size, int aligned_p),
  default_assemble_integer)
 
+DEFHOOK
+(variable,
+ "This hook outputs the assembly code for a @var{decl} that satisfies\n\
+@code{VAR_P} for an object with assembler name @var{name}.\n\
+@code{DECL_RTL (@var{decl})} is of the form @code{(mem (symbol_ref))}.\n\
+Returns @code{true} when the output has been performed and\n\
+@code{false}, otherwise.\n\
+\n\
+It is unlikely that you'll ever need to implement this hook.\n\
+The middle-end knows how to output object definitions in terms of hook\n\
+macros and functions like @code{GLOBAL_ASM_OP},\n\
+@code{TARGET_ASM_INTEGER}, etc.\n\
+\n\
+When the output is performed by means of this hook, then the complete object\n\
+definition has to be emitted, including directives like @code{.global},\n\
+@code{.size}, @code{.type}, @code{.section}, the object label and the\n\
+object content.",
+ bool, (FILE *stream, tree decl, const char *name),
+ default_asm_out_variable)
+
 /* Assembly strings required after the .cfi_startproc label.  */
 DEFHOOK
 (post_cfi_startproc,
diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index 8ea8d778003..c0833b4573d 100644
--- a/gcc/targhooks.cc
+++ b/gcc/t

Re: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2024-12-19 Thread Richard Sandiford

Jennifer Schmitz  writes:
> @@ -8834,22 +8834,7 @@ vectorizable_store (vec_info *vinfo,
>   {
> if (costing_p)
>   {
> -   /* Only need vector extracting when there are more
> -  than one stores.  */
> -   if (nstores > 1)
> - inside_cost
> -   += record_stmt_cost (cost_vec, 1, vec_to_scalar,
> -stmt_info, slp_node,
> -0, vect_body);
> -   /* Take a single lane vector type store as scalar
> -  store to avoid ICE like 110776.  */
> -   if (VECTOR_TYPE_P (ltype)
> -   && known_ne (TYPE_VECTOR_SUBPARTS (ltype), 1U))
> - n_adjacent_stores++;
> -   else
> - inside_cost
> -   += record_stmt_cost (cost_vec, 1, scalar_store,
> -stmt_info, 0, vect_body);
> +   n_adjacent_stores++;
> continue;
>   }
> tree newref, newoff;
> @@ -8905,9 +8890,26 @@ vectorizable_store (vec_info *vinfo,
>if (costing_p)
>   {
> if (n_adjacent_stores > 0)
> - vect_get_store_cost (vinfo, stmt_info, slp_node, n_adjacent_stores,
> -  alignment_support_scheme, misalignment,
> -  &inside_cost, cost_vec);
> + {
> +   /* Take a single lane vector type store as scalar
> +  store to avoid ICE like 110776.  */
> +   if (VECTOR_TYPE_P (ltype)
> +   && known_ne (TYPE_VECTOR_SUBPARTS (ltype), 1U))

Sorry to ask, since it's pre-existing, but could you change this to
maybe_ne while you're there?  nunits==1+1X should be treated as a vector
rather than a scalar.

Thanks,
Richard

> + vect_get_store_cost (vinfo, stmt_info, slp_node,
> +  n_adjacent_stores, 
> alignment_support_scheme,
> +  misalignment, &inside_cost, cost_vec);
> +   else
> + inside_cost
> +   += record_stmt_cost (cost_vec, n_adjacent_stores,
> +scalar_store, stmt_info, 0, vect_body);
> +   /* Only need vector extracting when there are more
> +  than one stores.  */
> +   if (nstores > 1)
> + inside_cost
> +   += record_stmt_cost (cost_vec, n_adjacent_stores,
> +vec_to_scalar, stmt_info, slp_node,
> +0, vect_body);
> + }
> if (dump_enabled_p ())
>   dump_printf_loc (MSG_NOTE, vect_location,
>"vect_model_store_cost: inside_cost = %d, "

Re: [PATCH v3] testsuite: arm: Use effective-target for memset-inline* tests

2024-12-19 Thread Richard Earnshaw (lists)

On 18/12/2024 18:45, Torbjörn SVENSSON wrote:
> Changes since v1:
> 
> - Split tests into two parts. One part for doing asm checks. Another part
>   for doing run test as these require hardware to be available.
> - Changed existing tests to be "compile" instead of "run".
> 
> Changes since v2:
> 
> - Applied the same fix to memset-inline-8.c and memset-inline-9.c since
>   they also fail for the same reason.
> 
> Ok for trunk and releases/gcc-14?
> 
> --
> 
> Split tests into 2 parts:
> - The first part checks the assembler generated.
> - The second part does the run test and this part now requires
>   effective-target arm_neon_hw.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/memset-inline-4.c: Only check assembler output.
>   * gcc.target/arm/memset-inline-5.c: Likewise.
>   * gcc.target/arm/memset-inline-6.c: Likewise.
>   * gcc.target/arm/memset-inline-8.c: Likewise.
>   * gcc.target/arm/memset-inline-9.c: Likewise.
>   * gcc.target/arm/memset-inline-4-exe.c: New test.
>   * gcc.target/arm/memset-inline-5-exe.c: Likewise.
>   * gcc.target/arm/memset-inline-6-exe.c: Likewise.
>   * gcc.target/arm/memset-inline-8-exe.c: Likewise.
>   * gcc.target/arm/memset-inline-9-exe.c: Likewise.
> 
> Signed-off-by: Torbjörn SVENSSON 

OK.

R.

> ---
>  gcc/testsuite/gcc.target/arm/memset-inline-4-exe.c | 7 +++
>  gcc/testsuite/gcc.target/arm/memset-inline-4.c | 2 +-
>  gcc/testsuite/gcc.target/arm/memset-inline-5-exe.c | 7 +++
>  gcc/testsuite/gcc.target/arm/memset-inline-5.c | 2 +-
>  gcc/testsuite/gcc.target/arm/memset-inline-6-exe.c | 7 +++
>  gcc/testsuite/gcc.target/arm/memset-inline-6.c | 2 +-
>  gcc/testsuite/gcc.target/arm/memset-inline-8-exe.c | 7 +++
>  gcc/testsuite/gcc.target/arm/memset-inline-8.c | 2 +-
>  gcc/testsuite/gcc.target/arm/memset-inline-9-exe.c | 7 +++
>  gcc/testsuite/gcc.target/arm/memset-inline-9.c | 2 +-
>  10 files changed, 40 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/memset-inline-4-exe.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/memset-inline-5-exe.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/memset-inline-6-exe.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/memset-inline-8-exe.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/memset-inline-9-exe.c
> 
> diff --git a/gcc/testsuite/gcc.target/arm/memset-inline-4-exe.c b/gcc/testsuite/gcc.target/arm/memset-inline-4-exe.c
> new file mode 100644
> index 000..fef6c4365e2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/memset-inline-4-exe.c
> @@ -0,0 +1,7 @@
> +/* { dg-do run } */
> +/* { dg-skip-if "Don't inline memset using neon instructions" { ! arm_tune_string_ops_prefer_neon } } */
> +/* { dg-require-effective-target arm_neon_hw } */
> +/* { dg-options "-save-temps -O2 -fno-inline" } */
> +/* { dg-add-options "arm_neon" } */
> +
> +#include "./memset-inline-4.c"
> diff --git a/gcc/testsuite/gcc.target/arm/memset-inline-4.c b/gcc/testsuite/gcc.target/arm/memset-inline-4.c
> index 5d7223ef2c0..6eb2a9d18a3 100644
> --- a/gcc/testsuite/gcc.target/arm/memset-inline-4.c
> +++ b/gcc/testsuite/gcc.target/arm/memset-inline-4.c
> @@ -1,4 +1,4 @@
> -/* { dg-do run } */
> +/* { dg-do compile } */
>  /* { dg-skip-if "Don't inline memset using neon instructions" { ! arm_tune_string_ops_prefer_neon } } */
>  /* { dg-options "-save-temps -O2 -fno-inline" } */
>  /* { dg-add-options "arm_neon" } */
> diff --git a/gcc/testsuite/gcc.target/arm/memset-inline-5-exe.c b/gcc/testsuite/gcc.target/arm/memset-inline-5-exe.c
> new file mode 100644
> index 000..a52a527ea13
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/memset-inline-5-exe.c
> @@ -0,0 +1,7 @@
> +/* { dg-do run } */
> +/* { dg-skip-if "Don't inline memset using neon instructions" { ! arm_tune_string_ops_prefer_neon } } */
> +/* { dg-require-effective-target arm_neon_hw } */
> +/* { dg-options "-save-temps -O2 -fno-inline" } */
> +/* { dg-add-options "arm_neon" } */
> +
> +#include "./memset-inline-5.c"
> diff --git a/gcc/testsuite/gcc.target/arm/memset-inline-5.c b/gcc/testsuite/gcc.target/arm/memset-inline-5.c
> index 6e7ae65eef4..0f55c7b8c88 100644
> --- a/gcc/testsuite/gcc.target/arm/memset-inline-5.c
> +++ b/gcc/testsuite/gcc.target/arm/memset-inline-5.c
> @@ -1,4 +1,4 @@
> -/* { dg-do run } */
> +/* { dg-do compile } */
>  /* { dg-skip-if "Don't inline memset using neon instructions" { ! arm_tune_string_ops_prefer_neon } } */
>  /* { dg-options "-save-temps -O2 -fno-inline" } */
>  /* { dg-add-options "arm_neon" } */
> diff --git a/gcc/testsuite/gcc.target/arm/memset-inline-6-exe.c b/gcc/testsuite/gcc.target/arm/memset-inline-6-exe.c
> new file mode 100644
> index 000..8e58d681023
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/memset-inline-6-exe.c
> @@ -0,0 +1,7 @@
> +/* { dg-do run } */
> +/* { dg-skip-if "Don't inline memset using neon instructions" { ! 
> arm_tune_string_op

Re: [PATCH] testsuite: arm: C++26 uses __equal() instead of operator==()

2024-12-19 Thread Richard Earnshaw (lists)

On 18/12/2024 19:57, Torbjörn SVENSSON wrote:
> Ok for trunk?
> 
> --
> 
> Update test case to align with used function in C++26.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/abi/arm_rtti1.C: Check for expected symbol in C++26.
> 
> Signed-off-by: Torbjörn SVENSSON 

OK.

R.
> ---
>  gcc/testsuite/g++.dg/abi/arm_rtti1.C | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/g++.dg/abi/arm_rtti1.C b/gcc/testsuite/g++.dg/abi/arm_rtti1.C
> index 74f00033d9a..5ebae26e670 100644
> --- a/gcc/testsuite/g++.dg/abi/arm_rtti1.C
> +++ b/gcc/testsuite/g++.dg/abi/arm_rtti1.C
> @@ -2,7 +2,8 @@
>  // { dg-options "-O2" } 
>  // Check that, even when optimizing, we emit an out-of-line call to
>  // the type-info comparison function.
> -// { dg-final { scan-assembler _ZNKSt9type_infoeqERKS_ } }
> +// { dg-final { scan-assembler _ZNKSt9type_infoeqERKS_ { target { ! c++26 } } } }
> +// { dg-final { scan-assembler _ZNKSt9type_info7__equalERKS_ { target { c++26 } } } }
>  
>  #include 
>

Re: [PATCH v2] libstdc++: add initializer_list constructor to std::span (P2447)

2024-12-19 Thread Giuseppe D'Angelo


Hello,

On 19/12/2024 13:27, Jonathan Wakely wrote:

I was about to push this and realised it's missing a Signed-off-by
tag. I assume you meant to contribute this under the DCO terms, as
with your previous patches?


Yes, of course; sorry for forgetting the line. Here's a signed patch.


--
Giuseppe D'Angelo

From a8be47634a6cb9d26a66ccc43dd22dc928850aa8 Mon Sep 17 00:00:00 2001
From: Giuseppe D'Angelo 
Date: Tue, 3 Dec 2024 16:56:45 +0100
Subject: [PATCH] libstdc++: add initializer_list constructor to std::span
 (P2447R6)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This commit implements P2447R6. The code is straightforward (just one
extra constructor, with constraints and conditional explicit).

I decided to suppress -Winit-list-lifetime because otherwise it would
give too many false positives. The new constructor is meant to be used
as a parameter-passing interface (this is a design choice, see
P2447R6/§2) and, as such, the initializer_list won't dangle despite
GCC's warnings.

The new constructor isn't 100% backwards compatible. A couple of
examples are included in Annex C, but I have also lifted some more
from R4. A new test checks for the old and the new behaviors.
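
To illustrate the parameter-passing use case mentioned above without
requiring a C++26 library, here is a minimal sketch built around a toy
class.  mini_span and sum are hypothetical names invented for this example;
this is not the libstdc++ implementation, and it uses a static_assert where
the real constructor uses a requires-clause on is_const_v.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <type_traits>

// Toy view type, loosely modelled on the constructor added by the patch.
template <typename T>
class mini_span {
  static_assert(std::is_const_v<T>,
                "only views over const elements may bind to an initializer_list");

public:
  using value_type = std::remove_cv_t<T>;

  // Non-explicit so that a braced list can be passed directly as an argument.
  constexpr mini_span(std::initializer_list<value_type> il) noexcept
      : ptr_(il.begin()), size_(il.size()) {}

  constexpr T *begin() const noexcept { return ptr_; }
  constexpr T *end() const noexcept { return ptr_ + size_; }
  constexpr std::size_t size() const noexcept { return size_; }

private:
  T *ptr_;
  std::size_t size_;
};

// The view is only used while the call is in progress, so the array backing
// the initializer_list is still alive for the whole function body.
inline int sum(mini_span<const int> values) {
  int total = 0;
  for (int v : values)
    total += v;
  return total;
}
```

A call such as sum({1, 2, 3}) is the intended pattern.  Storing the view in
a variable that outlives the full expression would dangle, which is the case
-Winit-list-lifetime is meant to flag.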

libstdc++-v3/ChangeLog:

	* include/bits/version.def: Added the new feature-testing macro.
	* include/bits/version.h (defined): Regenerated.
	* include/std/span: Added constructor from initializer_list.
	* testsuite/23_containers/span/init_list_cons.cc: New test.
	* testsuite/23_containers/span/init_list_cons_neg.cc: New test.

Signed-off-by: Giuseppe D'Angelo 
---
 libstdc++-v3/include/bits/version.def |  8 +++
 libstdc++-v3/include/bits/version.h   | 10 +++
 libstdc++-v3/include/std/span | 17 +
 .../23_containers/span/init_list_cons.cc  | 65 +++
 .../23_containers/span/init_list_cons_neg.cc  | 31 +
 5 files changed, 131 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/23_containers/span/init_list_cons.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/span/init_list_cons_neg.cc

diff --git a/libstdc++-v3/include/bits/version.def b/libstdc++-v3/include/bits/version.def
index 8d4b8e9b383..cfa0469fb2d 100644
--- a/libstdc++-v3/include/bits/version.def
+++ b/libstdc++-v3/include/bits/version.def
@@ -1853,6 +1853,14 @@ ftms = {
   };
 };
 
+ftms = {
+  name = span_initializer_list;
+  values = {
+v = 202311;
+cxxmin = 26;
+  };
+};
+
 ftms = {
   name = text_encoding;
   values = {
diff --git a/libstdc++-v3/include/bits/version.h b/libstdc++-v3/include/bits/version.h
index c556aca38fa..6a2c66bdf81 100644
--- a/libstdc++-v3/include/bits/version.h
+++ b/libstdc++-v3/include/bits/version.h
@@ -2055,6 +2055,16 @@
 #endif /* !defined(__cpp_lib_saturation_arithmetic) && defined(__glibcxx_want_saturation_arithmetic) */
 #undef __glibcxx_want_saturation_arithmetic
 
+#if !defined(__cpp_lib_span_initializer_list)
+# if (__cplusplus >  202302L)
+#  define __glibcxx_span_initializer_list 202311L
+#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_span_initializer_list)
+#   define __cpp_lib_span_initializer_list 202311L
+#  endif
+# endif
+#endif /* !defined(__cpp_lib_span_initializer_list) && defined(__glibcxx_want_span_initializer_list) */
+#undef __glibcxx_want_span_initializer_list
+
 #if !defined(__cpp_lib_text_encoding)
 # if (__cplusplus >  202302L) && _GLIBCXX_HOSTED && (_GLIBCXX_USE_NL_LANGINFO_L)
 #  define __glibcxx_text_encoding 202306L
diff --git a/libstdc++-v3/include/std/span b/libstdc++-v3/include/std/span
index e8043c02c9a..c8aa5c02635 100644
--- a/libstdc++-v3/include/std/span
+++ b/libstdc++-v3/include/std/span
@@ -39,6 +39,7 @@
 #endif
 
 #define __glibcxx_want_span
+#define __glibcxx_want_span_initializer_list
 #include 
 
 #ifdef __cpp_lib_span // C++ >= 20 && concepts
@@ -46,6 +47,9 @@
 #include 
 #include 
 #include 
+#ifdef __cpp_lib_span_initializer_list
+# include 
+#endif
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -228,6 +232,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	: _M_ptr(ranges::data(__range)), _M_extent(ranges::size(__range))
 	{ }
 
+#if __cpp_lib_span_initializer_list >= 202311L // >= C++26
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-list-lifetime"
+	constexpr
+	explicit(extent != dynamic_extent)
+	span(initializer_list __il)
+	requires (is_const_v<_Type>)
+	: _M_ptr(__il.begin()), _M_extent(__il.size())
+	{
+	}
+#pragma GCC diagnostic pop
+#endif
+
   constexpr
   span(const span&) noexcept = default;
 
diff --git a/libstdc++-v3/testsuite/23_containers/span/init_list_cons.cc b/libstdc++-v3/testsuite/23_containers/span/init_list_cons.cc
new file mode 100644
index 000..1dc30ab1a50
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/span/init_list_cons.cc
@@ -0,0 +1,65 @@
+// { dg-do compile { target c++26 } }
+
+#include 
+#include 
+
+#if !defined(__cpp_lib_span_initializer_list)
+# error

Re: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2024-12-19 Thread Jennifer Schmitz

> On 19 Dec 2024, at 11:14, Richard Sandiford  wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
> Jennifer Schmitz  writes:
>> @@ -8834,22 +8834,7 @@ vectorizable_store (vec_info *vinfo,
>>  {
>>if (costing_p)
>>  {
>> -   /* Only need vector extracting when there are more
>> -  than one stores.  */
>> -   if (nstores > 1)
>> - inside_cost
>> -   += record_stmt_cost (cost_vec, 1, vec_to_scalar,
>> -stmt_info, slp_node,
>> -0, vect_body);
>> -   /* Take a single lane vector type store as scalar
>> -  store to avoid ICE like 110776.  */
>> -   if (VECTOR_TYPE_P (ltype)
>> -   && known_ne (TYPE_VECTOR_SUBPARTS (ltype), 1U))
>> - n_adjacent_stores++;
>> -   else
>> - inside_cost
>> -   += record_stmt_cost (cost_vec, 1, scalar_store,
>> -stmt_info, 0, vect_body);
>> +   n_adjacent_stores++;
>>continue;
>>  }
>>tree newref, newoff;
>> @@ -8905,9 +8890,26 @@ vectorizable_store (vec_info *vinfo,
>>   if (costing_p)
>>  {
>>if (n_adjacent_stores > 0)
>> - vect_get_store_cost (vinfo, stmt_info, slp_node, n_adjacent_stores,
>> -  alignment_support_scheme, misalignment,
>> -  &inside_cost, cost_vec);
>> + {
>> +   /* Take a single lane vector type store as scalar
>> +  store to avoid ICE like 110776.  */
>> +   if (VECTOR_TYPE_P (ltype)
>> +   && known_ne (TYPE_VECTOR_SUBPARTS (ltype), 1U))
> 
> Sorry to ask, since it's pre-existing, but could you change this to
> maybe_ne while you're there?  nunits==1+1X should be treated as a vector
> rather than a scalar.
Sure, I made the change (see patch below) and re-validated on aarch64.

It would also be good to check for performance regressions, now that we have a 
patch to test:
I will run SPEC2017 with -mcpu=generic and -mcpu=native on Grace, but we would 
appreciate help with benchmarking on other platforms.
Tamar, would you still be willing to test the patch on other platforms?

If there are no other changes necessary and assuming there are no performance 
regressions, I was planning to commit the patch in January after returning from 
Christmas break.

In the meantime I wish everyone happy holidays.
Jennifer

This patch removes the AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS tunable and
use_new_vector_costs entry in aarch64-tuning-flags.def and makes the
AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS paths in the backend the
default. To that end, the function aarch64_use_new_vector_costs_p and its uses
were removed. To prevent costing vec_to_scalar operations with 0, as
described in
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665481.html,
we adjusted vectorizable_store such that the variable n_adjacent_stores
also covers vec_to_scalar operations. This way vec_to_scalar operations
are not costed individually, but as a group.
As suggested by Richard Sandiford, the "known_ne" in the multilane-check
was replaced by "maybe_ne" in order to treat nunits==1+1X as a vector
rather than a scalar.
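
The difference between the two predicates can be sketched with a simplified
two-coefficient model of a runtime-sized value c0 + c1*X, where X is an
unknown non-negative integer.  Poly and the *_sketch functions below are
hypothetical names for illustration only; GCC's real poly_int templates are
more general.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of a value c0 + c1 * X, X >= 0 unknown at compile time.
struct Poly {
  std::int64_t c0;
  std::int64_t c1;
};

// True if the value can equal n for at least one X >= 0.
constexpr bool maybe_eq_sketch(Poly p, std::int64_t n) {
  if (p.c1 == 0)
    return p.c0 == n;
  // n must lie on the side of c0 that X can reach, on a multiple of c1.
  const bool reachable = p.c1 < 0 ? n <= p.c0 : n >= p.c0;
  return reachable && (n - p.c0) % p.c1 == 0;
}

// True if the value differs from n for at least one X >= 0.
constexpr bool maybe_ne_sketch(Poly p, std::int64_t n) {
  return p.c1 != 0 || p.c0 != n;
}

// True if the value differs from n for every X >= 0.
constexpr bool known_ne_sketch(Poly p, std::int64_t n) {
  return !maybe_eq_sketch(p, n);
}

// nunits == 1 + 1X equals 1 when X == 0, so it is not *known* to differ
// from 1, but it *may* differ from 1.
static_assert(!known_ne_sketch(Poly{1, 1}, 1), "1+1X can be 1");
static_assert(maybe_ne_sketch(Poly{1, 1}, 1), "1+1X can differ from 1");
```

Under this model the old known_ne test sends 1+1X down the scalar path, while
maybe_ne treats it as a vector, which is the behaviour requested in review.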

Two tests were adjusted due to changes in codegen. In both cases, the
old code performed loop unrolling once, but the new code does not:
Example from gcc.target/aarch64/sve/strided_load_2.c (compiled with
-O2 -ftree-vectorize -march=armv8.2-a+sve -mtune=generic -moverride=tune=none):
 f_int64_t_32:
        cbz     w3, .L92
        mov     x4, 0
        uxtw    x3, w3
+       cntd    x5
+       whilelo p7.d, xzr, x3
+       mov     z29.s, w5
        mov     z31.s, w2
-       whilelo p6.d, xzr, x3
-       mov     x2, x3
-       index   z30.s, #0, #1
-       uqdecd  x2
-       ptrue   p5.b, all
-       whilelo p7.d, xzr, x2
+       index   z30.d, #0, #1
+       ptrue   p6.b, all
        .p2align 3,,7
 .L94:
-       ld1d    z27.d, p7/z, [x0, #1, mul vl]
-       ld1d    z28.d, p6/z, [x0]
-       movprfx z29, z31
-       mul     z29.s, p5/m, z29.s, z30.s
-       incw    x4
-       uunpklo z0.d, z29.s
-       uunpkhi z29.d, z29.s
-       ld1d    z25.d, p6/z, [x1, z0.d, lsl 3]
-       ld1d    z26.d, p7/z, [x1, z29.d, lsl 3]
-       add     z25.d, z28.d, z25.d
+       ld1d    z27.d, p7/z, [x0, x4, lsl 3]
+       movprfx z28, z31
+       mul     z28.s, p6/m, z28.s, z30.s
+       ld1d    z26.d, p7/z, [x1, z28.d, uxtw 3]
        add     z26.d, z27.d, z26.d
-       st1d    z26.d, p7, [x0, #1, mul vl]
-       whilelo p7.d, x4, x2
-       st1d    z25.d, p6, [x0]
-       incw    z30.s
-       incb    x0, all, mul #2
-       whilelo p6.d, x4, x3
+       st1d    z26.d, p7, [x0,

[PATCH] testsuite/118127: Pass fortran tests on ppc64le for IEEE128 long doubles

2024-12-19 Thread Siddhesh Poyarekar

Denormal behaviour is well defined for IEEE128 long doubles, so don't
XFAIL some gfortran tests on ppc64le when configured with the IEEE128
long double ABI.
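
The selector change boils down to a small boolean condition, sketched here
for clarity.  The function and parameter names are hypothetical; they only
model the DejaGnu target selectors touched by the patch and are not part of
the testsuite.

```cpp
#include <cassert>

// Before: the tests were expected to fail on every matching PowerPC target.
constexpr bool xfail_before(bool ppc_darwin, bool ppc_linux) {
  return ppc_darwin || ppc_linux;
}

// After: the expected failure is dropped when the ppc_ieee128_ok
// effective-target check succeeds.
constexpr bool xfail_after(bool ppc_darwin, bool ppc_linux, bool ppc_ieee128_ok) {
  return (ppc_darwin || ppc_linux) && !ppc_ieee128_ok;
}

static_assert(xfail_before(false, true), "ppc64le linux used to be XFAILed");
static_assert(!xfail_after(false, true, true), "no XFAIL with IEEE128 long double");
static_assert(xfail_after(false, true, false), "still XFAILed otherwise");
```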

gcc/testsuite/ChangeLog:

PR testsuite/118127
* gfortran.dg/default_format_2.f90: Don't xfail for
ppc_ieee128_ok.
* gfortran.dg/default_format_denormal_2.f90: Likewise.
* gfortran.dg/large_real_kind_form_io_2.f90: Likewise.

Signed-off-by: Siddhesh Poyarekar 
---
 gcc/testsuite/gfortran.dg/default_format_2.f90  | 2 +-
 gcc/testsuite/gfortran.dg/default_format_denormal_2.f90 | 2 +-
 gcc/testsuite/gfortran.dg/large_real_kind_form_io_2.f90 | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/default_format_2.f90 b/gcc/testsuite/gfortran.dg/default_format_2.f90
index 5ad7b3a6429..e76dce33b35 100644
--- a/gcc/testsuite/gfortran.dg/default_format_2.f90
+++ b/gcc/testsuite/gfortran.dg/default_format_2.f90
@@ -1,4 +1,4 @@
-! { dg-do run { xfail powerpc*-apple-darwin* powerpc*-*-linux* } }
+! { dg-do run { xfail { { powerpc*-apple-darwin* || powerpc*-*-linux* } && { ! ppc_ieee128_ok } } } }
 ! { dg-require-effective-target fortran_large_real }
 ! Test XFAILed on these platforms because the system's printf() lacks
 ! proper support for denormalized long doubles. See PR24685
diff --git a/gcc/testsuite/gfortran.dg/default_format_denormal_2.f90 b/gcc/testsuite/gfortran.dg/default_format_denormal_2.f90
index e9ccf5e8f61..1872396764d 100644
--- a/gcc/testsuite/gfortran.dg/default_format_denormal_2.f90
+++ b/gcc/testsuite/gfortran.dg/default_format_denormal_2.f90
@@ -1,4 +1,4 @@
-! { dg-do run { xfail powerpc*-*-* } }
+! { dg-do run { xfail { powerpc*-*-* && { ! ppc_ieee128_ok } } } }
 ! { dg-require-effective-target fortran_large_real }
 ! Test XFAILed on this platform because the system's printf() lacks
 ! proper support for denormalized long doubles. See PR24685
diff --git a/gcc/testsuite/gfortran.dg/large_real_kind_form_io_2.f90 b/gcc/testsuite/gfortran.dg/large_real_kind_form_io_2.f90
index 34b8aec462c..3a4a9e1b078 100644
--- a/gcc/testsuite/gfortran.dg/large_real_kind_form_io_2.f90
+++ b/gcc/testsuite/gfortran.dg/large_real_kind_form_io_2.f90
@@ -1,4 +1,4 @@
-! { dg-do run { xfail powerpc*-apple-darwin* powerpc*-*-linux* } }
+! { dg-do run { xfail { { powerpc*-apple-darwin* || powerpc*-*-linux* } && { ! ppc_ieee128_ok } } } }
 ! Test XFAILed on these platforms because the system's printf() lacks
 ! proper support for denormalized long doubles. See PR24685
 ! { dg-require-effective-target fortran_large_real }
-- 
2.47.1

[committed]

2024-12-19 Thread Tobias Burnus


Fallout from looking at the additional definition document:
I noticed that "nec" was missing from GCC's vendor trait
selector list.

That's only a minor issue, as the only effect is that a warning is
printed for unknown vendors:
  "unknown property ‘nec’ of ‘vendor’ selector"
Still, it makes sense to support all vendors and not only
a subset. The current list (and the 5.0 and 5.1 lists) are
available under https://www.openmp.org/specifications/
All of them include NEC ("nec") but not the original
internal list (see some history digging in the attached
commit).

Committed as r15-6373-ga104766914e98d

Tobias
commit a104766914e98ded9b991f1dac9ad22e815a3acc
Author: Tobias Burnus 
Date:   Thu Dec 19 17:27:41 2024 +0100

OpenMP: Add 'nec' as to the 'vendor' context-selector list

For unknown vendors used in a context selector such as
   match(implementation={vendor(...)})
GCC prints a warning like:
   warning: unknown property 'nec' of 'vendor' selector

While all known vendors (including the vendor 'unknown') are silently
accepted, only "gnu" counts as matched by GCC.

The list of known vendors is published in OpenMP's additional
definition document (or, previously, the context definitions document).
While the initial list did not contain 'nec', it was added quite early
but GCC missed this addition, which this commit rectifies.

Some history:
* GCC added the list in r10-3744-g94e7f906ca5c73 (Oct 2019)
* At spec level, 'pgi' was replaced by 'nvidia' in Nov 2019, but
  GCC (since r10-4639-gd0ec7c935f0c96, Nov 2019) and LLVM recognize
  both vendor names.
* 'nec' was then added in Dec 2019 and is present in
  "Context Definitions for the OpenMP API Specification Version 5.0
  – Version 1.0", but only this commit adds it.
* 'hpe' (as alias for 'cray') was added to the spec in Nov 2020 but
  to GCC only in r14-6720-gd0603dfe9d3bc7 (Dec 2023).

gcc/
* omp-general.cc (vendor_properties): Add "nec".
---
 gcc/omp-general.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc
index 9837ca021f0..78c38a6d715 100644
--- a/gcc/omp-general.cc
+++ b/gcc/omp-general.cc
@@ -1163,7 +1163,7 @@ static const char *const kind_properties[] =
   { "host", "nohost", "cpu", "gpu", "fpga", "any", NULL };
 static const char *const vendor_properties[] =
   { "amd", "arm", "bsc", "cray", "fujitsu", "gnu", "hpe", "ibm", "intel",
-"llvm", "nvidia", "pgi", "ti", "unknown", NULL };
+"llvm", "nec", "nvidia", "pgi", "ti", "unknown", NULL };
 static const char *const extension_properties[] =
   { NULL };
 static const char *const atomic_default_mem_order_properties[] =
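
As a rough illustration of the behaviour described in the commit message, the sketch below checks a name against a NULL-terminated property list. The table copies the post-patch `vendor_properties` entries; the helper `is_known_vendor` is hypothetical and is not the function GCC uses, it only shows why a name missing from the list ends up on the "unknown property" warning path.

```cpp
#include <cstring>

// Copy of the vendor list after this commit (illustrative stand-in,
// not GCC's actual translation unit).
static const char *const vendor_properties[] =
  { "amd", "arm", "bsc", "cray", "fujitsu", "gnu", "hpe", "ibm", "intel",
    "llvm", "nec", "nvidia", "pgi", "ti", "unknown", nullptr };

// Hypothetical helper: return true if NAME appears in the list.  A name
// that is not found corresponds to the case where GCC prints the
// "unknown property" warning.
bool
is_known_vendor (const char *name)
{
  for (const char *const *p = vendor_properties; *p; ++p)
    if (std::strcmp (*p, name) == 0)
      return true;
  return false;
}
```

Being in the list only silences the warning; as the commit message notes, only "gnu" counts as matched by GCC.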

Re: [PATCH 1/2] libstdc++: Implement C++23 <flat_map> (P0429R9)

2024-12-19 Thread Patrick Palka

On Fri, 25 Oct 2024, Patrick Palka wrote:

> On Wed, 16 Oct 2024, Patrick Palka wrote:
> 
> > On Mon, 30 Sep 2024, Patrick Palka wrote:
> > 
> > > This implements the C++23 container adaptors std::flat_map and
> > > std::flat_multimap from P0429R9.  The implementation is shared
> > > as much as possible between the two adaptors via a common base
> > > class that's parameterized according to key uniqueness.
> > > 
> > > The main known issues are:
> > > 
> > >   * the range insert() overload exceeds its complexity requirements
> > > since an idiomatic efficient implementation needs a non-buggy
> > > ranges::inplace_merge
> > >   * exception safety is likely incomplete/buggy
> > >   * unimplemented from_range_t constructors and insert_range function
> > >   * the main workhorse function _M_try_emplace is probably
> > > buggy wrt its handling of the hint parameter and could be simplified
> > >   * more extensive testcases are a WIP
> > > 
> > > The iterator type is encoded as a {pointer, index} pair instead of an
> > > {iterator, iterator} pair.  I'm not sure which encoding is preferable?
> > > It seems the latter would allow for better debuggability when the
> > > underlying iterators are debug iterators.
> > 
> > Here's v2 which adds somewhat more tests and uses the std:: algos
> > instead of ranges:: algos where possible, along with some other
> > very minor cleanups.
> 
> Now also available in PR form at:
> https://forge.sourceware.org/gcc/gcc-TEST/pulls/9

Series pushed to trunk as r15-{6370,6371,6372}.
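
For readers unfamiliar with the data structure, here is a minimal sketch of the "flat" idea: keys and values live in two parallel sorted vectors and lookups use binary search. It is not libstdc++'s implementation; the class name `tiny_flat_map` and its members are hypothetical, and it returns an index where the real adaptor would return an iterator.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative only: a tiny "flat" map with unique keys.
template<typename Key, typename Value>
class tiny_flat_map
{
  std::vector<Key> keys_;      // kept sorted and unique
  std::vector<Value> values_;  // values_[i] belongs to keys_[i]

public:
  // Insert KEY/VALUE if KEY is absent.  Returns {index, inserted}.
  std::pair<std::size_t, bool>
  try_emplace (const Key &key, const Value &value)
  {
    auto it = std::lower_bound (keys_.begin (), keys_.end (), key);
    auto idx = it - keys_.begin ();
    if (it != keys_.end () && !(key < *it))
      return { static_cast<std::size_t> (idx), false };
    keys_.insert (it, key);
    values_.insert (values_.begin () + idx, value);
    return { static_cast<std::size_t> (idx), true };
  }

  // Return a pointer to the mapped value, or nullptr if KEY is absent.
  const Value *
  find (const Key &key) const
  {
    auto it = std::lower_bound (keys_.begin (), keys_.end (), key);
    if (it == keys_.end () || key < *it)
      return nullptr;
    return &values_[it - keys_.begin ()];
  }

  std::size_t size () const { return keys_.size (); }
};
```

Because a position is just an index shared by both vectors, a {pointer-to-container, index} pair is one natural way to encode an iterator over such a structure.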

> 
> > 
> > -- >8 --
> > 
> > libstdc++-v3/ChangeLog:
> > 
> > * include/Makefile.am: Add new header <flat_map>.
> > * include/Makefile.in: Regenerate.
> > * include/bits/stl_function.h (__transparent_comparator): Define.
> > * include/bits/utility.h (sorted_unique_t): Define for C++23.
> > (sorted_unique): Likewise.
> > (sorted_equivalent_t): Likewise.
> > (sorted_equivalent): Likewise.
> > * include/bits/version.def (flat_map): Define.
> > * include/bits/version.h: Regenerate.
> > * include/std/flat_map: New file.
> > * testsuite/23_containers/flat_map/1.cc: New test.
> > * testsuite/23_containers/flat_multimap/1.cc: New test.
> > ---
> >  libstdc++-v3/include/Makefile.am  |1 +
> >  libstdc++-v3/include/Makefile.in  |1 +
> >  libstdc++-v3/include/bits/stl_function.h  |6 +
> >  libstdc++-v3/include/bits/utility.h   |8 +
> >  libstdc++-v3/include/bits/version.def |8 +
> >  libstdc++-v3/include/bits/version.h   |   10 +
> >  libstdc++-v3/include/std/flat_map | 1475 +
> >  .../testsuite/23_containers/flat_map/1.cc |  123 ++
> >  .../23_containers/flat_multimap/1.cc  |  106 ++
> >  9 files changed, 1738 insertions(+)
> >  create mode 100644 libstdc++-v3/include/std/flat_map
> >  create mode 100644 libstdc++-v3/testsuite/23_containers/flat_map/1.cc
> >  create mode 100644 libstdc++-v3/testsuite/23_containers/flat_multimap/1.cc
> > 
> > diff --git a/libstdc++-v3/include/Makefile.am 
> > b/libstdc++-v3/include/Makefile.am
> > index 422a0f4bd0a..632bbafa63e 100644
> > --- a/libstdc++-v3/include/Makefile.am
> > +++ b/libstdc++-v3/include/Makefile.am
> > @@ -70,6 +70,7 @@ std_headers = \
> > ${std_srcdir}/deque \
> > ${std_srcdir}/execution \
> > ${std_srcdir}/filesystem \
> > +   ${std_srcdir}/flat_map \
> > ${std_srcdir}/format \
> > ${std_srcdir}/forward_list \
> > ${std_srcdir}/fstream \
> > diff --git a/libstdc++-v3/include/Makefile.in 
> > b/libstdc++-v3/include/Makefile.in
> > index 9fd4ab4848c..1ac963c4415 100644
> > --- a/libstdc++-v3/include/Makefile.in
> > +++ b/libstdc++-v3/include/Makefile.in
> > @@ -426,6 +426,7 @@ std_freestanding = \
> >  @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/deque \
> >  @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/execution \
> >  @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/filesystem \
> > +@GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/flat_map \
> >  @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/format \
> >  @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/forward_list \
> >  @GLIBCXX_HOSTED_TRUE@  ${std_srcdir}/fstream \
> > diff --git a/libstdc++-v3/include/bits/stl_function.h 
> > b/libstdc++-v3/include/bits/stl_function.h
> > index c9123ccecae..c579ba9f47b 100644
> > --- a/libstdc++-v3/include/bits/stl_function.h
> > +++ b/libstdc++-v3/include/bits/stl_function.h
> > @@ -1426,6 +1426,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >template<typename _Func, typename _SfinaeType>
> >  using __has_is_transparent_t
> >= typename __has_is_transparent<_Func, _SfinaeType>::type;
> > +
> > +#if __cpp_concepts
> > +  template<typename _Func>
> > +concept __transparent_comparator
> > +  = requires { typename _Func::is_transparent; };
> > +#endif
> >  #endif
> >  
> >  _GLIBCXX_END_NAMESPACE_VERSION
> > diff --git a/libstdc++-v3/include/bits/utility.h 
> > b/libstdc++-v3/include/bits/utility.h
> > index 4a6c16dc2e0..9e10ce2cb1c 100644
> > --- a/libs

Re: [PATCH] c++: Fix up maybe_init_list_as_array for RAW_DATA_CST [PR118124]

2024-12-19 Thread Jason Merrill


On 12/19/24 11:10 AM, Jakub Jelinek wrote:

Hi!

The previous patch made me look around some more and I found
maybe_init_list_as_array doesn't handle RAW_DATA_CSTs correctly either,
while the RAW_DATA_CST is properly split during finish_compound_literal,
it was using CONSTRUCTOR_NELTS as the size of the arrays, which is wrong,
RAW_DATA_CST could stand for far more initializers.

Fixed thusly, ok for trunk if it passes bootstrap/regtest?

2024-12-19  Jakub Jelinek  

PR c++/118124
* call.cc (maybe_init_list_as_array): Adjust len for RAW_DATA_CST
elements.
(convert_like_internal): Use length from init's type instead of
len when handling the maybe_init_list_as_array case.

* g++.dg/cpp0x/initlist-opt5.C: New test.

--- gcc/cp/call.cc.jj   2024-12-19 16:10:12.977071898 +0100
+++ gcc/cp/call.cc  2024-12-19 16:55:40.953546502 +0100
@@ -4386,7 +4386,13 @@ maybe_init_list_as_array (tree elttype,
if (!is_xible (INIT_EXPR, elttype, copy_argtypes))
  return NULL_TREE;
  
-  tree arr = build_array_of_n_type (init_elttype, CONSTRUCTOR_NELTS (init));

+  unsigned int len = CONSTRUCTOR_NELTS (init);
+  if (INTEGRAL_TYPE_P (init_elttype))
+for (constructor_elt &e: CONSTRUCTOR_ELTS (init))
+  if (TREE_CODE (e.value) == RAW_DATA_CST)
+   len += RAW_DATA_LENGTH (e.value) - 1;


Really seems like we could use a function to ask how many elements a 
CONSTRUCTOR initializes, perhaps as a wrapper around 
categorize_ctor_elements?



+  tree arr = build_array_of_n_type (init_elttype, len);
arr = finish_compound_literal (arr, init, tf_none);
DECL_MERGEABLE (TARGET_EXPR_SLOT (arr)) = true;
return arr;
@@ -8768,7 +8774,9 @@ convert_like_internal (conversion *convs
  {
elttype = cp_build_qualified_type (elttype, cp_type_quals (elttype)
| TYPE_QUAL_CONST);
-   array = build_array_of_n_type (elttype, len);
+   tree index_type = TYPE_DOMAIN (TREE_TYPE (init));
+   array = build_cplus_array_type (elttype, index_type);
+   len = TREE_INT_CST_LOW (TYPE_MAX_VALUE (index_type)) + 1;
array = build_vec_init_expr (array, init, complain);
array = get_target_expr (array);
array = cp_build_addr_expr (array, complain);
--- gcc/testsuite/g++.dg/cpp0x/initlist-opt5.C.jj   2024-12-19 
16:56:44.113675894 +0100
+++ gcc/testsuite/g++.dg/cpp0x/initlist-opt5.C  2024-12-19 16:37:28.335605902 
+0100
@@ -0,0 +1,23 @@
+// PR c++/118124
+// { dg-do compile { target c++11 } }
+// { dg-options "-O2" }
+
+namespace std {
+template <typename T> struct initializer_list {
+private:
+  const T *_M_array;
+  decltype (sizeof 0) _M_len;
+};
+}
+struct B {
+  B (int);
+};
+struct A {
+  A (std::initializer_list<B>);
+};
+A a { 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7,
+  8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
+  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1,
+  2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3,
+  4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5,
+  6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5 };

Jakub
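
The length adjustment in the call.cc hunk can be sketched outside of GCC as follows. The `ctor_elt` struct and `initialized_length` function are hypothetical simplifications, not GCC's tree API; they only model the idea that one constructor entry may stand for many consecutive initializers.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of a CONSTRUCTOR element: either a single
// value or a raw-data blob standing for RAW_LENGTH consecutive values.
struct ctor_elt
{
  bool is_raw_data;
  std::size_t raw_length;   // meaningful only when is_raw_data is true
};

// Number of array elements the initializer list really initializes.
std::size_t
initialized_length (const std::vector<ctor_elt> &elts)
{
  std::size_t len = elts.size ();   // plays the role of CONSTRUCTOR_NELTS
  for (const ctor_elt &e : elts)
    if (e.is_raw_data)
      len += e.raw_length - 1;      // the blob already counted once
  return len;
}
```

With one ordinary element, a blob of 126 values, and another ordinary element, the entry count is 3 but the array needs 128 elements, which is the mismatch the patch addresses.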

[PATCH] arm: [MVE intrinsics] Fix moves of tuples (PR target/118131)

2024-12-19 Thread Christophe Lyon

Commit r15-6245-g4f4e13dd235b introduced new modes for MVE tuples, but
missed adding support for them in a few places.

Adding them to the list in arm_attr_length_move_neon is not sufficient
since we later face another ICE where the compiler does not know how
to split move of such data.

The patch therefore enhances the define_splits for OI and XI moves in
neon.md, via the introduction of new iterators.

In addition, it seems consistent to update output_move_neon such that
VALID_NEON_*_MODE are used only when TARGET_NEON.

gcc/ChangeLog:

PR target/118131
* config/arm/arm.cc (output_move_neon): Check TARGET_NEON as
needed.
(arm_attr_length_move_neon): Add support for V2x and V4x MVE tuple
modes.
* config/arm/iterators.md (VSTRUCT2, VSTRUCT4): New.
* config/arm/neon.md: Use VSTRUCT2 instead of OI and VSTRUCT4
instead of XI in define_split.
---
 gcc/config/arm/arm.cc   | 25 +++--
 gcc/config/arm/iterators.md | 18 ++
 gcc/config/arm/neon.md  |  8 
 3 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index ecdb1bd136e..7a750e02d61 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -20775,11 +20775,13 @@ output_move_neon (rtx *operands)
   nregs = REG_NREGS (reg) / 2;
   gcc_assert (VFP_REGNO_OK_FOR_DOUBLE (regno)
  || NEON_REGNO_OK_FOR_QUAD (regno));
-  gcc_assert (VALID_NEON_DREG_MODE (mode)
- || VALID_NEON_QREG_MODE (mode)
- || VALID_NEON_STRUCT_MODE (mode)
+  gcc_assert ((TARGET_NEON
+  && (VALID_NEON_DREG_MODE (mode)
+  || VALID_NEON_QREG_MODE (mode)
+  || VALID_NEON_STRUCT_MODE (mode)))
  || (TARGET_HAVE_MVE
- && VALID_MVE_STRUCT_MODE (mode)));
+ && (VALID_MVE_MODE (mode)
+ || VALID_MVE_STRUCT_MODE (mode))));
   gcc_assert (MEM_P (mem));
 
   addr = XEXP (mem, 0);
@@ -20882,8 +20884,9 @@ output_move_neon (rtx *operands)
   return "";
 }
 
-/* Compute and return the length of neon_mov<mode>, where <mode> is
-   one of VSTRUCT modes: EI, OI, CI or XI.  */
+/* Compute and return the length of neon_mov<mode>, where <mode> is one of
+   VSTRUCT modes: EI, OI, CI or XI for Neon, and V2x16QI, V2x8HI, V2x4SI,
+   V2x8HF, V2x4SF, V2x16QI, V2x8HI, V2x4SI, V2x8HF, V2x4SF for MVE.  */
 int
 arm_attr_length_move_neon (rtx_insn *insn)
 {
@@ -20900,10 +20903,20 @@ arm_attr_length_move_neon (rtx_insn *insn)
{
case E_EImode:
case E_OImode:
+   case E_V2x16QImode:
+   case E_V2x8HImode:
+   case E_V2x4SImode:
+   case E_V2x8HFmode:
+   case E_V2x4SFmode:
  return 8;
case E_CImode:
  return 12;
case E_XImode:
+   case E_V4x16QImode:
+   case E_V4x8HImode:
+   case E_V4x4SImode:
+   case E_V4x8HFmode:
+   case E_V4x4SFmode:
  return 16;
default:
  gcc_unreachable ();
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index cfe712ceda9..6756e29721c 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -152,6 +152,24 @@ (define_mode_iterator VSTRUCT [(EI "!TARGET_HAVE_MVE") OI
   (V4x4SF "TARGET_HAVE_MVE_FLOAT")
   ])
 
+;; Structure types of the same size as OImode
+(define_mode_iterator VSTRUCT2 [OI
+  (V2x16QI "TARGET_HAVE_MVE")
+  (V2x8HI "TARGET_HAVE_MVE")
+  (V2x4SI "TARGET_HAVE_MVE")
+  (V2x8HF "TARGET_HAVE_MVE_FLOAT")
+  (V2x4SF "TARGET_HAVE_MVE_FLOAT")
+  ])
+
+;; Structure types of the same size as XImode
+(define_mode_iterator VSTRUCT4 [XI
+  (V4x16QI "TARGET_HAVE_MVE")
+  (V4x8HI "TARGET_HAVE_MVE")
+  (V4x4SI "TARGET_HAVE_MVE")
+  (V4x8HF "TARGET_HAVE_MVE_FLOAT")
+  (V4x4SF "TARGET_HAVE_MVE_FLOAT")
+  ])
+
 ;; Opaque structure types used in table lookups (except vtbl1/vtbx1).
 (define_mode_iterator VTAB [TI EI OI])
 
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 6892b7b0f44..cfd8520c2ea 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -215,8 +215,8 @@ (define_split
 })
 
 (define_split
-  [(set (match_operand:OI 0 "s_register_operand" "")
-   (match_operand:OI 1 "s_register_operand" ""))]
+  [(set (match_operand:VSTRUCT2 0 "s_register_operand" "")
+   (match_operand:VSTRUCT2 1 "s_register_operand" ""))]
   "(TARGET_NEON || TARGET_HAVE_MVE)&& reload_completed"
   [(set (match_dup 0) (match_dup 1))
(set (match_dup 2) (match_dup 3))]
@@ -256,8 +256,8 @@ (define_split
 })
 
 (define_split
-  [(set (match_operand:XI 0 "

Re: [PATCH] Fix comment typos in tree-assume.cc

2024-12-19 Thread Andrew Carlotti

On Thu, Dec 19, 2024 at 09:22:02AM -0500, Andrew MacLeod wrote:
> I have no issues. ok by me.  I clearly need a proofreader :-)
> 
> Andrew

Thanks! It applies cleanly to your gcc-14 backport, so I've pushed to that
branch as well.

> On 12/18/24 11:22, Andrew Carlotti wrote:
> > I think this counts as obvious, but I'll leave it a few days before 
> > committing
> > in case Andrew (or anyone else) disagrees.
> > 
> > gcc/ChangeLog:
> > 
> > * tree-assume.cc: Fix comment typos.
> > 
> > 
> > diff --git a/gcc/tree-assume.cc b/gcc/tree-assume.cc
> > index 
> > 883338bcef1e41e15a67fd015834d74319ca11af..9a934f21dc039c0b8f5af717510752d7008ed493
> >  100644
> > --- a/gcc/tree-assume.cc
> > +++ b/gcc/tree-assume.cc
> > @@ -36,16 +36,16 @@ along with GCC; see the file COPYING3.  If not see
> >   #include "tree-cfg.h"
> >   #include "gimple-pretty-print.h"
> > -// An assume query utilizes the current range query to implelemtn the 
> > assume
> > +// An assume query utilizes the current range query to implement the assume
> >   // keyword.
> >   // For any return value of 1 from the function, it attempts to determine
> > -// which paths leads to a 1 value being returned. On those paths, what
> > +// which paths lead to a 1 value being returned. On those paths, it 
> > determines
> >   // the ranges of any ssa_names listed in bitmap P (usually the parm list 
> > for
> > -// the function) are, and combined them all.
> > +// the function), and combines them all.
> >   // These ranges are then set as the global ranges for those parms in this
> >   // function.
> > -// Other functions which then refer to this function in an assume builtin
> > -// will then pick up these ranges for the paramters via the inferred range
> > +// Other functions which refer to this function in an assume builtin
> > +// will then pick up these ranges for the parameters via the inferred range
> >   // mechanism.
> >   //   See gimple-range-infer.cc::gimple_infer_range::check_assume_func ()
> >   //
> > @@ -57,11 +57,11 @@ along with GCC; see the file COPYING3.  If not see
> >   //
> >   // a small temporary assume function consisting of
> >   // assume_f1 (int x) { return x == 1 || x == 4; }
> > -// is constructed by the front end, and optimzed, at the very end of
> > +// is constructed by the front end, and optimized, at the very end of
> >   // optimization, instead of generating code, we instead invoke the assume 
> > pass
> >   // which uses this query to set the the global value of parm x to 
> > [1,1][4,4]
> >   //
> > -// Meanwhile., my_Fund has been rewritten to be:
> > +// Meanwhile., my_func has been rewritten to be:
> >   //
> >   // my_func (int x_2)
> >   // {
> > @@ -70,12 +70,12 @@ along with GCC; see the file COPYING3.  If not see
> >   //   if (x_2 == 3)
> >   //
> >   // When ranger is processing the assume_builtin_call, it looks up the 
> > global
> > -// value of the paramter in assume_f1, which is [1,1][4,4].  It then 
> > registers
> > +// value of the parameter in assume_f1, which is [1,1][4,4].  It then 
> > registers
> >   // and inferred range at this statement setting the value x_2 to 
> > [1,1][4,4]
> >   //
> > -// Any uses of x_2 after this statement will now utilzie this inferred 
> > range.
> > +// Any uses of x_2 after this statement will now utilize this inferred 
> > range.
> >   //
> > -// When VRP precoesses if (x_2 == 3), it picks up the inferred range, and
> > +// When VRP processes if (x_2 == 3), it picks up the inferred range, and
> >   // determines that x_2 can never be 3, and will rewrite the branch to
> >   //   if (0 != 0)
> > @@ -109,7 +109,7 @@ assume_query::assume_query (function *f, bitmap p) : 
> > m_parm_list (p),
> >  m_func (f)
> >   {
> > basic_block exit_bb = EXIT_BLOCK_PTR_FOR_FN (f);
> > -  // If there is more than one precessor to the exit block, bail.
> > +  // If there is more than one predecessor to the exit block, bail.
> > if (!single_pred_p (exit_bb))
> >   return;
> > @@ -130,7 +130,7 @@ assume_query::assume_query (function *f, bitmap p) : 
> > m_parm_list (p),
> > if (!irange::supports_p (lhs_type))
> >   return;
> > -  // Only values of interest are when the return value is 1.  The defintion
> > +  // Only values of interest are when the return value is 1.  The 
> > definition
> > // of the return value must be in the same block, or we have
> > // complicated flow control we don't understand, and just return.
> > unsigned prec = TYPE_PRECISION (lhs_type);
> > @@ -169,7 +169,7 @@ assume_query::assume_query (function *f, bitmap p) : 
> > m_parm_list (p),
> >  }
> >   }
> > -// This function Will update all the current value of interesting 
> > parameters.
> > +// This function will update all the current values of interesting 
> > parameters.
> >   // It tries, in order:
> >   //a) a range found via path calculations.
> >   //b) range of the parm at SRC point in the IL. (either edge o
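
For context on what the corrected comments describe, here is a small user-level sketch of the assume feature, written with the C++23 `[[assume]]` attribute. The function name `is_three` is made up for illustration, and whether a given compiler version actually folds the comparison is an assumption, not something stated in this thread.

```cpp
// Callers promise that x is 1 or 4; passing any other value is undefined
// behaviour once the assumption is in place.
bool
is_three (int x)
{
  [[assume (x == 1 || x == 4)]];
  // Given the assumed range [1,1][4,4], an optimizer may rewrite this
  // comparison to a constant false.
  return x == 3;
}
```

This mirrors the my_func example in the tree-assume.cc comments: the inferred range for the parameter excludes 3, so the branch on `x == 3` can be removed.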

[committed] libgomp.texi: Update 'arch' context-selector description

2024-12-19 Thread Tobias Burnus


Found when reviewing the metadirective/context-selector patch,
"[PATCH v5 02/10] OpenMP: Re-work and extend context selector resolution",
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671295.html
as it touches the related code (without modifying how GCC handles this).

Namely, as I wrote at [1], we did not list 'cpu' as a supported 'kind'
for the host nor 'nohost' for the GPU devices and also not 'any' as
being supported for both.

Well, since the just committed patch (r15-6367-g570d4e4c68535e),
https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Context-Selectors.html
now also lists any/cpu/nohost. (See attached patch.)

Thanks,

Tobias

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671954.html
commit 570d4e4c68535ee4e5b2d82ad02a65fa1ec04112
Author: Tobias Burnus 
Date:   Thu Dec 19 16:06:21 2024 +0100

libgomp.texi: Update 'arch' context-selector description

* libgomp.texi (OpenMP Context Selectors): Document that 'kind' also
accepts 'cpu'/'any' on host and 'any'/'nohost' on 'nohost' devices.
---
 libgomp/libgomp.texi | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 6b8000c696f..4e0ed993b2c 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -6687,9 +6687,10 @@ smaller number.  On non-host devices, the value of the
 @c has to be implemented; cf. also PR target/105640.
 @c For offload devices, add *additionally* gcc/config/*/t-omp-device.
 
-For the host compiler, @code{kind} always matches @code{host}; for the
-offloading architectures AMD GCN and Nvidia PTX, @code{kind} always matches
-@code{gpu}.  For the x86 family of computers, AMD GCN and Nvidia PTX
+For the host compiler, @code{kind} always matches @code{host}, @code{cpu}
+and @code{any}; for the offloading architectures AMD GCN and Nvidia PTX,
+@code{kind} always matches @code{nohost}, @code{gpu} and @code{any}.
+For the x86 family of computers, AMD GCN and Nvidia PTX
 the following traits are supported in addition; while OpenMP is supported
 on more architectures, GCC currently does not match any @code{arch} or
 @code{isa} traits for those.
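
A minimal sketch of how the `kind` trait is used from source code, assuming a compiler with OpenMP `declare variant` support. The function names `answer` and `answer_gpu` are hypothetical. On a host-only build (or with OpenMP disabled, where the pragma is ignored) the base function is the one that runs, since `kind(gpu)` does not match the host.

```cpp
// Variant intended for code compiled for a GPU offload device.
int
answer_gpu ()
{
  return 2;
}

// With OpenMP enabled, calls to answer () in code whose device 'kind'
// matches 'gpu' may be redirected to answer_gpu ().  On the host, where
// 'kind' matches 'host', 'cpu' and 'any', the base function is kept.
#pragma omp declare variant (answer_gpu) match (device = {kind (gpu)})
int
answer ()
{
  return 1;
}
```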

Re: [PATCH v3 0/5] c++/modules: Implement P1815 "Translation-unit-local entities"

2024-12-19 Thread Jason Merrill


On 12/18/24 9:17 AM, Nathaniel Shead wrote:

On Tue, Dec 17, 2024 at 03:58:38PM -0500, Jason Merrill wrote:

On 11/27/24 3:53 AM, Nathaniel Shead wrote:

Gentle ping for this series:
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665108.html

Most of the patches no longer applied cleanly to trunk since the last
time I pinged this so I'm attaching newly rebased patches.

One slight adjustment I've included as well is a test in internal-4_b.C
for exposures of namespace aliases, as in:

namespace { namespace internal {} }
export namespace exposure = internal;

By the standard this appears to be well-formed; we currently error, and
I think this might be the desired behaviour (an easy workaround is to
wrap the alias in an anonymous namespace), but thought I'd at least test
the existing behaviour here.

Tested modules.exp on x86_64-pc-linux-gnu, OK for trunk if full
bootstrap+regtest succeeds?


I tweaked some of these patches a bit; OK with these changes, or without
patch #2 if you'd rather not make that change.


Thanks.  I will include patch #2 (with an additional line in invoke.texi
to note that the presence of explicit instantiations will silence the
warning.)


 From 03134a00bdc0f53cb30fde284808be71811e29e7 Mon Sep 17 00:00:00 2001
From: Jason Merrill 
Date: Wed, 11 Dec 2024 11:02:34 -0500
Subject: [PATCH] c++: adjust "Detect exposures of TU-local entities"
To: gcc-patches@gcc.gnu.org

This patch adjusts the handling of the injected-class-name to be the same as
the class name itself.

gcc/cp/ChangeLog:

* tree.cc (decl_linkage): Treat DECL_SELF_REFERENCE_P like
DECL_IMPLICIT_TYPEDEF_P.
* module.cc (depset::hash::is_tu_local_entity): Likewise.
(depset::hash::is_tu_local_value): Fix formatting.
---
  gcc/cp/module.cc | 10 +-
  gcc/cp/tree.cc   | 16 
  2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 5d11e85f41d..823884e97f3 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -13247,10 +13247,11 @@ depset::hash::is_tu_local_entity (tree decl, bool 
explain/*=false*/)
  {
gcc_checking_assert (DECL_P (decl));
  
-  /* An explicit type alias is not an entity, and so is never TU-local.  */

+  /* An explicit type alias is not an entity, and so is never TU-local.
+ Neither are the built-in declarations of 'int' and such.  */
if (TREE_CODE (decl) == TYPE_DECL
-  && !DECL_IMPLICIT_TYPEDEF_P (decl)
-  && !DECL_SELF_REFERENCE_P (decl))
+  && (is_typedef_decl (decl)
+ || !OVERLOAD_TYPE_P (TREE_TYPE (decl))))
  return false;
  
location_t loc = DECL_SOURCE_LOCATION (decl);

@@ -13348,7 +13349,6 @@ depset::hash::is_tu_local_entity (tree decl, bool 
explain/*=false*/)
   these aren't really TU-local.  */
if (TREE_CODE (decl) == TYPE_DECL
&& TYPE_ANON_P (type)
-  && !DECL_SELF_REFERENCE_P (decl)
/* An enum with an enumerator name for linkage.  */
&& !(UNSCOPED_ENUM_P (type) && TYPE_VALUES (type)))
  {
@@ -13473,7 +13473,7 @@ depset::hash::is_tu_local_value (tree decl, tree expr, 
bool explain)
   of reference type refer is TU-local and is usable in constant
   expressions.  */
if (TREE_CODE (e) == CONSTRUCTOR && AGGREGATE_TYPE_P (TREE_TYPE (e)))
-for (auto& f : CONSTRUCTOR_ELTS (e))
+for (auto &f : CONSTRUCTOR_ELTS (e))
if (is_tu_local_value (decl, f.value, explain))
return true;
  
diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc

index 939d2b060fb..260c16418a1 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -5894,17 +5894,17 @@ decl_linkage (tree decl)
   linkage first, and then transform that into a concrete
   implementation.  */
  
-  /* An explicit type alias has no linkage.  */

+  /* An explicit type alias has no linkage.  Nor do the built-in declarations
+ of 'int' and such.  */
if (TREE_CODE (decl) == TYPE_DECL
-  && !DECL_IMPLICIT_TYPEDEF_P (decl)
-  && !DECL_SELF_REFERENCE_P (decl))
+  && !DECL_IMPLICIT_TYPEDEF_P (decl))
  {
-  /* But this could be a typedef name for linkage purposes, in which
-case we're interested in the linkage of the main decl.  */
-  if (decl == TYPE_NAME (TYPE_MAIN_VARIANT (TREE_TYPE (decl))))
-   decl = TYPE_MAIN_DECL (TREE_TYPE (decl));
-  else
+  if (is_typedef_decl (decl)
+ || !OVERLOAD_TYPE_P (TREE_TYPE (decl)))
return lk_none;
+  /* But this could be a typedef name for linkage purposes or injected
+class name; look to the implicit typedef for linkage.  */
+  decl = TYPE_MAIN_DECL (TREE_TYPE (decl));
  }
  
/* Namespace-scope entities with no name usually have no linkage.  */

--
2.47.1



Unfortunately, however, this patch regresses internal-1.C and
internal-4_b.C when applied to patch #1, though they test cleanly when
testing the whole series.

For a simple example, consider

   export module M;
   namespace { struct X {}; }
   void foo(

Re: [PATCH] c++, v2: Disallow [[deprecated]] on types other than class/enum definitions [PR110345]

2024-12-19 Thread Jason Merrill


On 12/19/24 5:00 AM, Jakub Jelinek wrote:

On Wed, Dec 18, 2024 at 04:44:55PM +0100, Jakub Jelinek wrote:

The first check would flag something that is used in the wild, e.g.
g++.dg/Wmissing-attributes.C
gcc.dg/gnu23-attrs-2.c
g++.dg/cpp0x/gen-attrs-81.C
g++.dg/warn/Wdangling-reference17.C
g++.dg/warn/Wdangling-reference20.C
tests would be affected by it (at least if pedantic), including
<exception> header which uses this on
   /// If you write a replacement %unexpected handler, it must be of this type.
   typedef void (*_GLIBCXX11_DEPRECATED unexpected_handler) ();
(not to mention the diagnostic wording is C++ish).
E.g. gnu23-attrs-2.c has
typedef int A[2];
__typeof__ (int [[gnu::deprecated]]) var1; /* { dg-warning "deprecated" } */
__typeof__ (A [[gnu::deprecated]]) var2; /* { dg-warning "deprecated" } */
__typeof__ (int [3] [[gnu::deprecated]]) var3; /* { dg-warning "deprecated" } */
tests.


E.g.
typedef int * D * T;
T b;
typedef __typeof__ (*b) U;
currently works both in C and C++ for D [[gnu::deprecated]] and
__attribute__((deprecated)) and warns
a.C:12:1: warning: type is deprecated [-Wdeprecated-declarations]
12 | typedef __typeof__ (*b) U;
   | ^~~


Interesting, I thought it was useless, but clearly others disagree!

The patch is OK, then.

Jason

Re: [PATCH v2] c++: ICE in TARGET_EXPR evaluation in cp_fold_r [PR117980]

2024-12-19 Thread Jason Merrill


On 12/18/24 1:57 PM, Marek Polacek wrote:

On Tue, Dec 17, 2024 at 11:43:45AM -0500, Jason Merrill wrote:

On 12/12/24 1:42 PM, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This ICE started with the recent prvalue optimization (r15-6052).  In
cp_fold_r we have:

if (tree &init = TARGET_EXPR_INITIAL (stmt))
  {
cp_walk_tree (&init, cp_fold_r, data, NULL);
// ...
  tree folded = maybe_constant_init (init, TARGET_EXPR_SLOT (stmt));

What can happen here is that originally the TARGET_EXPR is:

  TARGET_EXPR >>
&TARGET_EXPR }> 

but after the first cp_walk_tree we fold the D.2707 TARGET_EXPR into:

  TARGET_EXPR  

and then we pass the EXPR_STMT to maybe_constant_init, with D.2707 as
the object.  But their types don't match anymore, so we crash.  We'd
have to pass D.2707.it as the object for it to work.

But I don't think we need to pass any object to maybe_constant_init;
it'll grab the appropriate one itself.


Hmm, it seems to me that the crash is happening because the deduced type is
wrong, and so with this change we'll end up only producing an initializer
for the first member.  Because cp_fold_r throws away the information that
initialized_type needs to know the type of the complete object being
initialized.


Thanks for catching that.
  

What if we move cp_fold_r of init after the maybe_constant_init?


Sadly, the same crash.

But your idea works, so this is it in a patch form.  Thanks again.


Great.

I think with this patch we should be able to remove this now-redundant case:


  if (constexpr_dtor)
/* Used for destructors of array elements.  */
type = TREE_TYPE (object);


OK with that change.

Jason

Re: [PATCH 1/2] c++: subsumption of complex constraints [PR118069]

2024-12-19 Thread Jason Merrill


On 12/18/24 11:56 AM, Patrick Palka wrote:

On Wed, 18 Dec 2024, Jason Merrill wrote:


On 12/17/24 10:43 AM, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk?  Shall we also backport this to release branches?
It's not a regression but seems like a safe fix for an inconvenient
issue.


OK for trunk and 14.

I wonder about using __builtin_mul_overflow et al (wrapped in a "non-overflow
integer" class template?) to fail better on an even more extreme testcase.


That sounds prudent because the testcase in the PR almost overflows even
when using unsigned HOST_WIDE_INT (and still overflows when using signed
HOST_WIDE_INT)!

Conveniently we already have three-parameter versions of add_hwi / mul_hwi
that we can use here to track overflow.  But rather than explicitly
tracking overflow which would be a bit cumbersome without some kind of
class template abstraction, it seems all we really need are saturating
addition/multiplication helpers that clamp an overflowed operation to
HOST_WIDE_INT_MAX.

To that end I added add_sat_hwi / mul_sat_hwi functions to hwint.h
(since they seem like generally useful operations) and used them in
cnf/dnf_size_r.  Like so?  Bootstrapped and regtested on
x86_64-pc-linux-gnu.


OK.


-- >8 --

Subject: [PATCH] c++: integer overflow during subsumption [PR118069]

For the testcase in the PR we hang during constraint subsumption
ultimately because one of the constraints is complex enough that its
conjunctive normal form is calculated to have more than 2^31 clauses,
which causes the size calculation (through an int) to overflow and so
the optimization in subsumes_constraints_nonnull

   if (dnf_size (lhs) <= cnf_size (rhs))
 // iterate over DNF of LHS
   else
 // iterate over CNF of RHS

incorrectly decides to loop over the CNF (billions of clauses) instead
of the DNF (thousands of clauses).
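
To make the failure mode concrete, here is a minimal sketch (not GCC code) of how a clause count above 2^31 flips the comparison once it is squeezed through a 32-bit signed value.  The names wrap_to_int32 and iterate_over_dnf are hypothetical, and the sizes used in the usage note are illustrative, not measured from the PR testcase.

```cpp
#include <cstdint>

// Hypothetical model of the pre-patch size computation: the true clause
// count is reduced modulo 2^32 and reinterpreted as a signed 32-bit value.
// Written with explicit arithmetic so it does not rely on
// implementation-defined narrowing conversions.
inline std::int32_t
wrap_to_int32 (std::int64_t true_size)
{
  std::uint32_t low = static_cast<std::uint32_t> (true_size);
  std::int64_t adjusted = static_cast<std::int64_t> (low);
  if (low > 0x7fffffffu)
    adjusted -= 4294967296LL;  // subtract 2^32 to get the signed view
  return static_cast<std::int32_t> (adjusted);
}

// Mirrors the shape of the check quoted above: prefer the DNF of the LHS
// when it is no larger than the CNF of the RHS.
inline bool
iterate_over_dnf (std::int64_t dnf_size, std::int64_t cnf_size)
{
  return dnf_size <= cnf_size;
}
```

With a DNF of 5000 clauses and a CNF of 3000000000 clauses, iterate_over_dnf on the true sizes picks the DNF, while the same call on the wrapped sizes sees a negative CNF size and picks the CNF instead.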

I haven't verified that the result of cnf_size is correct for the
problematic constraint but integer overflow is definitely plausible
given that CNF/DNF can be exponentially larger than the original
constraint in the worst case.

This patch fixes this by using 64-bit saturating arithmetic during
these size calculations via new add/mul_sat_hwi functions so that
overflow is less likely and if it does occur we handle it gracefully.
It should be highly unlikely that both the DNF and CNF size calculations
overflow, and if they do then it doesn't matter which form we select,
subsumption will take forever either way. The testcase now compiles in
~3 seconds on my machine after this change.
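
For readers unfamiliar with saturating arithmetic, the idea can be sketched as follows.  This is not the hwint.h implementation: add_sat and mul_sat are hypothetical names, std::int64_t stands in for HOST_WIDE_INT, and the sketch assumes a compiler that provides __builtin_add_overflow and __builtin_mul_overflow (GCC and Clang do).

```cpp
#include <cstdint>
#include <limits>

// Saturating addition: on overflow, clamp to the limit in the direction
// of the overflow instead of wrapping.  Signed addition can only overflow
// when both operands have the same sign, so the sign of A is enough.
inline std::int64_t
add_sat (std::int64_t a, std::int64_t b)
{
  std::int64_t result;
  if (!__builtin_add_overflow (a, b, &result))
    return result;
  return a < 0 ? std::numeric_limits<std::int64_t>::min ()
	       : std::numeric_limits<std::int64_t>::max ();
}

// Saturating multiplication: the true product is negative exactly when
// the operand signs differ, which selects the clamp value.
inline std::int64_t
mul_sat (std::int64_t a, std::int64_t b)
{
  std::int64_t result;
  if (!__builtin_mul_overflow (a, b, &result))
    return result;
  return (a < 0) != (b < 0) ? std::numeric_limits<std::int64_t>::min ()
			    : std::numeric_limits<std::int64_t>::max ();
}
```

Because a saturated size compares as "at least as large as anything representable", a comparison such as dnf_size <= cnf_size keeps pointing at the smaller form unless both sides saturate.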

PR c++/118069

gcc/ChangeLog:

* hwint.h (add_sat_hwi): New function.
(mul_sat_hwi): Likewise.

gcc/cp/ChangeLog:

* logic.cc (dnf_size_r): Use HOST_WIDE_INT instead of int, and
handle overflow gracefully via add_sat_hwi and mul_sat_hwi.
(cnf_size_r): Likewise.
(dnf_size): Use HOST_WIDE_INT instead of int.
(cnf_size): Likewise.
---
  gcc/cp/logic.cc | 68 +++--
  gcc/hwint.h | 26 +++
  2 files changed, 63 insertions(+), 31 deletions(-)

diff --git a/gcc/cp/logic.cc b/gcc/cp/logic.cc
index 9d8edb74099..6b4bf1dfb7d 100644
--- a/gcc/cp/logic.cc
+++ b/gcc/cp/logic.cc
@@ -349,7 +349,7 @@ atomic_p (tree t)
 distributing.  In general, a conjunction for which this flag is set
 is considered a disjunction for the purpose of counting.  */
  
-static std::pair<int, bool>
+static std::pair<HOST_WIDE_INT, bool>
  dnf_size_r (tree t)
  {
if (atomic_p (t))
@@ -360,9 +360,9 @@ dnf_size_r (tree t)
   the results.  */
tree lhs = TREE_OPERAND (t, 0);
tree rhs = TREE_OPERAND (t, 1);
-  std::pair<int, bool> p1 = dnf_size_r (lhs);
-  std::pair<int, bool> p2 = dnf_size_r (rhs);
-  int n1 = p1.first, n2 = p2.first;
+  auto p1 = dnf_size_r (lhs);
+  auto p2 = dnf_size_r (rhs);
+  HOST_WIDE_INT n1 = p1.first, n2 = p2.first;
bool d1 = p1.second, d2 = p2.second;
  
if (disjunction_p (t))

@@ -376,22 +376,24 @@ dnf_size_r (tree t)
{
  if (disjunction_p (rhs) || (conjunction_p (rhs) && d2))
/* Both P and Q are disjunctions.  */
-   return std::make_pair (n1 + n2, d1 | d2);
+   return std::make_pair (add_sat_hwi (n1, n2), d1 | d2);
  else
/* Only LHS is a disjunction.  */
-   return std::make_pair (1 + n1 + n2, d1 | d2);
+   return std::make_pair (add_sat_hwi (1, add_sat_hwi (n1, n2)),
+  d1 | d2);
  gcc_unreachable ();
}
if (conjunction_p (lhs))
{
  if ((disjunction_p (rhs) && d1) || (conjunction_p (rhs) && d1 && d2))
/* Both P and Q are disjunctions.  */
-   return std::make_pair (n1 + n2, d1 | d2);
+   return std::make_pair (add_sat_hwi (n1, n2), d1 | d2);
  if (disjunction_p (rhs)
  || (conjunction_p (rhs) && d1 != d2)
  || (atomic_p (rhs) && d1))
/* Either LHS or RHS

[PATCH] c++: Fix ICEs with large initializer lists or ones including #embed [PR118124]

2024-12-19 Thread Jakub Jelinek

Hi!

The following testcases ICE due to RAW_DATA_CST not being handled where it
should be during ck_list conversions.

The last 2 testcases started ICEing with r15-6339 committed yesterday
(speedup of large initializers), the first two already with r15-5958
(#embed optimization for C++).

For conversion to initializer_list or char/signed char
we can optimize and keep RAW_DATA_CST with adjusted type if we report
narrowing errors if needed, for others this converts each element
separately.
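
As a rough illustration of the narrowing check for the signed char case, the sketch below scans a run of raw bytes for the first value that does not fit in a signed 8-bit element.  It is a simplified model, not the call.cc code: first_narrowing_byte is a hypothetical helper, and a plain std::vector stands in for the RAW_DATA_CST payload.

```cpp
#include <cstddef>
#include <vector>

// Returns the index of the first byte whose value cannot be represented
// in a signed 8-bit element (that is, anything above 127), or -1 when
// every byte fits and the packed data can be kept with an adjusted type.
inline std::ptrdiff_t
first_narrowing_byte (const std::vector<unsigned char> &bytes)
{
  for (std::size_t i = 0; i < bytes.size (); ++i)
    if (bytes[i] > 127)
      return static_cast<std::ptrdiff_t> (i);
  return -1;
}
```

A result of -1 corresponds to the fast path described above; any other result marks the element at which a narrowing diagnostic would be reported.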

Ok for trunk if this passes bootstrap/regtest?  Wouldn't like to
leave this broken over Christmas holidays.

2024-12-19  Jakub Jelinek  

PR c++/118124
* call.cc (convert_like_internal): Handle RAW_DATA_CST in
ck_list handling.  Formatting fixes.

* g++.dg/cpp/embed-15.C: New test.
* g++.dg/cpp/embed-16.C: New test.
* g++.dg/cpp0x/initlist-opt3.C: New test.
* g++.dg/cpp0x/initlist-opt4.C: New test.

--- gcc/cp/call.cc.jj   2024-12-11 17:27:52.481221310 +0100
+++ gcc/cp/call.cc  2024-12-19 16:10:12.977071898 +0100
@@ -8766,8 +8766,8 @@ convert_like_internal (conversion *convs
 
if (tree init = maybe_init_list_as_array (elttype, expr))
  {
-   elttype = cp_build_qualified_type
- (elttype, cp_type_quals (elttype) | TYPE_QUAL_CONST);
+   elttype = cp_build_qualified_type (elttype, cp_type_quals (elttype)
+   | TYPE_QUAL_CONST);
array = build_array_of_n_type (elttype, len);
array = build_vec_init_expr (array, init, complain);
array = get_target_expr (array);
@@ -8775,13 +8775,75 @@ convert_like_internal (conversion *convs
  }
else if (len)
  {
-   tree val; unsigned ix;
-
+   tree val;
+   unsigned ix;
tree new_ctor = build_constructor (init_list_type_node, NULL);
 
/* Convert all the elements.  */
FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (expr), ix, val)
  {
+   if (TREE_CODE (val) == RAW_DATA_CST)
+ {
+   tree elt_type;
+   conversion *next;
+   if (convs->u.list[ix]->kind == ck_std
+   && (elt_type = convs->u.list[ix]->type)
+   && (TREE_CODE (elt_type) == INTEGER_TYPE
+   || is_byte_access_type (elt_type))
+   && TYPE_PRECISION (elt_type) == CHAR_BIT
+   && (next = next_conversion (convs->u.list[ix]))
+   && next->kind == ck_identity)
+ {
+   if (!TYPE_UNSIGNED (elt_type)
+   && (complain & tf_warning)
+   && (TYPE_UNSIGNED (TREE_TYPE (val))
+   || (TYPE_PRECISION (TREE_TYPE (val))
+   > CHAR_BIT)))
+ for (int i = 0; i < RAW_DATA_LENGTH (val); ++i)
+   if (RAW_DATA_SCHAR_ELT (val, i) < 0)
+ {
+   location_t loc
+ = cp_expr_loc_or_input_loc (val);
+   int savederrorcount = errorcount;
+   permerror_opt (loc, OPT_Wnarrowing,
+  "narrowing conversion of %qd "
+  "from %qH to %qI",
+  RAW_DATA_UCHAR_ELT (val, i),
+  TREE_TYPE (val), elt_type);
+   if (errorcount != savederrorcount)
+ return error_mark_node;
+ }
+   tree sub = copy_node (val);
+   TREE_TYPE (sub) = elt_type;
+   CONSTRUCTOR_APPEND_ELT (CONSTRUCTOR_ELTS (new_ctor),
+   NULL_TREE, sub);
+ }
+   else
+ {
+   for (int i = 0; i < RAW_DATA_LENGTH (val); ++i)
+ {
+   tree elt
+ = build_int_cst (TREE_TYPE (val),
+  RAW_DATA_UCHAR_ELT (val, i));
+   tree sub
+ = convert_like (convs->u.list[ix], elt,
+ fn, argnum, false, false,
+ /*nested_p=*/true, complain);
+   if (sub == error_mark_node)
+ return sub;
+   if (!check_narrowing (TREE_TYPE (sub), elt,
+ complain))
+ return error_mark_node;
+

[PATCH] c++: Fix up maybe_init_list_as_array for RAW_DATA_CST [PR118124]

2024-12-19 Thread Jakub Jelinek

Hi!

The previous patch made me look around some more and I found
maybe_init_list_as_array doesn't handle RAW_DATA_CSTs correctly either,
while the RAW_DATA_CST is properly split during finish_compound_literal,
it was using CONSTRUCTOR_NELTS as the size of the arrays, which is wrong,
RAW_DATA_CST could stand for far more initializers.
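
The counting problem can be modelled in a few lines.  This is a sketch under simplified assumptions, not the GCC data structures: InitElt and initializer_count are hypothetical names, and each raw-data element is assumed to cover at least one initializer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a constructor element: either one ordinary
// initializer or a packed run of raw_length bytes (like a RAW_DATA_CST).
struct InitElt
{
  bool is_raw_data;
  std::size_t raw_length;  // only meaningful when is_raw_data is true
};

// The number of elements in the list undercounts the initializers: each
// packed run occupies one slot but stands for raw_length initializers,
// so it contributes raw_length - 1 extra entries to the array length.
inline std::size_t
initializer_count (const std::vector<InitElt> &elts)
{
  std::size_t len = elts.size ();
  for (const InitElt &e : elts)
    if (e.is_raw_data)
      len += e.raw_length - 1;
  return len;
}
```

For a list holding one ordinary element, one packed run of 100 bytes, and another ordinary element, the element count is 3 but the array needs 102 slots.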

Fixed thusly, ok for trunk if it passes bootstrap/regtest?

2024-12-19  Jakub Jelinek  

PR c++/118124
* call.cc (maybe_init_list_as_array): Adjust len for RAW_DATA_CST
elements.
(convert_like_internal): Use length from init's type instead of
len when handling the maybe_init_list_as_array case.

* g++.dg/cpp0x/initlist-opt5.C: New test.

--- gcc/cp/call.cc.jj   2024-12-19 16:10:12.977071898 +0100
+++ gcc/cp/call.cc  2024-12-19 16:55:40.953546502 +0100
@@ -4386,7 +4386,13 @@ maybe_init_list_as_array (tree elttype,
   if (!is_xible (INIT_EXPR, elttype, copy_argtypes))
 return NULL_TREE;
 
-  tree arr = build_array_of_n_type (init_elttype, CONSTRUCTOR_NELTS (init));
+  unsigned int len = CONSTRUCTOR_NELTS (init);
+  if (INTEGRAL_TYPE_P (init_elttype))
+for (constructor_elt &e: CONSTRUCTOR_ELTS (init))
+  if (TREE_CODE (e.value) == RAW_DATA_CST)
+   len += RAW_DATA_LENGTH (e.value) - 1;
+
+  tree arr = build_array_of_n_type (init_elttype, len);
   arr = finish_compound_literal (arr, init, tf_none);
   DECL_MERGEABLE (TARGET_EXPR_SLOT (arr)) = true;
   return arr;
@@ -8768,7 +8774,9 @@ convert_like_internal (conversion *convs
  {
elttype = cp_build_qualified_type (elttype, cp_type_quals (elttype)
| TYPE_QUAL_CONST);
-   array = build_array_of_n_type (elttype, len);
+   tree index_type = TYPE_DOMAIN (TREE_TYPE (init));
+   array = build_cplus_array_type (elttype, index_type);
+   len = TREE_INT_CST_LOW (TYPE_MAX_VALUE (index_type)) + 1;
array = build_vec_init_expr (array, init, complain);
array = get_target_expr (array);
array = cp_build_addr_expr (array, complain);
--- gcc/testsuite/g++.dg/cpp0x/initlist-opt5.C.jj   2024-12-19 16:56:44.113675894 +0100
+++ gcc/testsuite/g++.dg/cpp0x/initlist-opt5.C  2024-12-19 16:37:28.335605902 +0100
@@ -0,0 +1,23 @@
+// PR c++/118124
+// { dg-do compile { target c++11 } }
+// { dg-options "-O2" }
+
+namespace std {
+template <class T> struct initializer_list {
+private:
+  const T *_M_array;
+  decltype (sizeof 0) _M_len;
+};
+}
+struct B {
+  B (int);
+};
+struct A {
+  A (std::initializer_list<B>);
+};
+A a { 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7,
+  8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
+  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1,
+  2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3,
+  4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5,
+  6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5 };

Jakub

Re: [PATCH] c++: Fix up maybe_unused attribute handling [PR110345]

2024-12-19 Thread Jason Merrill


On 12/18/24 10:53 AM, Jakub Jelinek wrote:

On Tue, Dec 17, 2024 at 05:54:36PM -0500, Jason Merrill wrote:

On 9/5/24 3:27 AM, Jakub Jelinek wrote:

When adding test coverage for maybe_unused attribute, I've run into
several things:
1) similarly to deprecated attribute, the attribute shouldn't pedantically
 appertain to types other than class/enumeration definitions
2) similarly to deprecated attribute, the attribute shouldn't pedantically
 appertain to unnamed bit-fields
3) the standard says that it can appertain to identifier labels, but
 we handled it silently also on case and default labels
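
For reference, the placements that the list above treats as valid targets can be sketched as follows.  This is an illustrative example, not one of the testsuite files; Scratch and sum_to are hypothetical names, and the cases from 1) to 3) that the patch diagnoses are deliberately left out so the snippet stays warning-free.

```cpp
// On a class definition (allowed; a bare type reference is case 1).
struct [[maybe_unused]] Scratch
{
  int value;
};

// On a parameter, a local variable, and an identifier label.
inline int
sum_to (int n, [[maybe_unused]] int debug_level)
{
  [[maybe_unused]] int iterations = 0;
  int total = 0;
  for (int i = 1; i <= n; ++i)
    {
      total += i;
      ++iterations;
    }
  // Identifier labels are a valid target, unlike the case and default
  // labels mentioned in 3).
  [[maybe_unused]] done:
  return total;
}
```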


As with deprecated, wouldn't it be an improvement to warn for the GNU
attribute in these cases as well?


Similar to the deprecated case, I see the 1) case e.g. in
g++.dg/cpp0x/gen-attrs-22.C
c-c++-common/Wunused-var-11.c
g++.dg/ext/attrib61.C
c-c++-common/Wunused-var-11.c
g++.dg/warn/Wunused-local-typedefs.C
tests and 3) case in gcc.dg/c23-attr-maybe_unused-1.c


OK.


I don't see any 2) cases in the GCC testsuite, haven't searched for it in other
codebases.  But I'm still worried of changing something that has been
accepted for 23 years without a standard telling it is not ok anymore (which
is the case of [[maybe_unused]] in C++).


Hmm, we usually aren't extremely cautious about adding warnings.  And it 
seems completely useless to attach these attributes to something you can't 
refer to in any way.  But since we're handling types differently, might 
as well handle this in the separate function as well.


The patch is OK.


From what I see, even when the C added some pedantic diagnostics for
standard attributes, it added them just to the standard ones and not the GNU
ones, e.g. handle_std_noreturn_attribute adds those on top of
handle_noreturn_attribute.

I think PR27648 has been filed long before the advent of fuzzers, so maybe
it came from some real-world code or somebody at least tried to use it like
that.

Jakub

Re: [PATCH 2/2] RISC-V: Add intrinsic testcases for SiFive Xsfvcp extensions.

2024-12-19 Thread Kito Cheng

Could you reduce the test files? just one test for each instruction is
fine, you don't need to put all tests into gcc source tree.

e.g. only pick test_sf_vc_v_xvw_u8mf4 and test_sf_vc_v_xvw_se_u16m4
for sf.vc.v.xvw

Re: [PATCH] c and c++: Make sure LHS and RHS has identical named types [PR116060]

2024-12-19 Thread Torbjorn SVENSSON




On 2024-12-17 22:29, Jason Merrill wrote:

On 12/17/24 1:44 PM, Torbjorn SVENSSON wrote:

Hi Jason,

Thanks for the quick feedback!

On 2024-12-16 17:11, Jason Merrill wrote:

On 12/16/24 7:16 AM, Torbjörn SVENSSON wrote:

Hi,

I've reg-tested this patch on both the trunk and the releases/gcc-14
branches for x86_64-linux-gnu and arm-none-eabi and it no longer fails
for any of the out-of-bounds-diagram* tests on any of the 2 platforms.

I'm a bit puzzled if the C++ part is enough, but I can't think of a way
to trigger anything that shows the wrong output after my change.
Do you think that I need to add any additional tests? I think the
existing test covers the problem well enough.

Ok for trunk and releases/gcc-14?


This won't be a candidate for backporting to 14.


Ok!




--

gcc/ChangeLog:

PR c/116060
c/c-typeck.cc: Make sure left hand side and right hand side has
identical named types to aid diagnostic output.
cp/call.cc: Likewise.


I've also split this into one block for gcc/c/ChangeLog and one for
gcc/cp/ChangeLog as mentioned by Marek in the other review.



gcc/testsuite/ChangeLog:

PR c/116060
c-c++-common/analyzer/out-of-bounds-diagram-8.c: Update to
correct type.
c-c++-common/analyzer/out-of-bounds-diagram-11.c: Likewise.
gcc.dg/analyzer/out-of-bounds-diagram-10.c: Likewise.

Signed-off-by: Torbjörn SVENSSON 
---
  gcc/c/c-typeck.cc |  3 ++
  gcc/cp/call.cc    |  9 ++
  .../analyzer/out-of-bounds-diagram-11.c   | 28 +--
  .../analyzer/out-of-bounds-diagram-8.c    | 28 +--
  .../analyzer/out-of-bounds-diagram-10.c   | 28 +--
  5 files changed, 54 insertions(+), 42 deletions(-)

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 902898d1944..e3e85d1ecde 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -7831,6 +7831,9 @@ convert_for_assignment (location_t location, location_t 
expr_loc, tree type,
    if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (rhstype))
  {
    warn_for_address_of_packed_member (type, orig_rhs);
+  if (type != rhstype)
+    /* Convert RHS to TYPE in order to not loose TYPE in diagnostics.  */
+    rhs = convert (type, rhs);
    return rhs;
  }
diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index c8420db568e..d859ce9a2d6 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -1319,6 +1319,9 @@ standard_conversion (tree to, tree from, tree expr, bool 
c_cast_p,
  {
    if (CLASS_TYPE_P (to) && conv->kind == ck_rvalue)
  conv->type = qualified_to;
+  else if (from != to)
+    /* Use TO in order to not loose TO in diagnostics.  */


"lose"


+    conv->type = to;
    return conv;
  }
@@ -8741,6 +8744,12 @@ convert_like_internal (conversion *convs, tree expr, 
tree fn, int argnum,
 continue to warn about uses of EXPR as an integer, rather than as a
 pointer.  */
  expr = build_int_cst (totype, 0);
+  if (TREE_CODE (expr) == NON_LVALUE_EXPR && TREE_TYPE (expr) != totype)


You might check !obvalue_p (expr) instead of just NON_LVALUE_EXPR?


Appears to work as expected with !obvalue_p(expr), thanks!




+    {
+  /* Use TOTYPE in order to not loose TOTYPE in diagnostics.  */


"lose"


+   expr = copy_node (expr);
+   TREE_TYPE (expr) = totype;
+    }


Let's use cp_fold_convert instead of manually optimizing the conversion.


I've tried to use cp_fold_convert and cp_convert, but neither of them works 
(either when building, or in the case of cp_fold_convert, an ICE when 
running the regression test).
For cp_fold_convert, I see the following stack trace in the ICE (just one of 
many examples):

Testing torture/pr47333.C,   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
doing compile
Executing on host: /tmp/build/gcc/testsuite/g++/../../xg++ -B/tmp/build/gcc/testsuite/g++/../../ /home/user/gcc/gcc/testsuite/g++.dg/torture/pr47333.C -fdiagnostics-plain-output  -nostdinc++ -I/tmp/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu -I/tmp/build/x86_64-pc-linux-gnu/libstdc++-v3/include -I/home/user/gcc/libstdc++-v3/libsupc++ -I/home/user/gcc/libstdc++-v3/include/backward -I/home/user/gcc/libstdc++-v3/testsuite/util -fmessage-length=0   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  -Wno-template-body  -S -o pr47333.s    (timeout = 300)
spawn -ignore SIGHUP /tmp/build/gcc/testsuite/g++/../../xg++ -B/tmp/build/gcc/testsuite/g++/../../ /home/user/gcc/gcc/testsuite/g++.dg/torture/pr47333.C -fdiagnostics-plain-output -nostdinc++ -I/tmp/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu -I/tmp/build/x86_64-pc-linux-gnu/libstdc++-v3/include -I/home/user/gcc/libstdc++-v3/libsupc++ -I/home/user/gcc/libstdc++-v3/include/backward -I/home/user/gcc/libstdc++-v3/testsuite/util -fmessage-length=0 -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects -Wno-template-body -S -o pr47333.s

Re: [PATCH] strub: accept indirection of volatile pointer types [PR118007]

2024-12-19 Thread Richard Biener

On Thu, Dec 19, 2024 at 3:39 AM Alexandre Oliva  wrote:
>
> We don't want to indirect pointers in strub wrappers, because it
> generally isn't profitable, but if the argument is volatile, then we
> must use indirection to preserve access patterns, so amend the
> assertion check.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?

OK.

>
> for  gcc/ChangeLog
>
> PR middle-end/118007
> * ipa-strub.cc (pass_ipa_strub::execute): Accept indirecting
> volatile args of pointer types.
>
> for  gcc/testsuite/ChangeLog
>
> PR middle-end/118007
> * gcc.dg/strub-pr118007.c: New.
> ---
>  gcc/ipa-strub.cc  |   13 +++--
>  gcc/testsuite/gcc.dg/strub-pr118007.c |5 +
>  2 files changed, 12 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/strub-pr118007.c
>
> diff --git a/gcc/ipa-strub.cc b/gcc/ipa-strub.cc
> index 29ba143b4620a..6b3f5b078f29d 100644
> --- a/gcc/ipa-strub.cc
> +++ b/gcc/ipa-strub.cc
> @@ -2881,12 +2881,13 @@ pass_ipa_strub::execute (function *)
>&& (tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (nparm)))
><= 4 * UNITS_PER_WORD
> {
> - /* No point in indirecting pointer types.  Presumably they
> -won't ever pass the size-based test above, but check the
> -assumption here, because getting this wrong would mess
> -with attribute access and possibly others.  We deal with
> -fn spec below.  */
> - gcc_checking_assert (!POINTER_TYPE_P (TREE_TYPE (nparm)));
> + /* No point in indirecting pointer types, unless they're
> +volatile.  Presumably they won't ever pass the size-based
> +test above, but check the assumption here, because
> +getting this wrong would mess with attribute access and
> +possibly others.  We deal with fn spec below.  */
> + gcc_checking_assert (!POINTER_TYPE_P (TREE_TYPE (nparm))
> +  || TREE_THIS_VOLATILE (parm));
>
>   indirect_nparms.add (nparm);
>
> diff --git a/gcc/testsuite/gcc.dg/strub-pr118007.c 
> b/gcc/testsuite/gcc.dg/strub-pr118007.c
> new file mode 100644
> index 0..6c24cad652968
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/strub-pr118007.c
> @@ -0,0 +1,5 @@
> +/* { dg-require-effective-target strub } */
> +/* { dg-do compile } */
> +/* { dg-options "-fstrub=all -O2" } */
> +
> +void rb_ec_error_print(struct rb_execution_context_struct *volatile) {}
>
> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving "normal" is *not* inclusive

Re: [PATCH] add options to control ifcombine

2024-12-19 Thread Richard Biener

On Thu, Dec 19, 2024 at 6:53 AM Alexandre Oliva  wrote:
>
>
> Introduce flags to disable ifcombine as a whole, or its new components.
>
> Disable the potentially quadratic noncontiguous ifcombine at -O1.
> Adjust the tests that expected it with -O to use -O2 instead.
>
> Is this of interest?  I've made it mostly for PR118032, but turned it
> into a proper patch because it seems to make some sense to be able to
> disable these more expensive and less stable features selectively.
> Regstrapping on x86_64-linux-gnu.  Ok to install?
>
>
> for  gcc/ChangeLog
>
> * common.opt (fcombine-conditionals): New.
> (fcombine-field-conditionals): New.
> (fcombine-noncontiguous-conditionals): New.
> * doc/invoke.texi: Document them.
> * tree-ssa-ifcombine.cc (ifcombine_ifandif): Don't call
> fold_truth_andor if -fno-combine-field-conditionals.
> (tree_ssa_ifcombine_bb): Quit after the first attempt under
> -fno-combine-noncontiguous-conditionals.
> (pass_tree_ifcombine::gate): New.
>
> for  gcc/testsuite/ChangeLog
>
> * gcc.dg/field-merge-1.c: Bump to -O2.
> * gcc.dg/field-merge-2.c: Likewise.
> * gcc.dg/field-merge-3.c: Likewise.
> * gcc.dg/field-merge-4.c: Likewise.
> * gcc.dg/field-merge-5.c: Likewise.
> * gcc.dg/field-merge-6.c: Likewise.
> * gcc.dg/field-merge-7.c: Likewise.
> * gcc.dg/field-merge-8.c: Likewise.
> * gcc.dg/field-merge-9.c: Likewise.
> * gcc.dg/field-merge-10.c: Likewise.
> * gcc.dg/field-merge-11.c: Likewise.
> * gcc.dg/field-merge-13.c: Likewise.
> * gcc.dg/field-merge-14.c: Likewise.
> * gcc.dg/field-merge-15.c: Likewise.
> * gcc.dg/field-merge-16.c: Likewise.
> ---
>  gcc/common.opt|   12 +++
>  gcc/doc/invoke.texi   |   30 +++
>  gcc/testsuite/gcc.dg/field-merge-1.c  |2 +-
>  gcc/testsuite/gcc.dg/field-merge-10.c |2 +-
>  gcc/testsuite/gcc.dg/field-merge-11.c |2 +-
>  gcc/testsuite/gcc.dg/field-merge-13.c |2 +-
>  gcc/testsuite/gcc.dg/field-merge-14.c |2 +-
>  gcc/testsuite/gcc.dg/field-merge-15.c |2 +-
>  gcc/testsuite/gcc.dg/field-merge-16.c |2 +-
>  gcc/testsuite/gcc.dg/field-merge-2.c  |2 +-
>  gcc/testsuite/gcc.dg/field-merge-3.c  |2 +-
>  gcc/testsuite/gcc.dg/field-merge-4.c  |2 +-
>  gcc/testsuite/gcc.dg/field-merge-5.c  |2 +-
>  gcc/testsuite/gcc.dg/field-merge-6.c  |2 +-
>  gcc/testsuite/gcc.dg/field-merge-7.c  |2 +-
>  gcc/testsuite/gcc.dg/field-merge-8.c  |2 +-
>  gcc/testsuite/gcc.dg/field-merge-9.c  |2 +-
>  gcc/tree-ssa-ifcombine.cc |   37 ++---
>  18 files changed, 82 insertions(+), 27 deletions(-)
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 1b72826d44b11..aa5ad79741126 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1241,6 +1241,18 @@ fcode-hoisting
>  Common Var(flag_code_hoisting) Optimization
>  Enable code hoisting.
>
> +fcombine-conditionals
> +Common Var(flag_tree_ifcombine) Init(-1) Optimization
> +Combine conditionals, with the ifcombine optimization pass.
> +
> +fcombine-field-conditionals
> +Common Var(flag_tree_ifcombine_fieldmerge) Init(-1) Optimization
> +Combine conditionals involving separate fields
> +
> +fcombine-noncontiguous-conditionals
> +Common Var(flag_tree_ifcombine_noncontig) Init(-1) Optimization
> +Combine conditionals from noncontiguous blocks

Please don't use -1 initializers, instead populate
opts.cc:default_options_table.

IMO options where it's not clear how they interact are bad.  Does
-fcombine-field-conditionals enable -fcombine-conditionals?  Does
-fno-combine-field-conditionals disable it?  A
-fcombine-conditionals= might be more obvious here,
though with two independent features (fields and non-contiguous)
a set of  might be better, but that has no generic option
machinery support.

I'd go for a pragmatic solution, add -fcombine-conditionals and
document that with -fexpensive-optimizations we're enabling
non-contiguous support and a --param to limit its search depth.
I think we don't need the field-conditionals flag at this point
(fold always did this), if you want one for debugging I suggest
to add a --param for it.

>  fcombine-stack-adjustments
>  Common Var(flag_combine_stack_adjustments) Optimization
>  Looks for opportunities to reduce stack adjustments and stack references.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 8ed5536365f79..317b1dd233a66 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -570,6 +570,8 @@ Objective-C and Objective-C++ Dialects}.
>  -fassociative-math  -fauto-profile  -fauto-profile[=@var{path}]
>  -fauto-inc-dec  -fbranch-probabilities
>  -fcaller-saves
> +-fcombine-conditionals -fcombine-field-conditionals
> +-fcombine-noncontiguous-conditionals
>  -fcombine-stack-adjustments  -fconserve-st

Re: [PATCH] Fix comment typos in tree-assume.cc

2024-12-19 Thread Andrew MacLeod


I have no issues. ok by me.  I clearly need a proofreader :-)

Andrew

On 12/18/24 11:22, Andrew Carlotti wrote:

I think this counts as obvious, but I'll leave it a few days before committing
in case Andrew (or anyone else) disagrees.

gcc/ChangeLog:

* tree-assume.cc: Fix comment typos.


diff --git a/gcc/tree-assume.cc b/gcc/tree-assume.cc
index 883338bcef1e41e15a67fd015834d74319ca11af..9a934f21dc039c0b8f5af717510752d7008ed493 100644
--- a/gcc/tree-assume.cc
+++ b/gcc/tree-assume.cc
@@ -36,16 +36,16 @@ along with GCC; see the file COPYING3.  If not see
  #include "tree-cfg.h"
  #include "gimple-pretty-print.h"
  
-// An assume query utilizes the current range query to implelemtn the assume

+// An assume query utilizes the current range query to implement the assume
  // keyword.
  // For any return value of 1 from the function, it attempts to determine
-// which paths leads to a 1 value being returned. On those paths, what
+// which paths lead to a 1 value being returned. On those paths, it determines
  // the ranges of any ssa_names listed in bitmap P (usually the parm list for
-// the function) are, and combined them all.
+// the function), and combines them all.
  // These ranges are then set as the global ranges for those parms in this
  // function.
-// Other functions which then refer to this function in an assume builtin
-// will then pick up these ranges for the paramters via the inferred range
+// Other functions which refer to this function in an assume builtin
+// will then pick up these ranges for the parameters via the inferred range
  // mechanism.
  //   See gimple-range-infer.cc::gimple_infer_range::check_assume_func ()
  //
@@ -57,11 +57,11 @@ along with GCC; see the file COPYING3.  If not see
  //
  // a small temporary assume function consisting of
  // assume_f1 (int x) { return x == 1 || x == 4; }
-// is constructed by the front end, and optimzed, at the very end of
+// is constructed by the front end, and optimized, at the very end of
  // optimization, instead of generating code, we instead invoke the assume pass
  // which uses this query to set the the global value of parm x to [1,1][4,4]
  //
-// Meanwhile., my_Fund has been rewritten to be:
+// Meanwhile., my_func has been rewritten to be:
  //
  // my_func (int x_2)
  // {
@@ -70,12 +70,12 @@ along with GCC; see the file COPYING3.  If not see
  //   if (x_2 == 3)
  //
  // When ranger is processing the assume_builtin_call, it looks up the global
-// value of the paramter in assume_f1, which is [1,1][4,4].  It then registers
+// value of the parameter in assume_f1, which is [1,1][4,4].  It then registers
  // and inferred range at this statement setting the value x_2 to [1,1][4,4]
  //
-// Any uses of x_2 after this statement will now utilzie this inferred range.
+// Any uses of x_2 after this statement will now utilize this inferred range.
  //
-// When VRP precoesses if (x_2 == 3), it picks up the inferred range, and
+// When VRP processes if (x_2 == 3), it picks up the inferred range, and
  // determines that x_2 can never be 3, and will rewrite the branch to
  //   if (0 != 0)
  
@@ -109,7 +109,7 @@ assume_query::assume_query (function *f, bitmap p) : m_parm_list (p),

 m_func (f)
  {
basic_block exit_bb = EXIT_BLOCK_PTR_FOR_FN (f);
-  // If there is more than one precessor to the exit block, bail.
+  // If there is more than one predecessor to the exit block, bail.
if (!single_pred_p (exit_bb))
  return;
  
@@ -130,7 +130,7 @@ assume_query::assume_query (function *f, bitmap p) : m_parm_list (p),

if (!irange::supports_p (lhs_type))
  return;
  
-  // Only values of interest are when the return value is 1.  The defintion

+  // Only values of interest are when the return value is 1.  The definition
// of the return value must be in the same block, or we have
// complicated flow control we don't understand, and just return.
unsigned prec = TYPE_PRECISION (lhs_type);
@@ -169,7 +169,7 @@ assume_query::assume_query (function *f, bitmap p) : 
m_parm_list (p),
 }
  }
  
-// This function Will update all the current value of interesting parameters.

+// This function will update all the current values of interesting parameters.
  // It tries, in order:
  //a) a range found via path calculations.
  //b) range of the parm at SRC point in the IL. (either edge or stmt)
@@ -423,9 +423,9 @@ public:
bool gate (function *fun) final override { return fun->assume_function; }
unsigned int execute (function *fun) final override
  {
-  // Create a bitmap of all the paramters in this function.
-  // Invoke the assume_query to detemine what values these parameters
-  // have when the function returns TRUE, and set the globals value of
+  // Create a bitmap of all the parameters in this function.
+  // Invoke the assume_query to determine what values these parameters
+  // have when the function returns

[PING^2] C++ patches ping

2024-12-19 Thread Simon Martin

Hi,

Could I please have feedback on the following patches?

PR c++/109918: Unexpected -Woverloaded-virtual with virtual conversion
operators
  => https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665650.html

PR c++/114292: ICE with a generic (templated) lambda capturing a
constant for VLA allocation
  => https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671590.html

PR c++/117775: Internal compiler error when deriving from lambda
function with invalid body
  => https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671426.html

PR c++/114619: ICE with -fno-elide-constructors in C++14 mode for
non-constant initializer in array new
  => https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668373.html

PR c++/114858: Compilation Hang and Excessive RAM Consumption in GCC
with invalid input
  => https://gcc.gnu.org/pipermail/gcc-patches/2024-October/664686.html

Thanks!
Simon

Re: [PATCH] vect: Do not use partial vectors when emulating vectors [PR116351].

2024-12-19 Thread Robin Dapp

> I wonder if LOOP_VINFO_LENS is really empty here?  If not, who recorded
> the len and why did that not disable partial vectors?

It's not empty.  vectorizable_operation fills it for a vectype of vector short
(4).  Before (in vector_type_mode), we determined that a vector long (1) has an
integer mode with the same size so it is not discarded.

-- 
Regards
 Robin

Re: [PATCH] SVE intrinsics: Fold svmul and svdiv by -1 to svneg for unsigned types

2024-12-19 Thread Jennifer Schmitz



> On 19 Dec 2024, at 12:24, Richard Sandiford  wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
> Jennifer Schmitz  writes:
>> @@ -3672,6 +3673,48 @@ gimple_folder::fold_pfalse ()
>>   return nullptr;
>> }
>> 
>> +/* Convert the lhs and all non-boolean vector-type operands to TYPE.
>> +   Pass the converted variables to the callback FP, and finally convert the
>> +   result back to the original type. Add the necessary conversion 
>> statements.
>> +   Return the new call.  */
>> +gimple *
>> +gimple_folder::convert_and_fold (tree type,
>> +  gimple *(*fp) (gimple_folder &,
>> + tree, vec<tree> &))
>> +{
>> +  gcc_assert (VECTOR_TYPE_P (type)
>> +   && TYPE_MODE (type) != VNx16BImode);
>> +  tree old_ty = TREE_TYPE (lhs);
>> +  gimple_seq stmts = NULL;
>> +  tree lhs_conv, op, op_ty, t;
>> +  gimple *g, *new_stmt;
> 
> Sorry for the last-minute minor request, but: it would be nice to declare
> these at the point of initialisation, for consistency with the rest of the
> function.
Done.
> 
>> +  bool convert_lhs_p = !useless_type_conversion_p (type, old_ty);
>> +  lhs_conv = convert_lhs_p ? create_tmp_var (type) : lhs;
>> +  unsigned int num_args = gimple_call_num_args (call);
>> +  auto_vec args_conv;
>> +  args_conv.safe_grow (num_args);
>> +  for (unsigned int i = 0; i < num_args; ++i)
>> +{
>> +  op = gimple_call_arg (call, i);
>> +  op_ty = TREE_TYPE (op);
>> +  args_conv[i] =
>> + (VECTOR_TYPE_P (op_ty)
>> +  && TYPE_MODE (op_ty) != VNx16BImode
>> +  && !useless_type_conversion_p (op_ty, type))
>> + ? gimple_build (&stmts, VIEW_CONVERT_EXPR, type, op) : op;
>> +}
>> +
>> +  new_stmt = fp (*this, lhs_conv, args_conv);
>> +  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
>> +  if (convert_lhs_p)
>> +{
>> +  t = build1 (VIEW_CONVERT_EXPR, old_ty, lhs_conv);
>> +  g = gimple_build_assign (lhs, VIEW_CONVERT_EXPR, t);
>> +  gsi_insert_after (gsi, g, GSI_SAME_STMT);
>> +}
>> +  return new_stmt;
>> +}
>> +
>> /* Fold the call to constant VAL.  */
>> gimple *
>> gimple_folder::fold_to_cstu (poly_uint64 val)
>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h 
>> b/gcc/config/aarch64/aarch64-sve-builtins.h
>> index 6f22868f9b3..02ae098ed32 100644
>> --- a/gcc/config/aarch64/aarch64-sve-builtins.h
>> +++ b/gcc/config/aarch64/aarch64-sve-builtins.h
>> @@ -421,6 +421,7 @@ public:
>>   tree scalar_type (unsigned int) const;
>>   tree vector_type (unsigned int) const;
>>   tree tuple_type (unsigned int) const;
>> +  tree signed_type (unsigned int) const;
>>   unsigned int elements_per_vq (unsigned int) const;
>>   machine_mode vector_mode (unsigned int) const;
>>   machine_mode tuple_mode (unsigned int) const;
>> @@ -649,6 +650,8 @@ public:
>>   gcall *redirect_call (const function_instance &);
>>   gimple *redirect_pred_x ();
>>   gimple *fold_pfalse ();
>> +  gimple *convert_and_fold (tree, gimple *(*) (gimple_folder &,
>> +tree, vec &));
>> 
>>   gimple *fold_to_cstu (poly_uint64);
>>   gimple *fold_to_pfalse ();
>> @@ -884,6 +887,20 @@ find_type_suffix (type_class_index tclass, unsigned int 
>> element_bits)
>>   gcc_unreachable ();
>> }
>> 
>> +/* Return the type suffix of the signed type of width ELEMENT_BITS.  */
>> +inline type_suffix_index
>> +signed_type_suffix_index (unsigned int element_bits)
>> +{
>> +  switch (element_bits)
>> +  {
>> +  case 8: return TYPE_SUFFIX_s8;
>> +  case 16: return TYPE_SUFFIX_s16;
>> +  case 32: return TYPE_SUFFIX_s32;
>> +  case 64: return TYPE_SUFFIX_s64;
>> +  }
>> +  gcc_unreachable ();
>> +}
>> +
> 
> We could drop this and instead replace calls with:
> 
>  find_type_suffix (TYPE_signed, element_bits)
Done.
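
For readers following the thread, here is a minimal sketch (not part of the
patch) of the arithmetic behind folding svmul by -1 into a negation for
unsigned element types.  It uses plain C on 32-bit scalars rather than SVE
intrinsics, and the names mul_by_minus_one and negate_bits are made up for
illustration.  Multiplying by the all-ones constant and negating modulo 2^32
give the same bits, which is presumably why the fold can operate on a signed
view of the operands and convert the result back.

```c
#include <stdint.h>

/* Multiplying an unsigned value by -1 means multiplying by the all-ones
   constant; the result wraps modulo 2^32.  */
uint32_t
mul_by_minus_one (uint32_t x)
{
  return x * UINT32_MAX;
}

/* Two's-complement negation of the same bits, written with unsigned
   arithmetic so that negating 0x80000000 stays well defined in C.  */
uint32_t
negate_bits (uint32_t x)
{
  return (uint32_t) 0 - x;
}
```

The sketch only covers the multiplication case; it makes no claim about
division.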
> 
> 
>> /* Return the single field in tuple type TYPE.  */
>> inline tree
>> tuple_type_field (tree type)
>> @@ -1049,6 +1066,20 @@ function_instance::tuple_type (unsigned int i) const
>>   return acle_vector_types[num_vectors - 1][type_suffix (i).vector_type];
>> }
>> 
>> +/* Return the signed vector type of width ELEMENT_BITS.  */
>> +inline tree
>> +function_instance::signed_type (unsigned int element_bits) const
>> +{
>> +  switch (element_bits)
>> +  {
>> +  case 8: return acle_vector_types[0][VECTOR_TYPE_svint8_t];
>> +  case 16: return acle_vector_types[0][VECTOR_TYPE_svint16_t];
>> +  case 32: return acle_vector_types[0][VECTOR_TYPE_svint32_t];
>> +  case 64: return acle_vector_types[0][VECTOR_TYPE_svint64_t];
>> +  }
>> +  gcc_unreachable ();
>> +}
>> +
> 
> And for this, I think we should instead make:
> 
> /* Return the vector type associated with TYPE.  */
> static tree
> get_vector_type (sve_type type)
> {
>  auto vector_type = type_suffixes[type.type].vector_type;
>  return acle_vector_types[type.num_vectors - 1][vector_type];
> }
> 
> public, or perhaps just define it inline in aarch64-sve-builtins.h.
Done. Thanks for pointing out the existing fun

Re: [PATCH v3 0/5] c++/modules: Implement P1815 "Translation-unit-local entities"

2024-12-19 Thread Nathaniel Shead

On Thu, Dec 19, 2024 at 10:38:09AM -0500, Jason Merrill wrote:
> On 12/18/24 9:17 AM, Nathaniel Shead wrote:
> > On Tue, Dec 17, 2024 at 03:58:38PM -0500, Jason Merrill wrote:
> > > On 11/27/24 3:53 AM, Nathaniel Shead wrote:
> > > > Gentle ping for this series:
> > > > https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665108.html
> > > > 
> > > > Most of the patches no longer applied cleanly to trunk since the last
> > > > time I pinged this so I'm attaching newly rebased patches.
> > > > 
> > > > One slight adjustment I've included as well is a test in internal-4_b.C
> > > > for exposures of namespace aliases, as in:
> > > > 
> > > > namespace { namespace internal {} }
> > > > export namespace exposure = internal;
> > > > 
> > > > By the standard this appears to be well-formed; we currently error, and
> > > > I think this might be the desired behaviour (an easy workaround is to
> > > > wrap the alias in an anonymous namespace), but thought I'd at least test
> > > > the existing behaviour here.
> > > > 
> > > > Tested modules.exp on x86_64-pc-linux-gnu, OK for trunk if full
> > > > bootstrap+regtest succeeds?
> > > 
> > > I tweaked some of these patches a bit; OK with these changes, or without
> > > patch #2 if you'd rather not make that change.
> > 
> > Thanks.  I will include patch #2 (with an additional line in invoke.texi
> > to note that the presence of explicit instantiations will silence the
> > warning.)
> > 
> > >  From 03134a00bdc0f53cb30fde284808be71811e29e7 Mon Sep 17 00:00:00 2001
> > > From: Jason Merrill 
> > > Date: Wed, 11 Dec 2024 11:02:34 -0500
> > > Subject: [PATCH] c++: adjust "Detect exposures of TU-local entities"
> > > To: gcc-patches@gcc.gnu.org
> > > 
> > > This patch adjusts the handling of the injected-class-name to be the same 
> > > as
> > > the class name itself.
> > > 
> > > gcc/cp/ChangeLog:
> > > 
> > >   * tree.cc (decl_linkage): Treat DECL_SELF_REFERENCE_P like
> > >   DECL_IMPLICIT_TYPEDEF_P.
> > >   * module.cc (depset::hash::is_tu_local_entity): Likewise.
> > >   (depset::hash::is_tu_local_value): Fix formatting.
> > > ---
> > >   gcc/cp/module.cc | 10 +-
> > >   gcc/cp/tree.cc   | 16 
> > >   2 files changed, 13 insertions(+), 13 deletions(-)
> > > 
> > > diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> > > index 5d11e85f41d..823884e97f3 100644
> > > --- a/gcc/cp/module.cc
> > > +++ b/gcc/cp/module.cc
> > > @@ -13247,10 +13247,11 @@ depset::hash::is_tu_local_entity (tree decl, 
> > > bool explain/*=false*/)
> > >   {
> > > gcc_checking_assert (DECL_P (decl));
> > > > -  /* An explicit type alias is not an entity, and so is never TU-local.  */
> > > +  /* An explicit type alias is not an entity, and so is never TU-local.
> > > + Neither are the built-in declarations of 'int' and such.  */
> > > if (TREE_CODE (decl) == TYPE_DECL
> > > -  && !DECL_IMPLICIT_TYPEDEF_P (decl)
> > > -  && !DECL_SELF_REFERENCE_P (decl))
> > > +  && (is_typedef_decl (decl)
> > > > +   || !OVERLOAD_TYPE_P (TREE_TYPE (decl))))
> > >   return false;
> > > location_t loc = DECL_SOURCE_LOCATION (decl);
> > > @@ -13348,7 +13349,6 @@ depset::hash::is_tu_local_entity (tree decl, bool 
> > > explain/*=false*/)
> > >these aren't really TU-local.  */
> > > if (TREE_CODE (decl) == TYPE_DECL
> > > && TYPE_ANON_P (type)
> > > -  && !DECL_SELF_REFERENCE_P (decl)
> > > /* An enum with an enumerator name for linkage.  */
> > > && !(UNSCOPED_ENUM_P (type) && TYPE_VALUES (type)))
> > >   {
> > > @@ -13473,7 +13473,7 @@ depset::hash::is_tu_local_value (tree decl, tree 
> > > expr, bool explain)
> > >of reference type refer is TU-local and is usable in constant
> > >expressions.  */
> > > if (TREE_CODE (e) == CONSTRUCTOR && AGGREGATE_TYPE_P (TREE_TYPE (e)))
> > > -for (auto& f : CONSTRUCTOR_ELTS (e))
> > > +for (auto &f : CONSTRUCTOR_ELTS (e))
> > > if (is_tu_local_value (decl, f.value, explain))
> > >   return true;
> > > diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
> > > index 939d2b060fb..260c16418a1 100644
> > > --- a/gcc/cp/tree.cc
> > > +++ b/gcc/cp/tree.cc
> > > @@ -5894,17 +5894,17 @@ decl_linkage (tree decl)
> > >linkage first, and then transform that into a concrete
> > >implementation.  */
> > > -  /* An explicit type alias has no linkage.  */
> > > +  /* An explicit type alias has no linkage.  Nor do the built-in 
> > > declarations
> > > + of 'int' and such.  */
> > > if (TREE_CODE (decl) == TYPE_DECL
> > > -  && !DECL_IMPLICIT_TYPEDEF_P (decl)
> > > -  && !DECL_SELF_REFERENCE_P (decl))
> > > +  && !DECL_IMPLICIT_TYPEDEF_P (decl))
> > >   {
> > > -  /* But this could be a typedef name for linkage purposes, in which
> > > -  case we're interested in the linkage of the main decl.  */
> > > -  if (decl == TYPE_NAME (TYPE_MAIN_VARIANT (TREE_TYPE (decl
> > > - decl = TYPE_MAIN_DECL (TREE_TYPE (

Re: [PATCH 1/2] RISC-V: Add intrinsics support for SiFive Xsfvcp extensions.

2024-12-19 Thread Kito Cheng

> diff --git a/gcc/config/riscv/genrvv-type-indexer.cc 
> b/gcc/config/riscv/genrvv-type-indexer.cc
> index a2974269adc..e3b845d156e 100644
> --- a/gcc/config/riscv/genrvv-type-indexer.cc
> +++ b/gcc/config/riscv/genrvv-type-indexer.cc
> @@ -303,6 +303,8 @@ main (int argc, const char **argv)
> fprintf (fp, "  /*UNSIGNED_EEW%d_LMUL1_INTERPRET*/ %s,\n", eew,
>  inttype (eew, LMUL1_LOG2, /* unsigned_p */true).c_str ());
>
> +   fprintf (fp, "  /*X2*/ INVALID,\n");
> +

We don't need X2 here. Could you check how wadd.vv and wadd.vx are
implemented? Using double_trunc_vector should work in this case once you
adjust the base SEW/LMUL correctly.

Re: [PATCH] c++, v2: Disallow [[deprecated]] on types other than class/enum definitions [PR110345]

2024-12-19 Thread Jakub Jelinek

On Wed, Dec 18, 2024 at 04:44:55PM +0100, Jakub Jelinek wrote:
> The first check would flag something that is used in the wild, e.g.
> g++.dg/Wmissing-attributes.C
> gcc.dg/gnu23-attrs-2.c
> g++.dg/cpp0x/gen-attrs-81.C
> g++.dg/warn/Wdangling-reference17.C
> g++.dg/warn/Wdangling-reference20.C
> tests would be affected by it (at least if pedantic), including
>  header which uses this on
>   /// If you write a replacement %unexpected handler, it must be of this type.
>   typedef void (*_GLIBCXX11_DEPRECATED unexpected_handler) ();
> (not to mention the diagnostic wording is C++ish).
> E.g. gnu23-attrs-2.c has
> typedef int A[2];
> __typeof__ (int [[gnu::deprecated]]) var1; /* { dg-warning "deprecated" } */
> __typeof__ (A [[gnu::deprecated]]) var2; /* { dg-warning "deprecated" } */
> __typeof__ (int [3] [[gnu::deprecated]]) var3; /* { dg-warning "deprecated" } */
> tests.

E.g.
typedef int * D * T;
T b;
typedef __typeof__ (*b) U;
currently works both in C and C++ for D [[gnu::deprecated]] and
__attribute__((deprecated)) and warns
a.C:12:1: warning: type is deprecated [-Wdeprecated-declarations]
   12 | typedef __typeof__ (*b) U;
  | ^~~

Jakub

Re: [Fortran, Patch, PR57598] Fix coarray STOP

2024-12-19 Thread Damian Rouson

I don’t think the standard requires providing the stop code to the OS, but
it recommends doing so.  So this is a great idea.  Thanks for working on
coarray features.

Damian

On Thu, Dec 19, 2024 at 04:14 Andre Vehreschild  wrote:

> Hi all,
>
> attached patch fixes a rather old open issue, that I stumbled upon
> while trying to figure, why a test failed on the command line but not
> in the testsuite. The implementation of the STOP command in caf_single
> did not hand the errorcode over to the OS, as does non-caf STOP and as
> it is required by the standard. So I fixed that. I also added reporting
> of exceptions to the coarray (ERROR)? STOP routines. For this I have
> exported the existing function of the regular gfortran runtime library.
> I tried to do this via iexport_proto, but was never able to access the
> routine from the caf-library. I always got linker errors.
>
> After fixing caf-STOP the testsuite reported one regression, which I
> also fixed in send_by_ref.
>
> Bootstrapped and regtests ok on x86_64-pc-linux-gnu / F41. Ok for
> mainline?
>
> Regards,
> Andre
> --
> Andre Vehreschild * Email: vehre ad gcc dot gnu dot org
>

Re: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2024-12-19 Thread Richard Biener

On Wed, Dec 18, 2024 at 6:30 PM Jennifer Schmitz  wrote:
>
>
>
> > On 17 Dec 2024, at 18:57, Richard Biener  wrote:
> >
> >> Am 16.12.2024 um 09:10 schrieb Jennifer Schmitz :
> >>
> >> 
> >>
> >>> On 14 Dec 2024, at 09:32, Richard Biener  wrote:
> >>>
> > Am 13.12.2024 um 18:00 schrieb Jennifer Schmitz :
> 
>  
> 
> > On 13 Dec 2024, at 13:40, Richard Biener  
> > wrote:
> >
> >> On Thu, Dec 12, 2024 at 5:27 PM Jennifer Schmitz  
> >> wrote:
> >>
> >>
> >>
> >>> On 6 Dec 2024, at 08:41, Jennifer Schmitz  wrote:
> >>>
> >>>
> >>>
>  On 5 Dec 2024, at 20:07, Richard Sandiford 
>   wrote:
> 
>  Jennifer Schmitz  writes:
> >> On 5 Dec 2024, at 11:44, Richard Biener  wrote:
> >>
> >> On Thu, 5 Dec 2024, Jennifer Schmitz wrote:
> >>
> >>>
> >>>
>  On 17 Oct 2024, at 19:23, Richard Sandiford 
>   wrote:
> 
>  Jennifer Schmitz  writes:
> > [...]
> > Looking at the diff of the vect dumps (below is a section of 
> > the diff for strided_store_2.c), it seemed odd that 
> > vec_to_scalar operations cost 0 now, instead of the previous 
> > cost of 2:
> >
> > +strided_store_1.c:38:151: note:=== vectorizable_operation ===
> > +strided_store_1.c:38:151: note:vect_model_simple_cost: inside_cost = 1, prologue_cost = 0 .
> > +strided_store_1.c:38:151: note:   ==> examining statement: *_6 = _7;
> > +strided_store_1.c:38:151: note:   vect_is_simple_use: operand _3 + 1.0e+0, type of def:internal
> > +strided_store_1.c:38:151: note:   Vectorizing an unaligned access.
> > +Applying pattern match.pd:236, generic-match-9.cc:4128
> > +Applying pattern match.pd:5285, generic-match-10.cc:4234
> > +strided_store_1.c:38:151: note:   vect_model_store_cost: inside_cost = 12, prologue_cost = 0 .
> > *_2 1 times unaligned_load (misalign -1) costs 1 in body
> > -_3 + 1.0e+0 1 times scalar_to_vec costs 1 in prologue
> > _3 + 1.0e+0 1 times vector_stmt costs 1 in body
> > -_7 1 times vec_to_scalar costs 2 in body
> > + 1 times vector_load costs 1 in prologue
> > +_7 1 times vec_to_scalar costs 0 in body
> > _7 1 times scalar_store costs 1 in body
> > -_7 1 times vec_to_scalar costs 2 in body
> > +_7 1 times vec_to_scalar costs 0 in body
> > _7 1 times scalar_store costs 1 in body
> > -_7 1 times vec_to_scalar costs 2 in body
> > +_7 1 times vec_to_scalar costs 0 in body
> > _7 1 times scalar_store costs 1 in body
> > -_7 1 times vec_to_scalar costs 2 in body
> > +_7 1 times vec_to_scalar costs 0 in body
> > _7 1 times scalar_store costs 1 in body
> >
> > Although the aarch64_use_new_vector_costs_p flag was used in 
> > multiple places in aarch64.cc, the location that causes this 
> > behavior is this one:
> > unsigned
> > aarch64_vector_costs::add_stmt_cost (int count, 
> > vect_cost_for_stmt kind,
> > stmt_vec_info stmt_info, slp_tree,
> > tree vectype, int misalign,
> > vect_cost_model_location where)
> > {
> > [...]
> > /* Try to get a more accurate cost by looking at STMT_INFO 
> > instead
> > of just looking at KIND.  */
> > -  if (stmt_info && aarch64_use_new_vector_costs_p ())
> > +  if (stmt_info)
> > {
> > /* If we scalarize a strided store, the vectorizer costs one
> > vec_to_scalar for each element.  However, we can store the first
> > element using an FP store without a separate extract step.  */
> > if (vect_is_store_elt_extraction (kind, stmt_info))
> > count -= 1;
> >
> > stmt_cost = aarch64_detect_scalar_stmt_subtype (m_vinfo, kind,
> >

Re: [PATCH 7/7]AArch64: Implement vector concat of partial SVE vectors

2024-12-19 Thread Richard Sandiford

Tamar Christina  writes:
>> >  ;; 2 element quad vector modes.
>> >  (define_mode_iterator VQ_2E [V2DI V2DF])
>> >
>> > @@ -1678,7 +1686,15 @@ (define_mode_attr VHALF [(V8QI "V4QI")  (V16QI
>> "V8QI")
>> > (V2DI "DI")(V2SF  "SF")
>> > (V4SF "V2SF")  (V4HF "V2HF")
>> > (V8HF "V4HF")  (V2DF  "DF")
>> > -   (V8BF "V4BF")])
>> > +   (V8BF "V4BF")
>> > +   (VNx16QI "VNx8QI") (VNx8QI "VNx4QI")
>> > +   (VNx4QI "VNx2QI")  (VNx2QI "QI")
>> > +   (VNx8HI "VNx4HI")  (VNx4HI "VNx2HI") (VNx2HI "HI")
>> > +   (VNx8HF "VNx4HF")  (VNx4HF "VNx2HF") (VNx2HF "HF")
>> > +   (VNx8BF "VNx4BF")  (VNx4BF "VNx2BF") (VNx2BF "BF")
>> > +   (VNx4SI "VNx2SI")  (VNx2SI "SI")
>> > +   (VNx4SF "VNx2SF")  (VNx2SF "SF")
>> > +   (VNx2DI "DI")  (VNx2DF "DF")])
>> 
>> Are the x2 entries necessary, given that the new uses are restricted
>> to NO2E?
>> 
>
> No, but I wanted to keep the symmetry with the Adv. SIMD modes.   Since the
> mode attributes don't really control the number of alternatives I thought it would
> be better to have the attributes be "fully" defined rather than only the subset I use.

But these are variable-length modes, so DI is only half of VNx2DI for
the minimum vector length.  It's less than half for Neoverse V1 or A64FX.

IMO it'd be better to leave them out for now and define them when needed,
at which point the right choice would be more obvious.
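
To make the size argument concrete, here is a small illustrative sketch (not
from the patch).  It assumes a full SVE mode such as VNx2DI occupies the
whole vector register, so its width equals the hardware vector length; the
helper names are invented.  As far as I know, Neoverse V1 implements 256-bit
SVE and A64FX 512-bit SVE, which are the two larger lengths worth checking.

```c
#include <stdbool.h>

/* Width in bits of a 64-bit scalar mode such as DI.  */
enum { DI_BITS = 64 };

/* A full SVE vector mode such as VNx2DI spans the whole vector
   register, so its size equals the hardware vector length.  */
unsigned int
vnx2di_bits (unsigned int vl_bits)
{
  return vl_bits;
}

/* True if DI is exactly half of VNx2DI at this vector length.  */
bool
di_is_half_of_vnx2di (unsigned int vl_bits)
{
  return 2 * DI_BITS == vnx2di_bits (vl_bits);
}
```

Only the minimum 128-bit length satisfies the check, which matches the
objection to mapping VNx2DI to DI in VHALF.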

Thanks,
Richard

>
> gcc/ChangeLog:
>
>   PR target/96342
>   * config/aarch64/aarch64-sve.md (vec_init): New.
>   (@aarch64_pack_partial): New.
>   * config/aarch64/aarch64.cc (aarch64_sve_expand_vector_init_subvector): 
> New.
>   * config/aarch64/iterators.md (SVE_NO2E): New.
>   (VHALF, Vhalf): Add SVE partial vectors.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/96342
>   * gcc.target/aarch64/vect-simd-clone-2.c: New test.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> -m32, -m64 and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> -- inline copy of patch --
>
> diff --git a/gcc/config/aarch64/aarch64-sve.md 
> b/gcc/config/aarch64/aarch64-sve.md
> index 
> a72ca2a500d394598268c6adfe717eed94a304b3..8ed4221dbe5c49db97b37f186365fa391900eadb
>  100644
> --- a/gcc/config/aarch64/aarch64-sve.md
> +++ b/gcc/config/aarch64/aarch64-sve.md
> @@ -2839,6 +2839,16 @@ (define_expand "vec_init"
>}
>  )
>  
> +(define_expand "vec_init"
> +  [(match_operand:SVE_NO2E 0 "register_operand")
> +   (match_operand 1 "")]
> +  "TARGET_SVE"
> +  {
> +aarch64_sve_expand_vector_init (operands[0], operands[1]);
> +DONE;
> +  }
> +)
> +
>  ;; Shift an SVE vector left and insert a scalar into element 0.a
>  (define_insn "vec_shl_insert_"
>[(set (match_operand:SVE_FULL 0 "register_operand")
> @@ -9289,6 +9299,19 @@ (define_insn "vec_pack_trunc_"
>"uzp1\t%0., %1., %2."
>  )
>  
> +;; Integer partial pack packing two partial SVE types into a single full SVE
> +;; type of the same element type.  Use UZP1 on the wider type, which discards
> +;; the high part of each wide element.  This allows to concat SVE partial 
> types
> +;; into a wider vector.
> +(define_insn "@aarch64_pack_partial"
> +  [(set (match_operand:SVE_NO2E 0 "register_operand" "=w")
> + (vec_concat:SVE_NO2E
> +   (match_operand: 1 "register_operand" "w")
> +   (match_operand: 2 "register_operand" "w")))]
> +  "TARGET_SVE"
> +  "uzp1\t%0., %1., %2."
> +)
> +
>  ;; -
>  ;;  [INT<-INT] Unpacks
>  ;; -
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 
> de4c0a0783912b54ac35d7c818c24574b27a4ca0..40214e318f3c4e30e619d96073b253887c973efc
>  100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -24859,6 +24859,17 @@ aarch64_sve_expand_vector_init (rtx target, rtx vals)
>  v.quick_push (XVECEXP (vals, 0, i));
>v.finalize ();
>  
> +  /* If we have two elements and are concatting vector.  */
> +  machine_mode elem_mode = GET_MODE (v.elt (0));
> +  if (nelts == 2 && VECTOR_MODE_P (elem_mode))
> +{
> +  /* We've failed expansion using a dup.  Try using a cheeky truncate. */
> +  rtx arg0 = force_reg (elem_mode, v.elt(0));
> +  rtx arg1 = force_reg (elem_mode, v.elt(1));
> +  emit_insn (gen_aarch64_pack_partial (mode, target, arg0, arg1));
> +  return;
> +}
> +
>/* If neither sub-vectors of v could be initialized specially,
>   then use INSR to insert all elements from v into TARGET.
>   ??? This might not be optimal for vectors with large
> @@ -24870,6 +24881,30 @@ aarch64_sve_expand_vector_init (rtx target, rtx vals)
>  aarch64

Re: [PATCH] avoid trying to set block in barriers [PR113506]

2024-12-19 Thread Richard Biener

On Thu, Dec 19, 2024 at 3:36 AM Alexandre Oliva  wrote:
>
>
> When we emit a sequence before a preexisting insn and naming a BB to
> store in the insns, we will attempt to store the BB even in barriers
> present in the sequence.
>
> Barriers don't expect blocks, and rtl checking catches the problem.
>
> When emitting after a preexisting insn, we skip the block setting in
> barriers.  Change the before emitter to do so as well.
>
> Regstrapped on x86_64-linux-gnu.  Testcase used to reproduce the problem
> and to confirm the fix with a cross to riscv32-elf configured with rtl
> checking.  Ok to install?

OK
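
For anyone skimming the archive, the control flow of the fix can be sketched
with a toy model.  This is not GCC code: struct toy_insn, the TOY_* kinds and
toy_add_insn_before are hypothetical stand-ins for rtx_insn, BARRIER_P and
add_insn_before, and block 0 stands for "no block".  The point is only that a
barrier returns early and never gets a block stored, even when the caller
names one.

```c
#include <stdbool.h>

/* Toy stand-ins for RTL insns; block 0 means no block recorded.  */
enum toy_kind { TOY_INSN, TOY_BARRIER };

struct toy_insn
{
  enum toy_kind kind;
  int bb;
};

/* Mirror the patched logic: link INSN before BEFORE (linking itself is
   omitted here), then record a block unless INSN is a barrier.  */
void
toy_add_insn_before (struct toy_insn *insn, const struct toy_insn *before,
		     int bb)
{
  /* Barriers never carry a block, whatever the caller passed.  */
  if (insn->kind == TOY_BARRIER)
    return;

  /* With no explicit block, inherit it from BEFORE unless BEFORE is
     itself a barrier.  */
  if (!bb && before->kind != TOY_BARRIER)
    bb = before->bb;

  if (bb)
    insn->bb = bb;
}
```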

>
> for  gcc/ChangeLog
>
> PR middle-end/113506
> * emit-rtl.cc (add_insn_before): Don't set the block of a
> barrier.
>
> for  gcc/testsuite/ChangeLog
>
> PR middle-end/113506
> * gcc.target/riscv/pr113506.c: New.
> ---
>  gcc/emit-rtl.cc   |6 --
>  gcc/testsuite/gcc.target/riscv/pr113506.c |   15 +++
>  2 files changed, 19 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr113506.c
>
> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
> index a556692e8a02a..3af6849a29bc6 100644
> --- a/gcc/emit-rtl.cc
> +++ b/gcc/emit-rtl.cc
> @@ -4369,9 +4369,11 @@ add_insn_before (rtx_insn *insn, rtx_insn *before, 
> basic_block bb)
>  {
>add_insn_before_nobb (insn, before);
>
> +  if (BARRIER_P (insn))
> +return;
> +
>if (!bb
> -  && !BARRIER_P (before)
> -  && !BARRIER_P (insn))
> +  && !BARRIER_P (before))
>  bb = BLOCK_FOR_INSN (before);
>
>if (bb)
> diff --git a/gcc/testsuite/gcc.target/riscv/pr113506.c 
> b/gcc/testsuite/gcc.target/riscv/pr113506.c
> new file mode 100644
> index 0..404dda9fd532d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr113506.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fchecking=1 -Os -fno-tree-coalesce-vars -finline-stringops" } */
> +
> +typedef unsigned v32su __attribute__((vector_size (32)));
> +
> +v32su foo_v32su_4;
> +
> +unsigned
> +foo (v32su v32su_2)
> +{
> +  v32su_2 *= v32su_2;
> +  if (foo_v32su_4[3])
> +v32su_2 &= (v32su){};
> +  return v32su_2[1];
> +}
>
>
> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>    Free Software Activist                   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive

Re: [PATCH] testsuite: arm: C++26 uses __equal() instead of operator==()

2024-12-19 Thread Torbjorn SVENSSON





On 2024-12-19 11:48, Richard Earnshaw (lists) wrote:

On 18/12/2024 19:57, Torbjörn SVENSSON wrote:

Ok for trunk?

--

Update test case to align with used function in C++26.

gcc/testsuite/ChangeLog:

* g++.dg/abi/arm_rtti1.C: Check for expected symbol in C++26.

Signed-off-by: Torbjörn SVENSSON 


OK.


Pushed as r15-6365-g898f333413d.

Kind regards,
Torbjörn



R.

---
  gcc/testsuite/g++.dg/abi/arm_rtti1.C | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/abi/arm_rtti1.C 
b/gcc/testsuite/g++.dg/abi/arm_rtti1.C
index 74f00033d9a..5ebae26e670 100644
--- a/gcc/testsuite/g++.dg/abi/arm_rtti1.C
+++ b/gcc/testsuite/g++.dg/abi/arm_rtti1.C
@@ -2,7 +2,8 @@
  // { dg-options "-O2" }
  // Check that, even when optimizing, we emit an out-of-line call to
  // the type-info comparison function.
-// { dg-final { scan-assembler _ZNKSt9type_infoeqERKS_ } }
+// { dg-final { scan-assembler _ZNKSt9type_infoeqERKS_ { target { ! c++26 } } } }
+// { dg-final { scan-assembler _ZNKSt9type_info7__equalERKS_ { target { c++26 } } } }
  
  #include

[Fortran, Patch, PR57598] Fix coarray STOP

2024-12-19 Thread Andre Vehreschild

Hi all,

The attached patch fixes a rather old open issue that I stumbled upon
while trying to figure out why a test failed on the command line but not
in the testsuite. The implementation of the STOP command in caf_single
did not hand the error code over to the OS, as non-caf STOP does and as
it is required by the standard. So I fixed that. I also added reporting
of exceptions to the coarray (ERROR)? STOP routines. For this I have
exported the existing function of the regular gfortran runtime library.
I tried to do this via iexport_proto, but was never able to access the
routine from the caf-library. I always got linker errors.
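
A minimal sketch of the behaviour change, for illustration only: the helper
toy_stop_status is hypothetical and is not the library routine.  It prints
the STOP line unless quiet is set and returns the status that should reach
the OS, which is the stop code itself rather than a hard-coded 0.

```c
#include <stdbool.h>
#include <stdio.h>

/* Print the STOP message to OUT unless QUIET is set, and return the
   process exit status.  The pre-patch caf_single code effectively
   used 0 here regardless of STOP_CODE.  */
int
toy_stop_status (int stop_code, bool quiet, FILE *out)
{
  if (!quiet)
    fprintf (out, "STOP %d\n", stop_code);
  return stop_code;
}
```

A caller would finish with exit (toy_stop_status (code, quiet, stderr)), so
a shell checking $? sees the stop code.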

After fixing caf-STOP the testsuite reported one regression, which I
also fixed in send_by_ref.
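
The send_by_ref part is easier to see with numbers.  The sketch below is an
assumption-laden model, not libgfortran code: it supposes the usual
descriptor convention where an element's linear index is the offset plus the
sum of index times stride over all dimensions, with lower bounds of 1, so the
offset has to be minus the sum of the strides.  The function name
toy_descriptor_offset is invented.

```c
#include <stddef.h>

/* Compute the descriptor offset for a contiguous array with lower
   bound 1 in every dimension.  Each dimension contributes minus its
   stride, so that index (1, ..., 1) maps to linear position 0.  */
ptrdiff_t
toy_descriptor_offset (const size_t *extent, size_t rank)
{
  ptrdiff_t offset = 0;
  ptrdiff_t stride = 1;

  for (size_t d = 0; d < rank; ++d)
    {
      offset -= stride;
      stride *= (ptrdiff_t) extent[d];
    }
  return offset;
}
```

For extents 3 and 4 the strides are 1 and 3, giving an offset of -4, whereas
overwriting the offset with minus the last extent would have produced -4 only
by coincidence in this case and the wrong value in general.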

Bootstrapped and regtests ok on x86_64-pc-linux-gnu / F41. Ok for
mainline?

Regards,
Andre
--
Andre Vehreschild * Email: vehre ad gcc dot gnu dot org
From f6e3e34c33be7e8d8753079b9b26f9f4044ccd26 Mon Sep 17 00:00:00 2001
From: Andre Vehreschild 
Date: Wed, 18 Dec 2024 12:43:39 +0100
Subject: [PATCH] Fortran: Fix caf_stop_numeric and reporting exceptions from
 caf [PR57598]

Caf_stop_numeric always exited with code 0, which is wrong.  It shall
behave like regular stop.  Add reporting exceptions to caf's stop
handlers.  For this the existing library routine had to be exported.

libgfortran/ChangeLog:

	PR fortran/57598

	* caf/single.c (_gfortran_caf_stop_numeric): Report exceptions
	on stop. And fix send_by_ref.
	(_gfortran_caf_stop_str): Same.
	(_gfortran_caf_error_stop_str): Same.
	(_gfortran_caf_error_stop): Same.
	* gfortran.map: Add report_exception for export.
	* libgfortran.h (report_exception): Add to internal export.
	* runtime/stop.c (report_exception): Same.
---
 libgfortran/caf/single.c   | 19 +++
 libgfortran/gfortran.map   |  1 +
 libgfortran/libgfortran.h  |  3 +++
 libgfortran/runtime/stop.c |  7 +--
 4 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/libgfortran/caf/single.c b/libgfortran/caf/single.c
index 41da970e830..0ffbffa1d2b 100644
--- a/libgfortran/caf/single.c
+++ b/libgfortran/caf/single.c
@@ -263,13 +263,17 @@ _gfortran_caf_sync_images (int count __attribute__ ((unused)),
 *stat = 0;
 }

+extern void _gfortran_report_exception (void);

 void
 _gfortran_caf_stop_numeric(int stop_code, bool quiet)
 {
   if (!quiet)
-fprintf (stderr, "STOP %d\n", stop_code);
-  exit (0);
+{
+  _gfortran_report_exception ();
+  fprintf (stderr, "STOP %d\n", stop_code);
+}
+  exit (stop_code);
 }


@@ -278,6 +282,7 @@ _gfortran_caf_stop_str(const char *string, size_t len, bool quiet)
 {
   if (!quiet)
 {
+  _gfortran_report_exception ();
   fputs ("STOP ", stderr);
   while (len--)
 	fputc (*(string++), stderr);
@@ -292,6 +297,7 @@ _gfortran_caf_error_stop_str (const char *string, size_t len, bool quiet)
 {
   if (!quiet)
 {
+  _gfortran_report_exception ();
   fputs ("ERROR STOP ", stderr);
   while (len--)
 	fputc (*(string++), stderr);
@@ -373,7 +379,10 @@ void
 _gfortran_caf_error_stop (int error, bool quiet)
 {
   if (!quiet)
-fprintf (stderr, "ERROR STOP %d\n", error);
+{
+  _gfortran_report_exception ();
+  fprintf (stderr, "ERROR STOP %d\n", error);
+}
   exit (error);
 }

@@ -2131,14 +2140,16 @@ send_by_ref (caf_reference_t *ref, size_t *i, size_t *src_index,
 	  /* Assume that the rank and the dimensions fit for copying src
 		 to dst.  */
 	  GFC_DESCRIPTOR_DTYPE (dst) = GFC_DESCRIPTOR_DTYPE (src);
+	  GFC_DESCRIPTOR_SPAN (dst) = GFC_DESCRIPTOR_SPAN (src);
 	  stride_dst = 1;
+	  dst->offset = 0;
 	  for (size_t d = 0; d < src_rank; ++d)
 		{
 		  extent_dst = GFC_DIMENSION_EXTENT (src->dim[d]);
 		  GFC_DIMENSION_LBOUND (dst->dim[d]) = 1;
 		  GFC_DIMENSION_UBOUND (dst->dim[d]) = extent_dst;
 		  GFC_DIMENSION_STRIDE (dst->dim[d]) = stride_dst;
-		  dst->offset = -extent_dst;
+		  dst->offset -= stride_dst;
 		  stride_dst *= extent_dst;
 		}
 	  /* Null the data-pointer to make register_component allocate
diff --git a/libgfortran/gfortran.map b/libgfortran/gfortran.map
index f58edc52e3c..851df211eee 100644
--- a/libgfortran/gfortran.map
+++ b/libgfortran/gfortran.map
@@ -1997,4 +1997,5 @@ GFORTRAN_15 {
 _gfortran_sminloc1_8_m2;
 _gfortran_sminloc1_8_m4;
 _gfortran_sminloc1_8_m8;
+_gfortran_report_exception;
 } GFORTRAN_14;
diff --git a/libgfortran/libgfortran.h b/libgfortran/libgfortran.h
index aaa9222c43b..cf3dda07d3d 100644
--- a/libgfortran/libgfortran.h
+++ b/libgfortran/libgfortran.h
@@ -986,6 +986,9 @@ internal_proto(filename_from_unit);

 /* stop.c */

+extern void report_exception (void);
+iexport_proto (report_exception);
+
 extern _Noreturn void stop_string (const char *, size_t, bool);
 export_proto(stop_string);

diff --git a/libgfortran/runtime/stop.c b/libgfortran/runtime/stop.c
index 2eefe21a9e9..3ac5beff6bb 100644
--- a/libgfortran/runtime/stop.c
+++ b/libgfortran/runtime/stop.c
@@ -38,7 +38,10 @@ se

Re: [PATCH] testsuite: arm: Check for short circuit instructions [PR103298]

2024-12-19 Thread Richard Earnshaw (lists)

On 18/12/2024 16:24, Torbjörn SVENSSON wrote:
> Changes since v1:
> 
> - Updated the commit message to reflect the changes (including the subject).
> - Replaced the POP/BEQ checks with checks for {cmp,mov,orr,and}{eq,ne}.
> - Removed the size check
> 
> 
> Ok for trunk and releases/gcc-14?
> Should I also push this to releases/gcc-13 and releases/gcc-12 as this is a
> regression in r12-5301-g04520645038?
> 
> --
> 
> Instead of checking that a certain transformation is not used by
> counting the number of return instructions and the number of BEQ
> instructions, check that none of CMP, MOV, ORR and AND instructions are
> suffixed with EQ or NE.
> Also removed size check as it's very unstable (depends on optimization
> in use).
> 
> gcc/testsuite/ChangeLog:
> 
>   PR testsuite/103298
>   * gcc.target/arm/pr43920-2.c: Change to assembler pattern
>   "(cmp|mov|orr|and)(eq|ne)" for the check. Remove size check.
> 
> Signed-off-by: Torbjörn SVENSSON 

OK

R.
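
As an illustration of what the new pattern from the ChangeLog above matches,
here is a small sketch using POSIX regcomp/regexec as a stand-in.  The real
test relies on the DejaGnu assembler scanning, not on this code, and the
helper name has_conditional_insn is made up.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if ASM_TEXT contains CMP, MOV, ORR or AND suffixed with
   EQ or NE, i.e. the conditionally executed forms the test wants to
   rule out.  */
bool
has_conditional_insn (const char *asm_text)
{
  regex_t re;
  if (regcomp (&re, "(cmp|mov|orr|and)(eq|ne)",
	       REG_EXTENDED | REG_NOSUB) != 0)
    return false;

  bool found = regexec (&re, asm_text, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}
```

A plain cmp followed by a beq does not match, while moveq or andne does,
which is the distinction the commit message describes.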

Re: [PATCH 7/7]AArch64: Implement vector concat of partial SVE vectors

2024-12-19 Thread Richard Sandiford

Tamar Christina  writes:
> gcc/ChangeLog:
>
>   PR target/96342
>   * config/aarch64/aarch64-sve.md (vec_init): New.
>   (@aarch64_pack_partial): New.
>   * config/aarch64/aarch64.cc (aarch64_sve_expand_vector_init_subvector): 
> New.
>   * config/aarch64/iterators.md (SVE_NO2E): New.
>   (VHALF, Vhalf): Add SVE partial vectors.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/96342
>   * gcc.target/aarch64/vect-simd-clone-2.c: New test.

OK, thanks.

Richard

> Bootstrapped Regtested on aarch64-none-linux-gnu  and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> -- inline copy of patch --
>
> diff --git a/gcc/config/aarch64/aarch64-sve.md 
> b/gcc/config/aarch64/aarch64-sve.md
> index 
> a72ca2a500d394598268c6adfe717eed94a304b3..8ed4221dbe5c49db97b37f186365fa391900eadb
>  100644
> --- a/gcc/config/aarch64/aarch64-sve.md
> +++ b/gcc/config/aarch64/aarch64-sve.md
> @@ -2839,6 +2839,16 @@ (define_expand "vec_init"
>}
>  )
>  
> +(define_expand "vec_init"
> +  [(match_operand:SVE_NO2E 0 "register_operand")
> +   (match_operand 1 "")]
> +  "TARGET_SVE"
> +  {
> +aarch64_sve_expand_vector_init (operands[0], operands[1]);
> +DONE;
> +  }
> +)
> +
>  ;; Shift an SVE vector left and insert a scalar into element 0.
>  (define_insn "vec_shl_insert_"
>[(set (match_operand:SVE_FULL 0 "register_operand")
> @@ -9289,6 +9299,19 @@ (define_insn "vec_pack_trunc_"
>"uzp1\t%0., %1., %2."
>  )
>  
> +;; Integer partial pack packing two partial SVE types into a single full SVE
> +;; type of the same element type.  Use UZP1 on the wider type, which discards
> +;; the high part of each wide element.  This allows to concat SVE partial 
> types
> +;; into a wider vector.
> +(define_insn "@aarch64_pack_partial"
> +  [(set (match_operand:SVE_NO2E 0 "register_operand" "=w")
> + (vec_concat:SVE_NO2E
> +   (match_operand: 1 "register_operand" "w")
> +   (match_operand: 2 "register_operand" "w")))]
> +  "TARGET_SVE"
> +  "uzp1\t%0., %1., %2."
> +)
> +
>  ;; -
>  ;;  [INT<-INT] Unpacks
>  ;; -
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 
> de4c0a0783912b54ac35d7c818c24574b27a4ca0..40214e318f3c4e30e619d96073b253887c973efc
>  100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -24859,6 +24859,17 @@ aarch64_sve_expand_vector_init (rtx target, rtx vals)
>  v.quick_push (XVECEXP (vals, 0, i));
>v.finalize ();
>  
> +  /* If we have two elements and are concatting vector.  */
> +  machine_mode elem_mode = GET_MODE (v.elt (0));
> +  if (nelts == 2 && VECTOR_MODE_P (elem_mode))
> +{
> +  /* We've failed expansion using a dup.  Try using a cheeky truncate. */
> +  rtx arg0 = force_reg (elem_mode, v.elt(0));
> +  rtx arg1 = force_reg (elem_mode, v.elt(1));
> +  emit_insn (gen_aarch64_pack_partial (mode, target, arg0, arg1));
> +  return;
> +}
> +
>/* If neither sub-vectors of v could be initialized specially,
>   then use INSR to insert all elements from v into TARGET.
>   ??? This might not be optimal for vectors with large
> @@ -24870,6 +24881,30 @@ aarch64_sve_expand_vector_init (rtx target, rtx vals)
>  aarch64_sve_expand_vector_init_insert_elems (target, v, nelts);
>  }
>  
> +/* Initialize register TARGET from the two vector subelements in PARALLEL
> +   rtx VALS.  */
> +
> +void
> +aarch64_sve_expand_vector_init_subvector (rtx target, rtx vals)
> +{
> +  machine_mode mode = GET_MODE (target);
> +  int nelts = XVECLEN (vals, 0);
> +
> +  gcc_assert (nelts == 2);
> +
> +  rtx arg0 = XVECEXP (vals, 0, 0);
> +  rtx arg1 = XVECEXP (vals, 0, 1);
> +
> +  /* If we have two elements and are concatting vector.  */
> +  machine_mode elem_mode = GET_MODE (arg0);
> +  gcc_assert (VECTOR_MODE_P (elem_mode));
> +
> +  arg0 = force_reg (elem_mode, arg0);
> +  arg1 = force_reg (elem_mode, arg1);
> +  emit_insn (gen_aarch64_pack_partial (mode, target, arg0, arg1));
> +  return;
> +}
> +
>  /* Check whether VALUE is a vector constant in which every element
> is either a power of 2 or a negated power of 2.  If so, return
> a constant vector of log2s, and flip CODE between PLUS and MINUS
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index 
> 89c72b24aeb791adbbd3edfdb131478d52b248e6..34200b05a3abf6d51919313de1027aa4988bcb8d
>  100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -140,6 +140,10 @@ (define_mode_iterator VQ_I [V16QI V8HI V4SI V2DI])
>  ;; VQ without 2 element modes.
>  (define_mode_iterator VQ_NO2E [V16QI V8HI V4SI V8HF V4SF V8BF])
>  
> +;; SVE modes without 2 element modes.
> +(define_mode_iterator SVE_NO2E [VNx16QI VNx8QI VNx4QI VNx8HI VNx4HI VNx8HF
> + VNx4HF VNx8BF VNx4BF VNx4SI VNx4SF])
> +
>  ;;
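
The pattern comment in the quoted patch says that UZP1 "discards the high part of each wide element". As a rough illustration only, the sketch below models that effect on fixed-length data. It assumes each partial vector stores a 16-bit payload in the low half of a 32-bit container; the function name `pack_partial` and the use of `std::vector` are hypothetical stand-ins, not GCC or SVE interfaces, and real SVE vectors are scalable rather than fixed-length.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: each input element is a 32-bit container whose low
// 16 bits hold the payload and whose high 16 bits are don't-care.
// The result keeps only the low halves: first those of LO, then those of HI.
std::vector<uint16_t>
pack_partial (const std::vector<uint32_t> &lo, const std::vector<uint32_t> &hi)
{
  std::vector<uint16_t> out;
  out.reserve (lo.size () + hi.size ());
  for (uint32_t container : lo)
    out.push_back (static_cast<uint16_t> (container & 0xffffu));
  for (uint32_t container : hi)
    out.push_back (static_cast<uint16_t> (container & 0xffffu));
  return out;
}
```

Under these assumptions the two half-populated inputs become one fully populated vector of the same element type, which is the concatenation the comment describes.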

Re: [Fortran, Patch, PR57598] Fix coarray STOP

2024-12-19 Thread Andre Vehreschild

Hi Damian,

well F2008 Note 8.30 states:

If the stop-code is an integer, it is recommended that the value also
be used as the process exit status, if the processor supports that
concept. 

and in F2018 it's not even "just a note" anymore: 11.4 §2 contains the
same sentence as the note.

At least to get consistent behavior, both CAF and non-CAF programs
should behave the same.

- Andre



 On Thu, 19 Dec 2024 05:25:08 -0800
Damian Rouson  wrote:

> I don’t think the standard requires providing the stop code to the
> OS, but it recommends doing so.  So this is a great idea.  Thanks for
> working on coarray features.
> 
> Damian
> 
> On Thu, Dec 19, 2024 at 04:14 Andre Vehreschild  wrote:
> 
> > Hi all,
> >
> > the attached patch fixes a rather old open issue that I stumbled upon
> > while trying to figure out why a test failed on the command line but
> > not in the testsuite. The implementation of the STOP command in
> > caf_single did not hand the errorcode over to the OS, as does
> > non-caf STOP and as it is required by the standard. So I fixed
> > that. I also added reporting of exceptions to the coarray (ERROR)?
> > STOP routines. For this I have exported the existing function of
> > the regular gfortran runtime library. I tried to do this via
> > iexport_proto, but was never able to access the routine from the
> > caf-library. I always got linker errors.
> >
> > After fixing caf-STOP the testsuite reported one regression, which I
> > also fixed in send_by_ref.
> >
> > Bootstrapped and regtests ok on x86_64-pc-linux-gnu / F41. Ok for
> > mainline?
> >
> > Regards,
> > Andre
> > --
> > Andre Vehreschild * Email: vehre ad gcc dot gnu dot org
> >



-- 
Andre Vehreschild * Kreuzherrenstr. 8 * 52062 Aachen
Tel.: +49 178 3837536 * ve...@gmx.de

Re: [PATCH] vect: Do not use partial vectors when emulating vectors [PR116351].

2024-12-19 Thread Richard Biener

On Wed, Dec 18, 2024 at 6:48 PM Robin Dapp  wrote:
>
> Hi,
>
> in PR116351 we try to vectorize with -march=...zve32x which does not
> have 64-bit vector element sizes, don't find a proper mode and end up
> using word_mode = DImode.
>
> vect_verify_loop_lens calls get_len_load_store_mode which asserts
> VECTOR_MODE_P (vecmode) so DImode will cause an ICE.
>
> In check_load_store_for_partial_vectors we disable partial vectors
> when emulating vectors so this patch does the same in tree-vect-loop.cc
> before vect_verify_loop_lens is called.
>
> Is this the correct thing to do or should we have taken another
> turn somewhere else?

  else /* !LOOP_VINFO_LENS (loop_vinfo).is_empty () */
if (!vect_verify_loop_lens (loop_vinfo))
  LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;

I wonder if LOOP_VINFO_LENS is really empty here?  If not, who recorded
the len and why did that not disable partial vectors?

Maybe

  if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
  && !LOOP_VINFO_MASKS (loop_vinfo).is_empty ()
  && !LOOP_VINFO_LENS (loop_vinfo).is_empty ())

should become !(LOOP_VINFO_MASKS (loop_vinfo).is_empty () ^
LOOP_VINFO_LENS (loop_vinfo).is_empty ()))?
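
To make the difference between the two conditions concrete, here is a minimal sketch that reduces each `is_empty ()` result to a plain bool. The function names are hypothetical and nothing here is vectorizer code; it only spells out the truth table.

```cpp
#include <cassert>

// Existing condition: true only when both masks and lens were recorded.
constexpr bool
both_recorded (bool masks_empty, bool lens_empty)
{
  return !masks_empty && !lens_empty;
}

// Suggested condition: true whenever the two containers agree, that is,
// when both are empty or both are non-empty.
constexpr bool
not_exactly_one (bool masks_empty, bool lens_empty)
{
  return !(masks_empty ^ lens_empty);
}

static_assert (both_recorded (false, false), "both recorded");
static_assert (not_exactly_one (false, false), "both recorded");
static_assert (!both_recorded (true, true), "neither recorded");
static_assert (not_exactly_one (true, true), "neither recorded");
static_assert (!not_exactly_one (true, false), "only lens recorded");
static_assert (!not_exactly_one (false, true), "only masks recorded");
```

The only input on which the two forms differ is the case where neither masks nor lens were recorded: the XOR form is true there, while the existing form is not.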

> Bootstrapped and regtested on x86, aarch64, and power10.
> Regtested on rv64gcv_zvl512b.
>
> Regards
>  Robin
>
> PR target/116351
>
> gcc/ChangeLog:
>
> * tree-vect-loop.cc: Disable partial vectors for emulated
> vectors.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/pr116351.c: New test.
> ---
>  .../gcc.target/riscv/rvv/autovec/pr116351.c   | 15 +++
>  gcc/tree-vect-loop.cc | 10 ++
>  2 files changed, 25 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr116351.c
>
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr116351.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr116351.c
> new file mode 100644
> index 000..ed1b985d8fa
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr116351.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=rv64imd_zve32x -mrvv-vector-bits=zvl" } */
> +
> +int a, b, c;
> +short d, e, f;
> +long (g) (long h) { return h; }
> +
> +void i ()
> +{
> +  for (; b; ++b)
> +{
> +  f = 5 >> a ? d : d << a;
> +  e &= c | g (f);
> +}
> +}
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 4f401cd2d0c..2cf5a05b70c 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -3021,6 +3021,16 @@ start_over:
>LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
>  }
>
> +  if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
> +  && !VECTOR_MODE_P (loop_vinfo->vector_mode))
> +{
> +  if (dump_enabled_p ())
> +   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> +"can't operate on partial vectors when emulating"
> +" vector operations.\n");
> +  LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
> +}
> +
>/* If we still have the option of using partial vectors,
>   check whether we can generate the necessary loop controls.  */
>if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
> --
> 2.47.1
>

Re: [PATCH] testsuite: Fix toplevel-asm-1.c failure for riscv

2024-12-19 Thread Richard Biener




> On 18.12.2024 at 21:28, Jakub Jelinek wrote:
> 
> On Wed, Dec 18, 2024 at 01:19:43PM +0100, Andreas Schwab wrote:
>>> On Dec 12 2024, Jakub Jelinek wrote:
>>> 
>>> The intent was to test %cN because %N doesn't DTRT on various targets.
>>> I have a patch to add %ccN support which should then work even on riscv
>>> hopefully, but unfortunately it hasn't been fully reviewed yet.
>> 
>> That didn't change toplevel-asm-1, so the failure remains.
> 
> Yes, I've only committed what was approved.
> 
> The following patch ought to fix this (and if there are other targets which
> don't really support %cN for SYMBOL_REFs even with -fno-pic, they can be
> added there too; I think it is useful to test %cN on the targets where it
> works though).
> 
> Tested on x86_64-linux and with cross to riscv64-linux, ok for trunk?

Ok

> 2024-12-18  Jakub Jelinek  
> 
>* c-c++-common/toplevel-asm-1.c: Use %cc3 %cc4 instead of %c3 %c4
>on riscv.
> 
> --- gcc/testsuite/c-c++-common/toplevel-asm-1.c.jj2024-12-05 
> 09:24:54.788005777 +0100
> +++ gcc/testsuite/c-c++-common/toplevel-asm-1.c2024-12-18 
> 19:06:33.567248675 +0100
> @@ -8,7 +8,12 @@ enum E { E0, E1 = sizeof (struct S) + 15
> int v[42];
> void foo (void) {}
> 
> +/* Not all targets can use %cN even in non-pic code.  */
> +#if defined(__riscv)
> +asm ("# %0 %1 %2 %cc3 %cc4 %5 %% %="
> +#else
> asm ("# %0 %1 %2 %c3 %c4 %5 %% %="
> +#endif
>  :: "i" (sizeof (struct S)),
>"i" (__builtin_offsetof (struct S, c)),
>"i" (E1),
> 
> 
>Jakub
>

[PATCH 4/4] arm, testsuite: add +simd to arm_v8_3a_complex_neon_ok

2024-12-19 Thread Christophe Lyon

The vect testsuite adds -mfpu=neon before the arm_v8_3a_complex_neon
flags via check_vect_support_and_set_flags, so before this change
testcases are compiled with -mfpu=neon (and no -march/-mfloat-abi
flag) with an arm-linux-gnueabihf toolchain configured using
--with-float=hard --with-fpu=vfpv3-d16 --with-arch=armv7-a

However, when computing et_arm_v8_3a_complex_neon_flags, -mfpu=neon is
not used, and the first attempt with flags="" uses
-mcpu=unset -march=armv8.3-a, resulting in
error:  #error "__ARM_FEATURE_COMPLEX not defined"

The same occurs with -mfloat-abi=softfp -mfpu=auto -mcpu=unset -march=armv8.3-a,
but -mfloat-abi=hard -mfpu=auto -mcpu=unset -march=armv8.3-a fails with:
error: '-mfloat-abi=hard': selected architecture lacks an FPU

So finally et_arm_v8_3a_complex_neon_flags is empty and
check_effective_target_arm_v8_3a_complex_neon_ok_nocache returns 0,
leading to the behavior described in the first paragraph.

Since -march=armv8.3-a alone does not enable any FPU, it looks like an
oversight and +simd should be added (this also gives some sense to the
-mfpu=auto option which is used along with -mfloat-abi=XXX, and to _neon_
in the effective-target name).

Adding +simd thus means that trying with flags="" will still fail (now
using -mcpu=unset -march=armv8.3-a+simd):
error:  #error "__ARM_FEATURE_COMPLEX not defined"

but -mfloat-abi=softfp -mfpu=auto -mcpu=unset -march=armv8.3-a+simd
succeeds and testcases are now compiled with
-mfpu=neon -mfloat-abi=softfp -mfpu=auto -mcpu=unset -march=armv8.3-a+simd
leading to compilation errors when they #include  or :
error: gnu/stubs-soft.h: No such file or directory

Adding #include <stdint.h> to the sample code for the
effective-targets solves this problem and we now compile the tests
with
-mfpu=neon -mfloat-abi=hard -mfpu=auto -mcpu=unset -march=armv8.3-a+simd

This patch does not enable more tests, but now we have
FAIL: gcc.dg/vect/complex/fast-math-complex-mls-float.c scan-tree-dump vect 
"Found COMPLEX_ADD_ROT270"
which used to pass with -mfpu=neon, instead of the now implied
neon-fp-armv8, but that looks like a bug?

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_arm_v8_3a_complex_neon_ok_nocache): Add
+simd, include stdint.h.
---
 gcc/testsuite/lib/target-supports.exp | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 9f4e2700dd2..59eb38344ed 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -13256,8 +13256,9 @@ proc 
check_effective_target_arm_v8_3a_complex_neon_ok_nocache { } {
#if !defined (__ARM_FEATURE_COMPLEX)
#error "__ARM_FEATURE_COMPLEX not defined"
#endif
-   } "$flags -mcpu=unset -march=armv8.3-a"] } {
-   set et_arm_v8_3a_complex_neon_flags "$flags -mcpu=unset 
-march=armv8.3-a"
+   #include <stdint.h>
+   } "$flags -mcpu=unset -march=armv8.3-a+simd"] } {
+   set et_arm_v8_3a_complex_neon_flags "$flags -mcpu=unset 
-march=armv8.3-a+simd"
return 1;
}
 }
-- 
2.34.1

Re: [PATCH v3] testsuite: arm: Use effective-target for memset-inline* tests

2024-12-19 Thread Torbjorn SVENSSON





On 2024-12-19 11:46, Richard Earnshaw (lists) wrote:

On 18/12/2024 18:45, Torbjörn SVENSSON wrote:

Changes since v1:

- Split tests into two parts. One part for doing asm checks. Another part
  for doing the run test, as these require hardware to be available.
- Changed existing tests to be "compile" instead of "run".

Changes since v2:

- Applied the same fix to memset-inline-8.c and memset-inline-9.c since
   they also fail for the same reason.

Ok for trunk and releases/gcc-14?

--

Split tests into 2 parts:
- The first part checks the generated assembler.
- The second part does the run test and this part now requires
   effective-target arm_neon_hw.

gcc/testsuite/ChangeLog:

* gcc.target/arm/memset-inline-4.c: Only check assembler output.
* gcc.target/arm/memset-inline-5.c: Likewise.
* gcc.target/arm/memset-inline-6.c: Likewise.
* gcc.target/arm/memset-inline-8.c: Likewise.
* gcc.target/arm/memset-inline-9.c: Likewise.
* gcc.target/arm/memset-inline-4-exe.c: New test.
* gcc.target/arm/memset-inline-5-exe.c: Likewise.
* gcc.target/arm/memset-inline-6-exe.c: Likewise.
* gcc.target/arm/memset-inline-8-exe.c: Likewise.
* gcc.target/arm/memset-inline-9-exe.c: Likewise.

Signed-off-by: Torbjörn SVENSSON 


OK.


Pushed as r15-6366-g8462a5fdbfe and r14.2.0-577-g4bbb74c75c0.

Kind regards,
Torbjörn



R.


---
  gcc/testsuite/gcc.target/arm/memset-inline-4-exe.c | 7 +++
  gcc/testsuite/gcc.target/arm/memset-inline-4.c | 2 +-
  gcc/testsuite/gcc.target/arm/memset-inline-5-exe.c | 7 +++
  gcc/testsuite/gcc.target/arm/memset-inline-5.c | 2 +-
  gcc/testsuite/gcc.target/arm/memset-inline-6-exe.c | 7 +++
  gcc/testsuite/gcc.target/arm/memset-inline-6.c | 2 +-
  gcc/testsuite/gcc.target/arm/memset-inline-8-exe.c | 7 +++
  gcc/testsuite/gcc.target/arm/memset-inline-8.c | 2 +-
  gcc/testsuite/gcc.target/arm/memset-inline-9-exe.c | 7 +++
  gcc/testsuite/gcc.target/arm/memset-inline-9.c | 2 +-
  10 files changed, 40 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/arm/memset-inline-4-exe.c
  create mode 100644 gcc/testsuite/gcc.target/arm/memset-inline-5-exe.c
  create mode 100644 gcc/testsuite/gcc.target/arm/memset-inline-6-exe.c
  create mode 100644 gcc/testsuite/gcc.target/arm/memset-inline-8-exe.c
  create mode 100644 gcc/testsuite/gcc.target/arm/memset-inline-9-exe.c

diff --git a/gcc/testsuite/gcc.target/arm/memset-inline-4-exe.c 
b/gcc/testsuite/gcc.target/arm/memset-inline-4-exe.c
new file mode 100644
index 000..fef6c4365e2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/memset-inline-4-exe.c
@@ -0,0 +1,7 @@
+/* { dg-do run } */
+/* { dg-skip-if "Don't inline memset using neon instructions" { ! 
arm_tune_string_ops_prefer_neon } } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-save-temps -O2 -fno-inline" } */
+/* { dg-add-options "arm_neon" } */
+
+#include "./memset-inline-4.c"
diff --git a/gcc/testsuite/gcc.target/arm/memset-inline-4.c 
b/gcc/testsuite/gcc.target/arm/memset-inline-4.c
index 5d7223ef2c0..6eb2a9d18a3 100644
--- a/gcc/testsuite/gcc.target/arm/memset-inline-4.c
+++ b/gcc/testsuite/gcc.target/arm/memset-inline-4.c
@@ -1,4 +1,4 @@
-/* { dg-do run } */
+/* { dg-do compile } */
  /* { dg-skip-if "Don't inline memset using neon instructions" { ! 
arm_tune_string_ops_prefer_neon } } */
  /* { dg-options "-save-temps -O2 -fno-inline" } */
  /* { dg-add-options "arm_neon" } */
diff --git a/gcc/testsuite/gcc.target/arm/memset-inline-5-exe.c 
b/gcc/testsuite/gcc.target/arm/memset-inline-5-exe.c
new file mode 100644
index 000..a52a527ea13
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/memset-inline-5-exe.c
@@ -0,0 +1,7 @@
+/* { dg-do run } */
+/* { dg-skip-if "Don't inline memset using neon instructions" { ! 
arm_tune_string_ops_prefer_neon } } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-save-temps -O2 -fno-inline" } */
+/* { dg-add-options "arm_neon" } */
+
+#include "./memset-inline-5.c"
diff --git a/gcc/testsuite/gcc.target/arm/memset-inline-5.c 
b/gcc/testsuite/gcc.target/arm/memset-inline-5.c
index 6e7ae65eef4..0f55c7b8c88 100644
--- a/gcc/testsuite/gcc.target/arm/memset-inline-5.c
+++ b/gcc/testsuite/gcc.target/arm/memset-inline-5.c
@@ -1,4 +1,4 @@
-/* { dg-do run } */
+/* { dg-do compile } */
  /* { dg-skip-if "Don't inline memset using neon instructions" { ! 
arm_tune_string_ops_prefer_neon } } */
  /* { dg-options "-save-temps -O2 -fno-inline" } */
  /* { dg-add-options "arm_neon" } */
diff --git a/gcc/testsuite/gcc.target/arm/memset-inline-6-exe.c 
b/gcc/testsuite/gcc.target/arm/memset-inline-6-exe.c
new file mode 100644
index 000..8e58d681023
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/memset-inline-6-exe.c
@@ -0,0 +1,7 @@
+/* { dg-do run } */
+/* { dg-skip-if "Don't inline memset using neon instructions" { ! 
arm_tune_string_ops_prefer_neon } } */
+/* { dg-requir

[PATCH 1/4] arm, testsuite: remove duplicate dg-add-options arm_v8_3a_complex_neon

2024-12-19 Thread Christophe Lyon

These two testcases each have the same dg-add-options
arm_v8_3a_complex_neon directive twice; the patch removes one of them.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/complex/complex-operations-run.c: Remove duplicate
dg-add-options arm_v8_3a_complex_neon.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c:
Likewise.
---
 gcc/testsuite/gcc.dg/vect/complex/complex-operations-run.c   | 1 -
 .../vect/complex/fast-math-bb-slp-complex-add-pattern-double.c   | 1 -
 2 files changed, 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/complex/complex-operations-run.c 
b/gcc/testsuite/gcc.dg/vect/complex/complex-operations-run.c
index 5a68ff02951..2f916ab4adf 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/complex-operations-run.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/complex-operations-run.c
@@ -1,6 +1,5 @@
 /* { dg-require-effective-target vect_complex_add_double } */
 /* { dg-add-options arm_v8_3a_complex_neon } */
-/* { dg-add-options arm_v8_3a_complex_neon } */
 
 #include 
 #include 
diff --git 
a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c
 
b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c
index e820356de0f..2cd7eb25b3e 100644
--- 
a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c
+++ 
b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c
@@ -1,5 +1,4 @@
 /* { dg-do compile } */
-/* { dg-add-options arm_v8_3a_complex_neon } */
 /* { dg-additional-options "-ffast-math -fno-tree-loop-vectorize" } */
 /* { dg-require-effective-target vect_double } */
 /* { dg-add-options arm_v8_3a_complex_neon } */
-- 
2.34.1

[PATCH 2/4] arm, testsuite: fix fast-math-bb-slp-complex-mla-float.c dg-add-options

2024-12-19 Thread Christophe Lyon

The test uses floats, not fp16 so it should use arm_v8_3a_complex_neon
instead of arm_v8_3a_fp16_complex_neon.

This makes it PASS on arm-linux-gnueabihf instead of being UNRESOLVED.

gcc/testsuite/ChangeLog:
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: Use
arm_v8_3a_complex_neon.
---
 .../gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c 
b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c
index 61026bef715..7b2a8dd261c 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_complex_add_float } */
 /* { dg-additional-options "-ffast-math -fdump-tree-vect-details" } */
-/* { dg-add-options arm_v8_3a_fp16_complex_neon } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
 
 #define TYPE float
 #define N 16
-- 
2.34.1

[PATCH 3/4] arm, testsuite: fix arm_v8_3a_fp16_complex_neon_ok

2024-12-19 Thread Christophe Lyon

Without this patch, testcases using arm_v8_3a_fp16_complex_neon fail
to compile on arm-linux-gnueabihf with
fatal error: gnu/stubs-soft.h: No such file or directory
because they are actually compiled with
-mfloat-abi=softfp -mfpu=auto -mcpu=unset -march=armv8.3-a+fp16

Fix this by including stdint.h in the sample code for the effective-target.

This makes these tests PASS instead of being UNRESOLVED:
fast-math-bb-slp-complex-add-half-float.c
fast-math-bb-slp-complex-mla-half-float.c
fast-math-bb-slp-complex-mls-half-float.c
fast-math-bb-slp-complex-mul-half-float.c
fast-math-complex-add-half-float.c
fast-math-complex-mla-half-float.c
fast-math-complex-mls-half-float.c
fast-math-complex-mul-half-float.c

except for two new
FAIL: gcc.dg/vect/complex/fast-math-complex-mls-half-float.c scan-tree-dump 
vect "Found COMPLEX_ADD_ROT270"
(and the same with -flto -ffat-lto-objects)

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_arm_v8_3a_fp16_complex_neon_ok_nocache):
Include stdint.h.
---
 gcc/testsuite/lib/target-supports.exp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index a16e9534ccd..9f4e2700dd2 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -13298,6 +13298,7 @@ proc 
check_effective_target_arm_v8_3a_fp16_complex_neon_ok_nocache { } {
#if !defined (__ARM_FEATURE_COMPLEX)
#error "__ARM_FEATURE_COMPLEX not defined"
#endif
+   #include <stdint.h>
} "$flags -mcpu=unset -march=armv8.3-a+fp16"] } {
set et_arm_v8_3a_fp16_complex_neon_flags \
"$flags -mcpu=unset -march=armv8.3-a+fp16"
-- 
2.34.1

Re: [PATCH v2] libstdc++: add initializer_list constructor to std::span (P2447)

2024-12-19 Thread Jonathan Wakely

On Thu, 12 Dec 2024 at 14:24, Giuseppe D'Angelo
 wrote:
>
> Hi,
>
> On 12/12/2024 01:04, Jonathan Wakely wrote:
> >> I'll prepare a patch to do that,
> > Et voila:
> > https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671432.html
>
> Thanks! All done, new patch is attached.

I was about to push this and realised it's missing a Signed-off-by
tag. I assume you meant to contribute this under the DCO terms, as
with your previous patches?

[PATCH 3/4] aarch64: Add missing makefile dependency

2024-12-19 Thread Richard Sandiford

gcc/
* config/aarch64/t-aarch64 (aarch64-builtins.o): Depend on
aarch64-simd-pragma-builtins.def.
---
 gcc/config/aarch64/t-aarch64 | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config/aarch64/t-aarch64 b/gcc/config/aarch64/t-aarch64
index dfb159d1da6..3219871e8d7 100644
--- a/gcc/config/aarch64/t-aarch64
+++ b/gcc/config/aarch64/t-aarch64
@@ -55,6 +55,7 @@ aarch64-builtins.o: 
$(srcdir)/config/aarch64/aarch64-builtins.cc $(CONFIG_H) \
   $(DIAGNOSTIC_CORE_H) $(OPTABS_H) \
   $(srcdir)/config/aarch64/aarch64-simd-builtins.def \
   $(srcdir)/config/aarch64/aarch64-simd-builtin-types.def \
+  $(srcdir)/config/aarch64/aarch64-simd-pragma-builtins.def \
   aarch64-builtin-iterators.h
$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
$(srcdir)/config/aarch64/aarch64-builtins.cc
-- 
2.25.1

Re: [PATCH] SVE intrinsics: Fold svmul and svdiv by -1 to svneg for unsigned types

2024-12-19 Thread Richard Sandiford

Jennifer Schmitz  writes:
> @@ -3672,6 +3673,48 @@ gimple_folder::fold_pfalse ()
>return nullptr;
>  }
>  
> +/* Convert the lhs and all non-boolean vector-type operands to TYPE.
> +   Pass the converted variables to the callback FP, and finally convert the
> +   result back to the original type. Add the necessary conversion statements.
> +   Return the new call.  */
> +gimple *
> +gimple_folder::convert_and_fold (tree type,
> +  gimple *(*fp) (gimple_folder &,
> + tree, vec &))
> +{
> +  gcc_assert (VECTOR_TYPE_P (type)
> +   && TYPE_MODE (type) != VNx16BImode);
> +  tree old_ty = TREE_TYPE (lhs);
> +  gimple_seq stmts = NULL;
> +  tree lhs_conv, op, op_ty, t;
> +  gimple *g, *new_stmt;

Sorry for the last-minute minor request, but: it would be nice to declare
these at the point of initialisation, for consistency with the rest of the
function.

> +  bool convert_lhs_p = !useless_type_conversion_p (type, old_ty);
> +  lhs_conv = convert_lhs_p ? create_tmp_var (type) : lhs;
> +  unsigned int num_args = gimple_call_num_args (call);
> +  auto_vec args_conv;
> +  args_conv.safe_grow (num_args);
> +  for (unsigned int i = 0; i < num_args; ++i)
> +{
> +  op = gimple_call_arg (call, i);
> +  op_ty = TREE_TYPE (op);
> +  args_conv[i] =
> + (VECTOR_TYPE_P (op_ty)
> +  && TYPE_MODE (op_ty) != VNx16BImode
> +  && !useless_type_conversion_p (op_ty, type))
> + ? gimple_build (&stmts, VIEW_CONVERT_EXPR, type, op) : op;
> +}
> +
> +  new_stmt = fp (*this, lhs_conv, args_conv);
> +  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
> +  if (convert_lhs_p)
> +{
> +  t = build1 (VIEW_CONVERT_EXPR, old_ty, lhs_conv);
> +  g = gimple_build_assign (lhs, VIEW_CONVERT_EXPR, t);
> +  gsi_insert_after (gsi, g, GSI_SAME_STMT);
> +}
> +  return new_stmt;
> +}
> +
>  /* Fold the call to constant VAL.  */
>  gimple *
>  gimple_folder::fold_to_cstu (poly_uint64 val)
> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h 
> b/gcc/config/aarch64/aarch64-sve-builtins.h
> index 6f22868f9b3..02ae098ed32 100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins.h
> +++ b/gcc/config/aarch64/aarch64-sve-builtins.h
> @@ -421,6 +421,7 @@ public:
>tree scalar_type (unsigned int) const;
>tree vector_type (unsigned int) const;
>tree tuple_type (unsigned int) const;
> +  tree signed_type (unsigned int) const;
>unsigned int elements_per_vq (unsigned int) const;
>machine_mode vector_mode (unsigned int) const;
>machine_mode tuple_mode (unsigned int) const;
> @@ -649,6 +650,8 @@ public:
>gcall *redirect_call (const function_instance &);
>gimple *redirect_pred_x ();
>gimple *fold_pfalse ();
> +  gimple *convert_and_fold (tree, gimple *(*) (gimple_folder &,
> +tree, vec &));
>  
>gimple *fold_to_cstu (poly_uint64);
>gimple *fold_to_pfalse ();
> @@ -884,6 +887,20 @@ find_type_suffix (type_class_index tclass, unsigned int 
> element_bits)
>gcc_unreachable ();
>  }
>  
> +/* Return the type suffix of the signed type of width ELEMENT_BITS.  */
> +inline type_suffix_index
> +signed_type_suffix_index (unsigned int element_bits)
> +{
> +  switch (element_bits)
> +  {
> +  case 8: return TYPE_SUFFIX_s8;
> +  case 16: return TYPE_SUFFIX_s16;
> +  case 32: return TYPE_SUFFIX_s32;
> +  case 64: return TYPE_SUFFIX_s64;
> +  }
> +  gcc_unreachable ();
> +}
> +

We could drop this and instead replace calls with:

  find_type_suffix (TYPE_signed, element_bits)
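
The reason a generic lookup can replace the hard-coded switch can be sketched as follows. This is a simplified, self-contained model: the enums, the `suffixes` table and `find_suffix` are hypothetical stand-ins for illustration, not the definitions in aarch64-sve-builtins.h.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical, cut-down model of a type-suffix table.
enum type_class { CLASS_signed, CLASS_unsigned };

enum type_suffix
{
  SUFFIX_s8, SUFFIX_s16, SUFFIX_s32, SUFFIX_s64,
  SUFFIX_u8, SUFFIX_u16, SUFFIX_u32, SUFFIX_u64,
  NUM_SUFFIXES
};

struct suffix_info
{
  type_class tclass;
  unsigned int element_bits;
};

// One entry per suffix, in the same order as the enum above.
static const suffix_info suffixes[NUM_SUFFIXES] = {
  { CLASS_signed, 8 },   { CLASS_signed, 16 },
  { CLASS_signed, 32 },  { CLASS_signed, 64 },
  { CLASS_unsigned, 8 }, { CLASS_unsigned, 16 },
  { CLASS_unsigned, 32 }, { CLASS_unsigned, 64 }
};

// Search the table for the suffix with the given class and element width.
// Aborts if there is no such entry, mirroring an "unreachable" fallthrough.
type_suffix
find_suffix (type_class tclass, unsigned int element_bits)
{
  for (unsigned int i = 0; i < NUM_SUFFIXES; ++i)
    if (suffixes[i].tclass == tclass
        && suffixes[i].element_bits == element_bits)
      return static_cast<type_suffix> (i);
  std::abort ();
}
```

With a table-driven lookup like this, a signed-only helper adds nothing that a call with the signed class does not already provide, which is the point of the suggestion.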


>  /* Return the single field in tuple type TYPE.  */
>  inline tree
>  tuple_type_field (tree type)
> @@ -1049,6 +1066,20 @@ function_instance::tuple_type (unsigned int i) const
>return acle_vector_types[num_vectors - 1][type_suffix (i).vector_type];
>  }
>  
> +/* Return the signed vector type of width ELEMENT_BITS.  */
> +inline tree
> +function_instance::signed_type (unsigned int element_bits) const
> +{
> +  switch (element_bits)
> +  {
> +  case 8: return acle_vector_types[0][VECTOR_TYPE_svint8_t];
> +  case 16: return acle_vector_types[0][VECTOR_TYPE_svint16_t];
> +  case 32: return acle_vector_types[0][VECTOR_TYPE_svint32_t];
> +  case 64: return acle_vector_types[0][VECTOR_TYPE_svint64_t];
> +  }
> +  gcc_unreachable ();
> +}
> +

And for this, I think we should instead make:

/* Return the vector type associated with TYPE.  */
static tree
get_vector_type (sve_type type)
{
  auto vector_type = type_suffixes[type.type].vector_type;
  return acle_vector_types[type.num_vectors - 1][vector_type];
}

public, or perhaps just define it inline in aarch64-sve-builtins.h.

OK with those changes, thanks.

Richard

[PATCH 0/4] aarch64: Add mf8 data movement intrinsics

2024-12-19 Thread Richard Sandiford

This series adds mf8 variants of what I'll loosely call the existing
"data movement" intrinsics, including the recent FEAT_LUT ones.
I think this completes the FP8 intrinsic definitions.

Sorry that the series is so late.  We did make a real effort to get
it done by the end of stage 1, but there were some unexpected hitches.
The current half-complete state of trunk means that we either need
to apply this patch or disable the existing __ARM_FEATURE_FP8 definition.

Tested on aarch64-linux-gnu and aarch64_be-elf (including ILP32).
I also tested advsimd-intrinsics.exp on arm-eabi to make sure that
the mf8 stuff was properly protected.

I'll commit this when the prerequisite x86 changes in:

  https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671924.html

are approved, unless there are no comments before then.

The main patch was co-authored by Saurabh.

Thanks,
Richard


Richard Sandiford (4):
  aarch64: Macroise simd_type definitions
  aarch64: Use mf8 instead of f8 in builtin definitions
  aarch64: Add missing makefile dependency
  aarch64: Add mf8 data movement intrinsics

 gcc/config/aarch64/aarch64-builtins.cc|  934 +++--
 gcc/config/aarch64/aarch64-builtins.h |2 +
 gcc/config/aarch64/aarch64-protos.h   |2 +
 .../aarch64/aarch64-simd-pragma-builtins.def  |  288 ++-
 gcc/config/aarch64/aarch64-simd.md|   60 +-
 gcc/config/aarch64/aarch64.cc |2 +-
 gcc/config/aarch64/aarch64.md |   16 +
 gcc/config/aarch64/iterators.md   |1 +
 gcc/config/aarch64/t-aarch64  |1 +
 .../aarch64/advsimd-intrinsics/arm-neon-ref.h |   40 +
 .../advsimd-intrinsics/compute-ref-data.h |   29 +
 .../aarch64/advsimd-intrinsics/vbsl.c |   20 +
 .../aarch64/advsimd-intrinsics/vcombine.c |   10 +
 .../aarch64/advsimd-intrinsics/vcreate.c  |9 +
 .../aarch64/advsimd-intrinsics/vdup-vmov.c|   34 +
 .../aarch64/advsimd-intrinsics/vdup_lane.c|   26 +
 .../aarch64/advsimd-intrinsics/vext.c |   18 +
 .../aarch64/advsimd-intrinsics/vget_high.c|5 +
 .../aarch64/advsimd-intrinsics/vld1.c |   14 +
 .../aarch64/advsimd-intrinsics/vld1_dup.c |   34 +
 .../aarch64/advsimd-intrinsics/vld1_lane.c|   14 +
 .../aarch64/advsimd-intrinsics/vld1x2.c   |   11 +-
 .../aarch64/advsimd-intrinsics/vld1x3.c   |8 +-
 .../aarch64/advsimd-intrinsics/vld1x4.c   |6 +-
 .../aarch64/advsimd-intrinsics/vldX.c |  134 ++
 .../aarch64/advsimd-intrinsics/vldX_dup.c |   76 +
 .../aarch64/advsimd-intrinsics/vldX_lane.c|   65 +-
 .../aarch64/advsimd-intrinsics/vrev.c |   38 +
 .../aarch64/advsimd-intrinsics/vset_lane.c|   16 +
 .../aarch64/advsimd-intrinsics/vshuffle.inc   |   14 +
 .../aarch64/advsimd-intrinsics/vst1_lane.c|   12 +
 .../aarch64/advsimd-intrinsics/vst1x2.c   |8 +-
 .../aarch64/advsimd-intrinsics/vst1x3.c   |8 +-
 .../aarch64/advsimd-intrinsics/vst1x4.c   |8 +-
 .../aarch64/advsimd-intrinsics/vstX_lane.c|   69 +
 .../aarch64/advsimd-intrinsics/vtbX.c |   59 +-
 .../aarch64/advsimd-intrinsics/vtrn.c |   20 +
 .../aarch64/advsimd-intrinsics/vtrn_half.c|   30 +
 .../aarch64/advsimd-intrinsics/vuzp.c |   20 +
 .../aarch64/advsimd-intrinsics/vuzp_half.c|   30 +
 .../aarch64/advsimd-intrinsics/vzip.c |   20 +
 .../aarch64/advsimd-intrinsics/vzip_half.c|   30 +
 gcc/testsuite/gcc.target/aarch64/simd/lut.c   |   90 +
 .../gcc.target/aarch64/simd/mf8_data_1.c  | 1822 +
 .../gcc.target/aarch64/simd/mf8_data_2.c  |   98 +
 .../gcc.target/aarch64/vdup_lane_1.c  |   99 +-
 .../gcc.target/aarch64/vdup_lane_2.c  |   45 +-
 gcc/testsuite/gcc.target/aarch64/vdup_n_1.c   |   60 +-
 .../gcc.target/aarch64/vect_copy_lane_1.c |   12 +-
 49 files changed, 4259 insertions(+), 208 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/mf8_data_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/mf8_data_2.c

-- 
2.25.1

[PATCH 1/4] aarch64: Macroise simd_type definitions

2024-12-19 Thread Richard Sandiford

This patch tries to regularise the definitions of the new pragma
simd types.  Not all of the new types are currently used, but they
will be by later patches.

gcc/
* config/aarch64/aarch64-builtins.cc (simd_types): Use one macro
invocation for each element type.
---
 gcc/config/aarch64/aarch64-builtins.cc | 65 +-
 1 file changed, 32 insertions(+), 33 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
index ca1dc5a3e6a..bad97181cf6 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -1637,41 +1637,40 @@ struct simd_type {
 };
 
 namespace simd_types {
-  constexpr simd_type f8 { V8QImode, qualifier_modal_float };
-  constexpr simd_type f8q { V16QImode, qualifier_modal_float };
-  constexpr simd_type p8 { V8QImode, qualifier_poly };
-  constexpr simd_type p8q { V16QImode, qualifier_poly };
-  constexpr simd_type s8 { V8QImode, qualifier_none };
-  constexpr simd_type s8q { V16QImode, qualifier_none };
-  constexpr simd_type u8 { V8QImode, qualifier_unsigned };
-  constexpr simd_type u8q { V16QImode, qualifier_unsigned };
-
-  constexpr simd_type f16 { V4HFmode, qualifier_none };
-  constexpr simd_type f16q { V8HFmode, qualifier_none };
-  constexpr simd_type f16qx2 { V2x8HFmode, qualifier_none };
-  constexpr simd_type p16 { V4HImode, qualifier_poly };
-  constexpr simd_type p16q { V8HImode, qualifier_poly };
-  constexpr simd_type p16qx2 { V2x8HImode, qualifier_poly };
-  constexpr simd_type s16 { V4HImode, qualifier_none };
-  constexpr simd_type s16q { V8HImode, qualifier_none };
-  constexpr simd_type s16qx2 { V2x8HImode, qualifier_none };
-  constexpr simd_type u16 { V4HImode, qualifier_unsigned };
-  constexpr simd_type u16q { V8HImode, qualifier_unsigned };
-  constexpr simd_type u16qx2 { V2x8HImode, qualifier_unsigned };
-
-  constexpr simd_type bf16 { V4BFmode, qualifier_none };
-  constexpr simd_type bf16q { V8BFmode, qualifier_none };
-  constexpr simd_type bf16qx2 { V2x8BFmode, qualifier_none };
-
-  constexpr simd_type f32 { V2SFmode, qualifier_none };
-  constexpr simd_type f32q { V4SFmode, qualifier_none };
-  constexpr simd_type s32 { V2SImode, qualifier_none };
-  constexpr simd_type s32q { V4SImode, qualifier_none };
-
-  constexpr simd_type f64q { V2DFmode, qualifier_none };
-  constexpr simd_type s64q { V2DImode, qualifier_none };
+#define VARIANTS(BASE, D, Q, MODE, QUALIFIERS) \
+  constexpr simd_type BASE { V##D##MODE, QUALIFIERS }; \
+  constexpr simd_type BASE##x2 { V2x##D##MODE, QUALIFIERS };   \
+  constexpr simd_type BASE##x3 { V3x##D##MODE, QUALIFIERS };   \
+  constexpr simd_type BASE##x4 { V4x##D##MODE, QUALIFIERS };   \
+  constexpr simd_type BASE##q { V##Q##MODE, QUALIFIERS };  \
+  constexpr simd_type BASE##qx2 { V2x##Q##MODE, QUALIFIERS };  \
+  constexpr simd_type BASE##qx3 { V3x##Q##MODE, QUALIFIERS };  \
+  constexpr simd_type BASE##qx4 { V4x##Q##MODE, QUALIFIERS };  \
+  constexpr simd_type BASE##_scalar { MODE, QUALIFIERS };
+
+  VARIANTS (f8, 8, 16, QImode, qualifier_modal_float)
+  VARIANTS (p8, 8, 16, QImode, qualifier_poly)
+  VARIANTS (s8, 8, 16, QImode, qualifier_none)
+  VARIANTS (u8, 8, 16, QImode, qualifier_unsigned)
+
+  VARIANTS (bf16, 4, 8, BFmode, qualifier_none)
+  VARIANTS (f16, 4, 8, HFmode, qualifier_none)
+  VARIANTS (p16, 4, 8, HImode, qualifier_poly)
+  VARIANTS (s16, 4, 8, HImode, qualifier_none)
+  VARIANTS (u16, 4, 8, HImode, qualifier_unsigned)
+
+  VARIANTS (f32, 2, 4, SFmode, qualifier_none)
+  VARIANTS (p32, 2, 4, SImode, qualifier_poly)
+  VARIANTS (s32, 2, 4, SImode, qualifier_none)
+  VARIANTS (u32, 2, 4, SImode, qualifier_unsigned)
+
+  VARIANTS (f64, 1, 2, DFmode, qualifier_none)
+  VARIANTS (p64, 1, 2, DImode, qualifier_poly)
+  VARIANTS (s64, 1, 2, DImode, qualifier_none)
+  VARIANTS (u64, 1, 2, DImode, qualifier_unsigned)
 
   constexpr simd_type none { VOIDmode, qualifier_none };
+#undef VARIANTS
 }
 
 }
-- 
2.25.1
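
The token-pasting approach used by the patch above can be tried outside GCC with a minimal sketch.  Everything suffixed with _sketch is hypothetical: the struct stores the mode name as a string (built with the # stringification operator) instead of a real machine_mode, only two element types are instantiated, and only a subset of the variants is generated.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for GCC's simd_type: the mode is kept as a string
// so the sketch does not depend on GCC's machine_mode enum.
struct simd_type_sketch {
  const char *mode_name;   // e.g. "V8QImode"
  int qualifiers;          // stand-in for the qualifier_* enumerators
};

// Hypothetical qualifier values, for illustration only.
enum { qualifier_none_sketch = 0, qualifier_unsigned_sketch = 1 };

namespace simd_types_sketch {
// One invocation per element type defines the D form, the Q form, a tuple
// form of each and the scalar form.  BASE##q pastes identifiers, while
// "V" #D #MODE concatenates adjacent string literals.
#define VARIANTS(BASE, D, Q, MODE, QUALIFIERS)                          \
  constexpr simd_type_sketch BASE { "V" #D #MODE, QUALIFIERS };         \
  constexpr simd_type_sketch BASE##x2 { "V2x" #D #MODE, QUALIFIERS };   \
  constexpr simd_type_sketch BASE##q { "V" #Q #MODE, QUALIFIERS };      \
  constexpr simd_type_sketch BASE##qx2 { "V2x" #Q #MODE, QUALIFIERS };  \
  constexpr simd_type_sketch BASE##_scalar { #MODE, QUALIFIERS };

  VARIANTS (s8, 8, 16, QImode, qualifier_none_sketch)
  VARIANTS (u16, 4, 8, HImode, qualifier_unsigned_sketch)
#undef VARIANTS
}
```

With this, simd_types_sketch::s8q names "V16QImode" and simd_types_sketch::u16qx2 names "V2x8HImode", without either having been written out by hand.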

[PATCH 2/4] aarch64: Use mf8 instead of f8 in builtin definitions

2024-12-19 Thread Richard Sandiford

The intrinsic type suffix for modal floating-point types is _mf8,
so it's more convenient if we use that for the simd_types as well.

gcc/
* config/aarch64/aarch64-builtins.cc (simd_types::f8): Rename to...
(simd_types::mf8): ...this.
* config/aarch64/aarch64-simd-pragma-builtins.def: Update accordingly.
---
 gcc/config/aarch64/aarch64-builtins.cc|  2 +-
 .../aarch64/aarch64-simd-pragma-builtins.def  | 42 +--
 2 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
index bad97181cf6..9d1d0260e73 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -1648,7 +1648,7 @@ namespace simd_types {
   constexpr simd_type BASE##qx4 { V4x##Q##MODE, QUALIFIERS };  \
   constexpr simd_type BASE##_scalar { MODE, QUALIFIERS };
 
-  VARIANTS (f8, 8, 16, QImode, qualifier_modal_float)
+  VARIANTS (mf8, 8, 16, QImode, qualifier_modal_float)
   VARIANTS (p8, 8, 16, QImode, qualifier_poly)
   VARIANTS (s8, 8, 16, QImode, qualifier_none)
   VARIANTS (u8, 8, 16, QImode, qualifier_unsigned)
diff --git a/gcc/config/aarch64/aarch64-simd-pragma-builtins.def b/gcc/config/aarch64/aarch64-simd-pragma-builtins.def
index 5dafa7bb6b9..8924262cc53 100644
--- a/gcc/config/aarch64/aarch64-simd-pragma-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-pragma-builtins.def
@@ -91,24 +91,24 @@
 
 #undef ENTRY_VDOT_FPM
 #define ENTRY_VDOT_FPM(T)  \
-  ENTRY_TERNARY (vdot_##T##_mf8_fpm, T, T, f8, f8, \
+  ENTRY_TERNARY (vdot_##T##_mf8_fpm, T, T, mf8, mf8,   \
 UNSPEC_FDOT_FP8, FP8)  \
-  ENTRY_TERNARY (vdotq_##T##_mf8_fpm, T##q, T##q, f8q, f8q,\
+  ENTRY_TERNARY (vdotq_##T##_mf8_fpm, T##q, T##q, mf8q, mf8q,  \
 UNSPEC_FDOT_FP8, FP8)  \
-  ENTRY_TERNARY_LANE (vdot_lane_##T##_mf8_fpm, T, T, f8, f8,   \
+  ENTRY_TERNARY_LANE (vdot_lane_##T##_mf8_fpm, T, T, mf8, mf8, \
  UNSPEC_FDOT_LANE_FP8, FP8)\
-  ENTRY_TERNARY_LANE (vdot_laneq_##T##_mf8_fpm, T, T, f8, f8q, \
+  ENTRY_TERNARY_LANE (vdot_laneq_##T##_mf8_fpm, T, T, mf8, mf8q,   \
  UNSPEC_FDOT_LANE_FP8, FP8)\
-  ENTRY_TERNARY_LANE (vdotq_lane_##T##_mf8_fpm, T##q, T##q, f8q, f8,   \
+  ENTRY_TERNARY_LANE (vdotq_lane_##T##_mf8_fpm, T##q, T##q, mf8q, mf8, \
  UNSPEC_FDOT_LANE_FP8, FP8)\
-  ENTRY_TERNARY_LANE (vdotq_laneq_##T##_mf8_fpm, T##q, T##q, f8q, f8q, \
+  ENTRY_TERNARY_LANE (vdotq_laneq_##T##_mf8_fpm, T##q, T##q, mf8q, mf8q,\
  UNSPEC_FDOT_LANE_FP8, FP8)
 
 #undef ENTRY_FMA_FPM
 #define ENTRY_FMA_FPM(N, T, U) \
-  ENTRY_TERNARY (N##q_##T##_mf8_fpm, T##q, T##q, f8q, f8q, U, FP8) \
-  ENTRY_TERNARY_LANE (N##q_lane_##T##_mf8_fpm, T##q, T##q, f8q, f8, U, FP8) \
-  ENTRY_TERNARY_LANE (N##q_laneq_##T##_mf8_fpm, T##q, T##q, f8q, f8q, U, FP8)
+  ENTRY_TERNARY (N##q_##T##_mf8_fpm, T##q, T##q, mf8q, mf8q, U, FP8)   \
+  ENTRY_TERNARY_LANE (N##q_lane_##T##_mf8_fpm, T##q, T##q, mf8q, mf8, U, FP8) \
+  ENTRY_TERNARY_LANE (N##q_laneq_##T##_mf8_fpm, T##q, T##q, mf8q, mf8q, U, FP8)
 
 // faminmax
 #define REQUIRED_EXTENSIONS nonstreaming_only (AARCH64_FL_FAMINMAX)
@@ -131,18 +131,18 @@ ENTRY_TERNARY_VLUT16 (u)
 
 // fpm conversion
 #define REQUIRED_EXTENSIONS nonstreaming_only (AARCH64_FL_FP8)
-ENTRY_UNARY_VQ_BHF (vcvt1, f8, UNSPEC_F1CVTL_FP8, FP8)
-ENTRY_UNARY_VQ_BHF (vcvt1_high, f8q, UNSPEC_F1CVTL2_FP8, FP8)
-ENTRY_UNARY_VQ_BHF (vcvt1_low, f8q, UNSPEC_F1CVTL_FP8, FP8)
-ENTRY_UNARY_VQ_BHF (vcvt2, f8, UNSPEC_F2CVTL_FP8, FP8)
-ENTRY_UNARY_VQ_BHF (vcvt2_high, f8q, UNSPEC_F2CVTL2_FP8, FP8)
-ENTRY_UNARY_VQ_BHF (vcvt2_low, f8q, UNSPEC_F2CVTL_FP8, FP8)
-
-ENTRY_BINARY (vcvt_mf8_f16_fpm, f8, f16, f16, UNSPEC_FCVTN_FP8, FP8)
-ENTRY_BINARY (vcvtq_mf8_f16_fpm, f8q, f16q, f16q, UNSPEC_FCVTN_FP8, FP8)
-ENTRY_BINARY (vcvt_mf8_f32_fpm, f8, f32q, f32q, UNSPEC_FCVTN_FP8, FP8)
-
-ENTRY_TERNARY (vcvt_high_mf8_f32_fpm, f8q, f8, f32q, f32q,
+ENTRY_UNARY_VQ_BHF (vcvt1, mf8, UNSPEC_F1CVTL_FP8, FP8)
+ENTRY_UNARY_VQ_BHF (vcvt1_high, mf8q, UNSPEC_F1CVTL2_FP8, FP8)
+ENTRY_UNARY_VQ_BHF (vcvt1_low, mf8q, UNSPEC_F1CVTL_FP8, FP8)
+ENTRY_UNARY_VQ_BHF (vcvt2, mf8, UNSPEC_F2CVTL_FP8, FP8)
+ENTRY_UNARY_VQ_BHF (vcvt2_high, mf8q, UNSPEC_F2CVTL2_FP8, FP8)
+ENTRY_UNARY_VQ_BHF (vcvt2_low, mf8q, UNSPEC_F2CVTL_FP8, FP8)
+
+ENTRY_BINARY (vcvt_mf8_f16_fpm, mf8, f16, f16, UNSPEC_FCVTN_FP8, FP8)
+ENTRY_BINARY (vcvtq_mf8_f16_fpm, mf8q, f16q, f16q, UNSPEC_FCVTN_FP8, FP8)
+ENTRY_BINARY (vcvt_mf8_f32_fpm, mf8, f32q, f32q, UNSPEC_FCVTN_FP8, FP8)
+
+ENTRY_TERNARY (vcvt_high_mf8_f32_fpm, mf8q, mf8, f32q, f32q,
   UNSPEC_FCVTN2_FP8, FP8)
 #undef REQUIRED_EX

RE: [PATCH 7/7]AArch64: Implement vector concat of partial SVE vectors

2024-12-19 Thread Tamar Christina

> -Original Message-
> From: Richard Sandiford 
> Sent: Thursday, December 19, 2024 11:03 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH 7/7]AArch64: Implement vector concat of partial SVE 
> vectors
> 
> Tamar Christina  writes:
> >> >  ;; 2 element quad vector modes.
> >> >  (define_mode_iterator VQ_2E [V2DI V2DF])
> >> >
> >> > @@ -1678,7 +1686,15 @@ (define_mode_attr VHALF [(V8QI "V4QI") (V16QI "V8QI")
> >> >   (V2DI "DI")(V2SF  "SF")
> >> >   (V4SF "V2SF")  (V4HF "V2HF")
> >> >   (V8HF "V4HF")  (V2DF  "DF")
> >> > - (V8BF "V4BF")])
> >> > + (V8BF "V4BF")
> >> > + (VNx16QI "VNx8QI") (VNx8QI "VNx4QI")
> >> > + (VNx4QI "VNx2QI")  (VNx2QI "QI")
> >> > + (VNx8HI "VNx4HI")  (VNx4HI "VNx2HI") (VNx2HI "HI")
> >> > + (VNx8HF "VNx4HF")  (VNx4HF "VNx2HF") (VNx2HF "HF")
> >> > + (VNx8BF "VNx4BF")  (VNx4BF "VNx2BF") (VNx2BF "BF")
> >> > + (VNx4SI "VNx2SI")  (VNx2SI "SI")
> >> > + (VNx4SF "VNx2SF")  (VNx2SF "SF")
> >> > + (VNx2DI "DI")  (VNx2DF "DF")])
> >>
> >> Are the x2 entries necessary, given that the new uses are restricted
> >> to NO2E?
> >>
> >
> > No, but I wanted to keep the symmetry with the Adv. SIMD modes.  Since the
> > mode attributes don't really control the number of alternatives I thought it
> > would be better to have the attributes be "fully" defined rather than only
> > the subset I use.
> 
> But these are variable-length modes, so DI is only half of VNx2DI for
> the minimum vector length.  It's less than half for Neoverse V1 or A64FX.
> 
> IMO it'd be better to leave them out for now and define them when needed,
> at which point the right choice would be more obvious.
> 

OK.
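
The point about variable-length modes quoted above can be checked with a little arithmetic.  The sketch below assumes the usual SVE layout, in which VNx2DI fills a whole vector register and DImode is 64 bits; the function name is made up for illustration.

```cpp
#include <cassert>

// DImode is always 64 bits wide.
constexpr unsigned di_bits = 64;

// Number of DI-sized pieces in one VNx2DI vector for a given SVE vector
// length in bits.  VNx2DI occupies the whole vector, so this is simply
// the vector length divided by 64.
constexpr unsigned di_pieces_in_vnx2di (unsigned vector_bits)
{
  return vector_bits / di_bits;
}

// At the 128-bit minimum, DI really is half of VNx2DI ...
static_assert (di_pieces_in_vnx2di (128) == 2, "half at the minimum length");
// ... but it is a quarter with 256-bit vectors (Neoverse V1) and an
// eighth with 512-bit vectors (A64FX).
static_assert (di_pieces_in_vnx2di (256) == 4, "quarter at 256 bits");
static_assert (di_pieces_in_vnx2di (512) == 8, "eighth at 512 bits");
```

So a VHALF entry mapping VNx2DI to "DI" would only describe a true half at the minimum vector length.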

gcc/ChangeLog:

PR target/96342
* config/aarch64/aarch64-sve.md (vec_init): New.
(@aarch64_pack_partial): New.
* config/aarch64/aarch64.cc (aarch64_sve_expand_vector_init_subvector): 
New.
* config/aarch64/iterators.md (SVE_NO2E): New.
(VHALF, Vhalf): Add SVE partial vectors.

gcc/testsuite/ChangeLog:

PR target/96342
* gcc.target/aarch64/vect-simd-clone-2.c: New test.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

-- inline copy of patch --

diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index a72ca2a500d394598268c6adfe717eed94a304b3..8ed4221dbe5c49db97b37f186365fa391900eadb 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -2839,6 +2839,16 @@ (define_expand "vec_init"
   }
 )
 
+(define_expand "vec_init"
+  [(match_operand:SVE_NO2E 0 "register_operand")
+   (match_operand 1 "")]
+  "TARGET_SVE"
+  {
+aarch64_sve_expand_vector_init (operands[0], operands[1]);
+DONE;
+  }
+)
+
 ;; Shift an SVE vector left and insert a scalar into element 0.
 (define_insn "vec_shl_insert_"
   [(set (match_operand:SVE_FULL 0 "register_operand")
@@ -9289,6 +9299,19 @@ (define_insn "vec_pack_trunc_"
   "uzp1\t%0., %1., %2."
 )
 
+;; Integer partial pack packing two partial SVE types into a single full SVE
+;; type of the same element type.  Use UZP1 on the wider type, which discards
+;; the high part of each wide element.  This allows to concat SVE partial types
+;; into a wider vector.
+(define_insn "@aarch64_pack_partial"
+  [(set (match_operand:SVE_NO2E 0 "register_operand" "=w")
+   (vec_concat:SVE_NO2E
+ (match_operand: 1 "register_operand" "w")
+ (match_operand: 2 "register_operand" "w")))]
+  "TARGET_SVE"
+  "uzp1\t%0., %1., %2."
+)
+
 ;; -
 ;;  [INT<-INT] Unpacks
 ;; -
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index de4c0a0783912b54ac35d7c818c24574b27a4ca0..40214e318f3c4e30e619d96073b253887c973efc 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -24859,6 +24859,17 @@ aarch64_sve_expand_vector_init (rtx target, rtx vals)
 v.quick_push (XVECEXP (vals, 0, i));
   v.finalize ();
 
+  /* If we have two elements and are concatting vector.  */
+  machine_mode elem_mode = GET_MODE (v.elt (0));
+  if (nelts == 2 && VECTOR_MODE_P (elem_mode))
+{
+  /* We've failed expansion using a dup.  Try using a cheeky truncate. */
+  rtx arg0 = force_reg (elem_mode, v.elt(0));
+  rtx arg1 = force_reg (elem_mode, v.elt(1));
+  emit_insn (gen_aarch64_pack_partial (mode, target, arg0, arg1));
+  return;
+}
+
   /* If neither sub-vectors of v could be initialize
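
The effect of the partial-pack pattern in the patch above can be modelled on plain arrays.  This is only a sketch under stated assumptions: each 8-bit element of a partial vector is taken to sit in the low byte of a 16-bit container, bytes are laid out little-endian, and the helper names (uzp1, as_bytes, pack_partial) are invented for the example rather than taken from GCC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Model of UZP1: the even-indexed elements of A followed by the
// even-indexed elements of B.
template <typename T>
std::vector<T> uzp1 (const std::vector<T> &a, const std::vector<T> &b)
{
  std::vector<T> out;
  for (std::size_t i = 0; i < a.size (); i += 2)
    out.push_back (a[i]);
  for (std::size_t i = 0; i < b.size (); i += 2)
    out.push_back (b[i]);
  return out;
}

// View a vector of 16-bit containers as little-endian bytes.
std::vector<std::uint8_t> as_bytes (const std::vector<std::uint16_t> &v)
{
  std::vector<std::uint8_t> out;
  for (std::uint16_t c : v)
    {
      out.push_back (static_cast<std::uint8_t> (c & 0xff));  // payload
      out.push_back (static_cast<std::uint8_t> (c >> 8));    // padding
    }
  return out;
}

// Concatenate two partial vectors into one fully packed byte vector.
// Whatever is in the high (padding) byte of each container is dropped.
std::vector<std::uint8_t> pack_partial (const std::vector<std::uint16_t> &lo,
                                        const std::vector<std::uint16_t> &hi)
{
  return uzp1 (as_bytes (lo), as_bytes (hi));
}
```

For example, packing the containers {0xAA01, 0xBB02} and {0xCC03, 0xDD04} gives the bytes 1, 2, 3, 4: the 0xAA..0xDD padding never reaches the result.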

Re: [PATCH] c++: Fix ICEs with large initializer lists or ones including #embed [PR118124]

2024-12-19 Thread Jason Merrill


On 12/19/24 11:07 AM, Jakub Jelinek wrote:

Hi!

The following testcases ICE due to RAW_DATA_CST not being handled where it
should be during ck_list conversions.

The last 2 testcases started ICEing with r15-6339 committed yesterday
(speedup of large initializers), the first two already with r15-5958
(#embed optimization for C++).

For conversion to initializer_list or char/signed char
we can optimize and keep RAW_DATA_CST with adjusted type if we report
narrowing errors if needed, for others this converts each element
separately.


Please add this paragraph as a comment.


Ok for trunk if this passes bootstrap/regtest?  Wouldn't like to
leave this broken over Christmas holidays.

2024-12-19  Jakub Jelinek  

PR c++/118124
* call.cc (convert_like_internal): Handle RAW_DATA_CST in
ck_list handling.  Formatting fixes.

* g++.dg/cpp/embed-15.C: New test.
* g++.dg/cpp/embed-16.C: New test.
* g++.dg/cpp0x/initlist-opt3.C: New test.
* g++.dg/cpp0x/initlist-opt4.C: New test.

--- gcc/cp/call.cc.jj   2024-12-11 17:27:52.481221310 +0100
+++ gcc/cp/call.cc  2024-12-19 16:10:12.977071898 +0100
@@ -8766,8 +8766,8 @@ convert_like_internal (conversion *convs
  
  	if (tree init = maybe_init_list_as_array (elttype, expr))

  {
-   elttype = cp_build_qualified_type
- (elttype, cp_type_quals (elttype) | TYPE_QUAL_CONST);
+   elttype = cp_build_qualified_type (elttype, cp_type_quals (elttype)
+   | TYPE_QUAL_CONST);
array = build_array_of_n_type (elttype, len);
array = build_vec_init_expr (array, init, complain);
array = get_target_expr (array);
@@ -8775,13 +8775,75 @@ convert_like_internal (conversion *convs
  }
else if (len)
  {
-   tree val; unsigned ix;
-
+   tree val;
+   unsigned ix;
tree new_ctor = build_constructor (init_list_type_node, NULL);
  
  	/* Convert all the elements.  */

FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (expr), ix, val)
  {
+   if (TREE_CODE (val) == RAW_DATA_CST)
+ {
+   tree elt_type;
+   conversion *next;
+   if (convs->u.list[ix]->kind == ck_std
+   && (elt_type = convs->u.list[ix]->type)
+   && (TREE_CODE (elt_type) == INTEGER_TYPE
+   || is_byte_access_type (elt_type))
+   && TYPE_PRECISION (elt_type) == CHAR_BIT
+   && (next = next_conversion (convs->u.list[ix]))
+   && next->kind == ck_identity)
+ {
+   if (!TYPE_UNSIGNED (elt_type)
+   && (complain & tf_warning)


We shouldn't check tf_warning here.


+   && (TYPE_UNSIGNED (TREE_TYPE (val))
+   || (TYPE_PRECISION (TREE_TYPE (val))
+   > CHAR_BIT)))
+ for (int i = 0; i < RAW_DATA_LENGTH (val); ++i)
+   if (RAW_DATA_SCHAR_ELT (val, i) < 0)
+ {


Instead, check tf_warning_or_error here, and return error_mark_node if 
it's not set.



+   location_t loc
+ = cp_expr_loc_or_input_loc (val);
+   int savederrorcount = errorcount;
+   permerror_opt (loc, OPT_Wnarrowing,
+  "narrowing conversion of %qd "
+  "from %qH to %qI",
+  RAW_DATA_UCHAR_ELT (val, i),
+  TREE_TYPE (val), elt_type);
+   if (errorcount != savederrorcount)
+ return error_mark_node;
+ }
+   tree sub = copy_node (val);
+   TREE_TYPE (sub) = elt_type;
+   CONSTRUCTOR_APPEND_ELT (CONSTRUCTOR_ELTS (new_ctor),
+   NULL_TREE, sub);
+ }
+   else
+ {
+   for (int i = 0; i < RAW_DATA_LENGTH (val); ++i)
+ {
+   tree elt
+ = build_int_cst (TREE_TYPE (val),
+  RAW_DATA_UCHAR_ELT (val, i));
+   tree sub
+ = convert_like (convs->u.list[ix], elt,
+ fn, argnum, false, false,
+ /*nested_p=*/true, complain);
+   if (sub == error_mark_node)
+ return sub;

Re: [PATCH] c++: Implement for namespace statics CWG 2867 - Order of initialization for structured bindings [PR115769]

2024-12-19 Thread Jason Merrill


On 12/18/24 11:41 AM, Jakub Jelinek wrote:

On Tue, Dec 17, 2024 at 05:34:38PM -0500, Jason Merrill wrote:

On 9/11/24 8:26 AM, Jakub Jelinek wrote:

On Wed, Sep 11, 2024 at 10:16:18PM +1000, Nathaniel Shead wrote:

In the header_module_p case, it is valid to have internal linkage
definitions (e.g. in an anonymous namespace), but in that case the
{static,tls}_aggregates lists should still be in place to be streamed
and everything should work as "normal".


As the patch doesn't touch the streaming of {static,tls}_aggregates
in that case, I guess that means CWG 2867 will not be fixed for those
cases (i.e. temporaries from the structured binding base initialization
will be destructed at the end of that initialization, rather than at the
end of subsequent get initializers); perhaps we should stream the
STATIC_INIT_DECOMP_*BASE_P flags, say by streaming integer_zero_node
or integer_one_node right before the decls and setting the flags again
when streaming them back in.


We don't stream *_aggregates at all; rather, read_var_def builds them up as
we read in the variables.  Can we set the appropriate flags at that time?


I know we don't stream *_aggregates, that makes it even harder.
We'd need to ensure that we stream either all the decls coming from a
namespace scope structured binding, or none of them, in the right order
(no idea if that is already guaranteed somehow or would need to be extra
ensured and where).
Then probably write_var_def would need to stream the
STATIC_INIT_DECOMP_NONBASE_P/STATIC_INIT_DECOMP_BASE_P flags (or arrange for
them to be streamed out later) and then in read_var_def stream that in (or
again arrange for it to be streamed in later).


Might need to add something to find_dependencies.


In order to implement that, I guess I'd first need a testcase I can
look at.  I don't see a single dg-do run testcase in g++.dg/modules/
though.  Would that be something like adding a *.H file with
// { dg-additional-options "-fmodule-header" }
// { dg-module-cmi {} }

and content like in one of the testcases in the patch, with the non-std
stuff all moved into an anonymous namespace, then

// { dg-do run }
// { dg-additional-options "-fmodules-ts" }

import "name.H";

int
main ()
{
   // use the sb vars and perhaps assert counters have the expected state
}
?


Looks right.


For the !header_module_p case, we'll need a testcase too
to make sure it works properly.


For the !header_module_p case, the structured binding initialization would
be handled by the module .o, so importers wouldn't need to worry about it.


Ok.

Jakub
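
The "assert counters have the expected state" idea from this thread can be sketched without modules.  The example below assumes C++17 and uses invented names (pair_like, get_calls, sb_a, sb_b); it only shows a namespace-scope structured binding in an anonymous namespace whose get calls are counted, not the header-unit setup discussed above and not the temporary-lifetime rule of CWG 2867 itself.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>

namespace {
// Counts how often the tuple-like protocol calls get<I>().
int get_calls = 0;

struct pair_like {
  int first;
  int second;
  template <std::size_t I> int get () const
  {
    ++get_calls;
    return I == 0 ? first : second;
  }
};
}

// Opt pair_like into the tuple-like structured binding protocol.
namespace std {
template <> struct tuple_size<pair_like> : integral_constant<size_t, 2> {};
template <size_t I> struct tuple_element<I, pair_like> { using type = int; };
}

namespace {
// Namespace-scope structured binding with internal linkage; it is
// dynamically initialized before main runs.
auto [sb_a, sb_b] = pair_like { 1, 2 };
}
```

A run-time test would then check in main that sb_a and sb_b hold 1 and 2 and that get_calls is 2.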

[patch 2/2, avr] Use new target hook to assemble a variable

2024-12-19 Thread Georg-Johann Lay


The "io", "io_low", and "address" attributes require the definitions of
the respective symbols to be output to asm in a manner that was not
supported until the introduction of the new target hook TARGET_ASM_VARIABLE.

The previous implementation of these attributes abused tls_comm_section,
which is a noswitch section.  Notice that the middle-end doesn't allow
targets to introduce their own custom noswitch sections.
   The tls_comm_section->noswitch.callback allowed for a custom asm output
regardless of -f[no-]data-sections and -f[no-]common.  However, it would
not work with checking enabled due to varasm.cc::assemble_variable()'s

  /* Emulated TLS had better not get this far.  */
  gcc_checking_assert (targetm.have_tls || !DECL_THREAD_LOCAL_P (decl));

This patch avoids that hack and uses the new TARGET_ASM_VARIABLE target
hook to output variables with the mentioned attributes.

Ok for trunk?

Johann

--

AVR: target/112952 - Use new TARGET_ASM_VARIABLE for io, io_low, address.

The "io", "io_low", and "address" attributes require the definitions of
the respective symbols to be output to asm in a manner that was not
supported until the introduction of the new target hook TARGET_ASM_VARIABLE.

The previous implementation of these attributes abused tls_comm_section,
which is a noswitch section.  Notice that the middle-end doesn't allow
targets to introduce their own custom noswitch sections.
   The tls_comm_section->noswitch.callback allowed for a custom asm output
regardless of -f[no-]data-sections and -f[no-]common.  However, it would
not work with checking enabled due to varasm.cc::assemble_variable()'s

  /* Emulated TLS had better not get this far.  */
  gcc_checking_assert (targetm.have_tls || !DECL_THREAD_LOCAL_P (decl));

This patch avoids that hack and uses the new TARGET_ASM_VARIABLE target
hook to output variables with the mentioned attributes.

PR target/112952
gcc/
* config/avr/avr.cc (avr_output_addr_attrib): Rename and
rewrite to avr_asm_variable.
(avr_asm_init_sections): Leave tls_comm_section alone.
(avr_encode_section_info) [io, io_low, address]: Don't
abuse tls_comm_section.
(TARGET_ASM_VARIABLE): New define to avr_asm_variable.
diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index 05a6905b5d6..a326d029412 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -11689,24 +11689,25 @@ avr_output_progmem_section_asm_op (const char *data)
 }
 
 
-/* A noswitch section callback to output symbol definitions for
-   attributes "io", "io_low" and "address".  */
+/* Implement `TARGET_ASM_VARIABLE'.  */
+/* Output symbol definitions for attributes "io", "io_low" and "address".
+   This hook is the only way to output the definitions in the required way
+   since the middle-end makes assumptions on the asm representation that
+   don't hold for these attributes.  */
 
 static bool
-avr_output_addr_attrib (tree decl, const char *name,
-			unsigned HOST_WIDE_INT /* size */,
-			unsigned HOST_WIDE_INT /* align */)
+avr_asm_variable (FILE *stream, tree decl, const char *name)
 {
-  gcc_assert (DECL_RTL_SET_P (decl));
+  rtx mem, symbol;
 
-  FILE *stream = asm_out_file;
-  bool local_p = ! DECL_WEAK (decl) && ! TREE_PUBLIC (decl);
-  rtx symbol, mem = DECL_RTL (decl);
-
-  if (mem != NULL_RTX && MEM_P (mem)
-  && SYMBOL_REF_P ((symbol = XEXP (mem, 0)))
+  if (VAR_P (decl)
+  && DECL_RTL_SET_P (decl)
+  && MEM_P (mem = DECL_RTL (decl))
+  && SYMBOL_REF_P (symbol = XEXP (mem, 0))
   && (SYMBOL_REF_FLAGS (symbol) & (SYMBOL_FLAG_IO | SYMBOL_FLAG_ADDRESS)))
 {
+  bool local_p = ! DECL_WEAK (decl) && ! TREE_PUBLIC (decl);
+
   if (! local_p)
 	{
 	  fprintf (stream, "\t%s\t", DECL_WEAK (decl) ? ".weak" : ".globl");
@@ -11735,8 +11736,6 @@ avr_output_addr_attrib (tree decl, const char *name,
   return true;
 }
 
-  gcc_unreachable ();
-
   return false;
 }
 
@@ -11754,7 +11753,6 @@ avr_asm_init_sections (void)
 readonly_data_section->unnamed.callback = avr_output_data_section_asm_op;
   data_section->unnamed.callback = avr_output_data_section_asm_op;
   bss_section->unnamed.callback = avr_output_bss_section_asm_op;
-  tls_comm_section->noswitch.callback = avr_output_addr_attrib;
 }
 
 
@@ -11974,18 +11972,11 @@ avr_encode_section_info (tree decl, rtx rtl, int new_decl_p)
 	}
 	  else
 	{
-	  /* PR112952: The only way to output a variable declaration in a
-		 custom manner is by means of a noswitch section callback.
-		 There are only three noswitch sections: comm_section,
-		 lcomm_section and tls_comm_section.  And there is no way to
-		 wire a custom noswitch section to a decl.  As lcomm_section
-		 is bypassed with -fdata-sections -fno-common, there is no
-		 other way than making use of tls_comm_section.  As we are
-		 using that section anyway, also use it in the public case.  */
-
-	  DECL_COMMON (decl) = 1;
+	  /* PR112952: Thanks to TARGET_ASM_VARIABLE, we can output
+		 a variable / symbol definition in a c

Re: The COBOL front end, in 8 notes

2024-12-19 Thread Joseph Myers

On Wed, 18 Dec 2024, James K. Lowden wrote:

> * Please make sure to do all regeneration with *unmodified* versions
> [Joseph]
>  - I don't understand. 

You appeared to have regenerated a configure script built with a version 
of autoconf from a GNU/Linux distribution that patched autoconf to add a 
variable runstatedir that's not in the relevant upstream version of 
autoconf (2.69).  Don't do that.  Everyone needs to use the same version 
of autoconf to regenerate checked-in configure scripts, which means 
unmodified 2.69 built directly from upstream sources, not a 
distribution-patched version of 2.69.

> * does not build on Darwin/macOS [Iain]
>  - We have built only on Linux, on aarch64 x86_64 (per arch(1)).
>  - We rely on support for 128-bit integer and floating point to meet
> ISO COBOL requirements.  
>  - We want to build on any 64-bit Posix OS, BE and LE.
>  - 32-bit architectures are not a consideration.

It's still not clear here what is about the host and what is about the 
target.

* It would be odd to care whether the host is 32-bit or 64-bit.

* If you care about whether the target is 32-bit or 64-bit, that might 
affect which multilibs you build the runtime library for - but the choice 
between 32-bit and 64-bit is a multilib choice on several common 
platforms, so you can't say "disable building the compiler for 32-bit 
targets", since x86_64-linux-gnu and i686-linux-gnu can be targets that 
are both 32-bit and 64-bit depending on whether -m32 / -m64 / -mx32 are 
passed to the compiler.

* A key overall principle here is: don't break default builds using 
--enable-languages=all.  If the compiler won't build for a particular host 
or target, arranging for the build of COBOL to be disabled there is much 
better than breaking the build.  Likewise for the runtime library.  Of 
course broad portability is nice, but if you're known not to be portable 
to some configuration, at least keep --enable-languages=all builds working 
there.

> * check asprintf [Joseph]
>  - The only examples I find where the returned value of asprintf is not
> checked is in symbols_dump(), which is used only to debug the front
> end. 

I'd still encourage moving to xasprintf everywhere to avoid needing to 
check it locally in each place; using such libiberty interfaces when 
available is less error-prone.
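
To illustrate why a checked wrapper is less error-prone, here is a minimal sketch of the idea.  It is not libiberty's xasprintf: format_or_abort is a hypothetical helper that returns a std::string and aborts on a formatting failure, so callers have no return value to forget to check.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: format into a dynamically sized string, aborting
// on failure instead of handing an error code back to every caller.
std::string format_or_abort (const char *fmt, ...)
{
  va_list args;
  va_start (args, fmt);

  // First pass: measure the required length without writing anything.
  va_list measure;
  va_copy (measure, args);
  const int len = std::vsnprintf (nullptr, 0, fmt, measure);
  va_end (measure);
  if (len < 0)
    {
      va_end (args);
      std::abort ();
    }

  // Second pass: format into a buffer of exactly the right size.
  std::vector<char> buf (static_cast<std::size_t> (len) + 1);
  std::vsnprintf (buf.data (), buf.size (), fmt, args);
  va_end (args);
  return std::string (buf.data (), static_cast<std::size_t> (len));
}
```

A call such as format_or_abort ("%s-%d", "cobol", 42) yields "cobol-42" with no local error handling at the call site.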

> * Static buffers with a PATH_MAX size will probably break the build on
> Hurd host. [Joseph]
>  - Hurd probably not relevant to COBOL
>  - PATH_MAX is Posix. There is no perfect solution. 

As per POSIX, "A definition of one of the symbolic constants in the 
following list shall be omitted from the <limits.h> header on specific 
implementations where the corresponding value is equal to or greater than 
the stated minimum, but where the value can vary depending on the file to 
which it is applied. The actual value supported for a specific pathname 
shall be provided by the pathconf() function.".  You can't rely on there 
being any given maximum.

>  - If any provided filename is too long, the front end should report it
> as an error. That's what tar(1) does. 

The thing to do is just to pass the filename through the compiler (without 
a size limit, use dynamic memory allocation not fixed-size buffers) and 
give errors if opening it fails, at the point where it needs to be opened.  
It's a principle of the GNU Coding Standards not to have arbitrary fixed 
limits.
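
The approach described above (no fixed-size buffer, report the error where the open fails) might look like the following sketch.  The names join_path, open_result and try_open are invented for illustration and are not part of the COBOL front end or of GCC.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Build a path of arbitrary length; std::string grows as needed, so no
// PATH_MAX-sized buffer is involved.
std::string join_path (const std::string &dir, const std::string &name)
{
  if (dir.empty () || dir.back () == '/')
    return dir + name;
  return dir + "/" + name;
}

struct open_result {
  bool ok;
  std::string message;   // empty on success
};

// Try to open the file and report failure at this point, rather than
// rejecting long names up front.
open_result try_open (const std::string &path)
{
  std::ifstream in (path);
  if (!in)
    return { false, "cannot open '" + path + "'" };
  return { true, "" };
}
```

A name that is too long for the host simply fails in try_open like any other unopenable file, and the caller can turn the message into a diagnostic.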

>  - Regarding translation support, we would prefer to add that soon
> after the front end is accepted. 

If you use the GCC diagnostic machinery (calling existing functions such 
as error_at and warning_at with gmsgid parameters), translations should 
mostly work automatically.

> * please make sure that every function has such a comment [Josef]
>  - There are 2074 functions, 1295 of which are static. I estimate the
> effort would require 86 man-days. 
>  - IMO this is not the best way to write documentation, nor a measure
> of its quality. 

How do you intend to ensure that people other than those who originally 
wrote the code (which practically also includes those who wrote it, ten 
years later when they've forgotten the details) can read and maintain it, 
and in particular can figure out what the intended interface between 
different functions is?  The comments are meant to be a guide to intent, 
both so someone reading a call to a function can figure out what it's 
meant to do without needing to read all the content of the function being 
called (only the comment), and so that if there seems to be a mismatch 
between expectations in the caller and in the callee, there is some 
indication of what was originally intended.

-- 
Joseph S. Myers
josmy...@redhat.com

Re: [PATCH] simplify-rtx: Limit number of elts in when encoding.

2024-12-19 Thread Richard Sandiford

"Robin Dapp"  writes:
>> ...it's not clear to me that we should define the upper bits of the
>> byte to be zero.  What would rely on that?  Is it something that we'd
>> require for all loads and stores of such modes?
>
> Yes, I meant fewer than BITS_PER_UNIT bits in total.  As opposed to SVE's
> "balanced" mask representation in riscv's case the bits are packed at the
> beginning.
>
> Loads and stores of masks only operate on the bits the mask actually
> contains so when we e.g. have four vector units/elements only four
> mask bits are loaded or stored.
>
> The mode size of those very small modes is still one byte, i.e. we have
> padding for them.  But this only matters for the internal representation
> of those modes (and has been the source of several issues already of
> course, the masked else being one of them I believe).
>
> How does encoding work for SVE's small mask modes?  I suppose
>
>   unsigned int elt_bits = vector_element_size (GET_MODE_PRECISION (mode),
>  GET_MODE_NUNITS (mode));
>
> is != 1 but rather adjusted so a byte is filled?

It's 1 for VNx16BI, 2 for VNx8BI, and 4 for VNx4BI.  There are then
respectively 8, 4, and 2 elements per byte.

> For our mask modes anything else but zero padding makes no sense, so how
> could we clarify this?

Generally, we don't try to maintain the value of padding bits.  E.g.
if we have a single-byte V4BI mode with 4 bits of padding, doing:

  (set (reg:QI R1) (subreg:QI (reg:V4BI R2) 0))

would not necessarily leave the upper 4 bits of R2 as zero, and so
it would not be valid to optimise:

  (set (reg:QI R1) (and:QI (subreg:QI (reg:V4BI R2) 0) (const_int 15)))

to the move above.  Similarly, on SVE:

  (set (reg:VNx8BI R1) (subreg:VNx8BI (reg:VNx4BI R2) 0))

sets Nx4 of the R1 elements to undefined values.  The even elements are
defined, the odd elements are not.

That's why it seems like forcing the padding is masking a bug elsewhere.
In general, things should work whatever value the padding bits have.

Thanks,
Richard
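
The numbers in this reply can be reproduced with a small sketch.  It assumes 8-bit bytes and, for the V4BI case, that the four mask bits occupy the low bits of the byte as in the (const_int 15) example; the function names are made up.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned bits_per_unit = 8;

// Elements per byte for a mask mode whose elements take ELT_BITS bits:
// 1 bit (VNx16BI) -> 8, 2 bits (VNx8BI) -> 4, 4 bits (VNx4BI) -> 2.
constexpr unsigned elements_per_byte (unsigned elt_bits)
{
  return bits_per_unit / elt_bits;
}

// Read the four meaningful bits of a single-byte V4BI value.  The AND is
// what makes the result independent of whatever the padding bits hold.
constexpr unsigned v4bi_mask_bits (std::uint8_t byte)
{
  return byte & 0x0fu;
}

static_assert (elements_per_byte (1) == 8, "VNx16BI");
static_assert (elements_per_byte (2) == 4, "VNx8BI");
static_assert (elements_per_byte (4) == 2, "VNx4BI");
```

Both 0x05 and 0xA5 give v4bi_mask_bits equal to 5, which is why code that drops the AND silently depends on the padding being zero.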

[PATCH] forwprop: Fix lane handling for VEC_PERM sequence blending

2024-12-19 Thread Christoph Müllner

In PR117830 a miscompilation of 464.h264ref was reported.
An analysis showed that wrong code was generated because of
unsatisfied assuptions in the code.  This patch addresses
these issues.

The first assuption was that we can independently analyse the two
vec-perms at the start of a vec-perm-simplify sequence and use the
information  later for calculating a final vec-perm selector that
utilizes less lanes.  However, this information does not help much,
because for changing the selector entry, we need to ensure that both
elements of the operand vectors v_1 and v_2 remain equal.
This is addressed by removing the function get_vect_selector_index_map
and checking for this equality in the loop where we create the new
selector.

The calculation of the selector vector for the blended sequence
assumed that the indices of the selector vector of the narrowed
sequences are increasing.  This assuption does not hold in general.
This was fixed by allowing a wrap-around when searching for an empty
lane.

Further, there was an issue in the calculation of the selector vector
entries for the second sequence.  The code did not consider that the
lanes of the second sequence could have been moved.

A relevant property of this patch is that it introduces a
couple of nested loops, where the outer loop iterates from
i=0..nelts and the inner loop iterates from j=0..i.
To avoid performance concerns, a check is introduced that
ensures nelts won't exceed 4 lanes.
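
The wrap-around search for an empty lane mentioned above can be illustrated with a small standalone sketch.  This is not the code from tree-ssa-forwprop.cc; find_free_lane and kLanes are hypothetical names, and the lane count of 4 simply mirrors the limit described in the message.

```cpp
#include <array>
#include <cstddef>
#include <optional>

// Assumed lane count, mirroring the 4-lane limit described above.
constexpr std::size_t kLanes = 4;

// Hypothetical helper: return the first unused lane at or after START,
// continuing from lane 0 once the end of the vector is reached.
inline std::optional<std::size_t>
find_free_lane(const std::array<bool, kLanes>& used, std::size_t start)
{
  for (std::size_t step = 0; step < kLanes; ++step)
    {
      const std::size_t lane = (start + step) % kLanes;
      if (!used[lane])
        return lane;
    }
  return std::nullopt;  // every lane is already occupied
}
```

A search that only walks upwards from START would miss a free lane below it; the modulo step is what lets the search succeed when the occupied lanes are not in increasing order.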

The added test case is derived from h264ref (the other cases from the
benchmark have the same structure and don't provide additional coverage).

Bootstrapped and regression-tested on x86-64 and aarch64.
Further, tested on CPU 2006 h264ref and CPU 2017 x264.

PR117830

gcc/ChangeLog:

* tree-ssa-forwprop.cc (get_vect_selector_index_map):
(recognise_vec_perm_simplify_seq):
(calc_perm_vec_perm_simplify_seqs):
(process_vec_perm_simplify_seq_list):

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/vector-11.c: New test.

Signed-off-by: Christoph Müllner 
---
 gcc/testsuite/gcc.dg/tree-ssa/vector-11.c |  38 
 gcc/tree-ssa-forwprop.cc  | 203 +-
 2 files changed, 162 insertions(+), 79 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vector-11.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vector-11.c 
b/gcc/testsuite/gcc.dg/tree-ssa/vector-11.c
new file mode 100644
index 000..e4102d318d2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vector-11.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -fdump-tree-forwprop1-details -Wno-psabi" } */
+
+typedef int vec __attribute__((vector_size (4 * sizeof (int))));
+
+void f1 (vec *p_v_in, vec *p_v_out_1, vec *p_v_out_2)
+{
+  vec sel00 = { 2, 3, 2, 2 };
+  vec sel01 = { 1, 0, 1, 1 };
+  vec sel10 = { 3, 2, 3, 3 };
+  vec sel11 = { 0, 1, 0, 0 };
+  vec sel = { 0, 5, 2, 7 };
+  vec v_1, v_2, v_x, v_y, v_out_1, v_out_2;
+  vec v_in = *p_v_in;
+
+  /* First vec perm sequence.  */
+  v_1 = __builtin_shuffle (v_in, v_in, sel00);
+  v_2 = __builtin_shuffle (v_in, v_in, sel01);
+  v_x = v_2 - v_1;
+  v_y = v_1 + v_2;
+  v_out_1 = __builtin_shuffle (v_y, v_x, sel);
+
+  /* Second vec perm sequence.  */
+  v_1 = __builtin_shuffle (v_in, v_in, sel10);
+  v_2 = __builtin_shuffle (v_in, v_in, sel11);
+  v_x = v_2 - v_1;
+  v_y = v_1 + v_2;
+  v_out_2 = __builtin_shuffle (v_y, v_x, sel);
+
+  /* Won't blend because the narrowed sequence
+ utilizes three of the four lanes.  */
+
+  *p_v_out_1 = v_out_1;
+  *p_v_out_2 = v_out_2;
+}
+
+/* { dg-final { scan-tree-dump "Vec perm simplify sequences have been blended" "forwprop1" { target { aarch64*-*-* i?86-*-* x86_64-*-* } } } } */
+/* { dg-final { scan-tree-dump "VEC_PERM_EXPR.*{ 2, 7, 2, 6 }" "forwprop1" { target { aarch64*-*-* i?86-*-* x86_64-*-* } } } } */
diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
index 7cae08f0d79..dae8c2f435b 100644
--- a/gcc/tree-ssa-forwprop.cc
+++ b/gcc/tree-ssa-forwprop.cc
@@ -3479,41 +3479,6 @@ fwprop_ssa_val (tree name)
   return name;
 }
 
-/* Get an index map from the provided vector permute selector
-   and return the number of unique indices.
-   E.g.: { 1, 3, 1, 3 } -> <0, 1, 0, 1>, 2
-{ 0, 2, 0, 2 } -> <0, 1, 0, 1>, 2
-{ 3, 2, 1, 0 } -> <0, 1, 2, 3>, 4.  */
-
-static unsigned int
-get_vect_selector_index_map (tree sel, vec *index_map)
-{
-  gcc_assert (VECTOR_CST_NELTS (sel).is_constant ());
-  unsigned int nelts = VECTOR_CST_NELTS (sel).to_constant ();
-  unsigned int n = 0;
-
-  for (unsigned int i = 0; i < nelts; i++)
-{
-  /* Extract the i-th value from the selector.  */
-  tree sel_cst_tree = VECTOR_CST_ELT (sel, i);
-  unsigned int sel_cst = TREE_INT_CST_LOW (sel_cst_tree);
-
-  unsigned int j = 0;
-  for (; j <= i; j++)
-   {
- tree prev_sel_cst_tree = VECTOR_CST_ELT (sel, j);
- unsigned int prev_sel_cst
-   = TREE_INT_CST_LOW (prev_sel_cst_t

[PATCH] c/c++: UX improvements to 'too {few, many} arguments' errors [PR118112]

2024-12-19 Thread David Malcolm

Consider this case of a bad call to a callback function (perhaps
due to C23 changing the meaning of () in function decls):

struct p {
int (*bar)();
};

void baz() {
struct p q;
q.bar(1);
}

Before this patch the C frontend emits:

t.c: In function 'baz':
t.c:7:5: error: too many arguments to function 'q.bar'
7 | q.bar(1);
  | ^

and the C++ frontend emits:

t.c: In function 'void baz()':
t.c:7:10: error: too many arguments to function
7 | q.bar(1);
  | ~^~~

neither of which give the user much help in terms of knowing what
was expected, and where the relevant declaration is.

With this patch the C frontend emits:

t.c: In function 'baz':
t.c:7:5: error: too many arguments to function 'q.bar'; expected 0, have 1
7 | q.bar(1);
  | ^ ~
t.c:2:15: note: declared here
2 | int (*bar)();
  |   ^~~

(showing the expected vs actual counts, the pertinent field decl, and
underlining the first extraneous argument at the callsite)

and the C++ frontend emits:

t.c: In function 'void baz()':
t.c:7:10: error: too many arguments to function; expected 0, have 1
7 | q.bar(1);
  | ~^~~

(showing the expected vs actual counts; the other data was not accessible
without a more invasive patch)

Similarly, the patch updates the "too few arguments" case to also
show expected vs actual counts.  Doing so requires a tweak to the
wording to say "at least" for the case of variadic fns, and for C++ fns
with default args, where e.g. previously the C FE emitted:

s.c: In function 'test':
s.c:5:3: error: too few arguments to function 'callee'
5 |   callee ();
  |   ^~
s.c:1:6: note: declared here
1 | void callee (const char *, ...);
  |  ^~

with this patch it emits:

s.c: In function 'test':
s.c:5:3: error: too few arguments to function 'callee'; expected at least 1, have 0
5 |   callee ();
  |   ^~
s.c:1:6: note: declared here
1 | void callee (const char *, ...);
  |  ^~

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
OK for trunk?

gcc/c/ChangeLog:
PR c/118112
* c-typeck.cc (inform_declaration): Add "function_expr" param and
use it for cases where we couldn't show the function decl to show
field decls for callbacks.
(build_function_call_vec): Add missing auto_diagnostic_group.
Update for new param of inform_declaration.
(convert_arguments): Likewise.  For the "too many arguments" case
add the expected vs actual counts to the message, and if we have
it, add the location_t of the first surplus param as a secondary
location within the diagnostic.  For the "too few arguments" case,
determine the minimum number of arguments required and add the
expected vs actual counts to the message, tweaking it to "at least"
for variadic functions.

gcc/cp/ChangeLog:
PR c/118112
* typeck.cc (error_args_num): Add params "expected_num",
"actual_num", and "at_least_p".  Compute "too_many_p" from these
rather than have it be a param.  Add expected vs actual counts to
the messages and tweak them for the "at least" case.
(convert_arguments): Update calls to error_args_num to pass in
expected vs actual number, and the "at_least_p", determining this
for the "too few" case.

gcc/testsuite/ChangeLog:
PR c/118112
* c-c++-common/too-few-arguments.c: New test.
* c-c++-common/too-many-arguments.c: New test.
* g++.dg/cpp0x/variadic169.C: Verify the reported expected vs
actual argument counts.
* g++.dg/modules/macloc-1_c.C: Update regexp for addition of param
counts to error message.
* g++.dg/modules/macloc-1_d.C: Likewise.

Signed-off-by: David Malcolm 
---
 gcc/c/c-typeck.cc | 77 ---
 gcc/cp/typeck.cc  | 94 ++
 .../c-c++-common/too-few-arguments.c  | 38 
 .../c-c++-common/too-many-arguments.c | 96 +++
 gcc/testsuite/g++.dg/cpp0x/variadic169.C  |  2 +-
 gcc/testsuite/g++.dg/modules/macloc-1_c.C |  4 +-
 gcc/testsuite/g++.dg/modules/macloc-1_d.C |  4 +-
 7 files changed, 280 insertions(+), 35 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/too-few-arguments.c
 create mode 100644 gcc/testsuite/c-c++-common/too-many-arguments.c

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 902898d1944b..685d490a187f 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -3737,14 +3737,30 @@ build_function_call (location_t loc, tree function, 
tree params)
   return ret;
 }
 
-/* Give a note about the location of the declaration of DECL.  */
+/* Give a note about the location of the declaration of DECL,
+   or, failing that, a pertinent declaration for FUNCTION_EXPR.  */
 
 static void
-inform_declaration (tr

Re: [PATCH] Fortran: potential aliasing of complex pointer inquiry references [PR118120]

2024-12-19 Thread Jerry D


On 12/19/24 1:34 PM, Harald Anlauf wrote:

Dear all,

the check for potential aliasing of lhs and rhs currently shortcuts
if the types differ.  This is a problem if one is of type complex
and the other is of type real (and of the same kind parameter value),
as this ignores that F2008 inquiry references (%RE, %IM) could be
involved.  The attached patch just addresses this shortcut.

This may not be a complete solution, see discussion in the PR,
but is a lightweight solution (for the time being).

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald


I agree with Steve, OK. The inquiry references can be dealt with later.

Regards,

Jerry

[PING]: [PATCH] testsuite: arm: Use effective-target for pr68674.c test

2024-12-19 Thread Torbjorn SVENSSON


Gentle ping :)

Kind regards,
Torbjörn

On 2024-11-14 17:32, Torbjorn SVENSSON wrote:



On 2024-11-14 16:26, Christophe Lyon wrote:

On Fri, 8 Nov 2024 at 18:54, Torbjörn SVENSSON
 wrote:


Ok for trunk and releases/gcc-14?


Can you describe what problem you are trying to fix?

I'm guessing it's similar to your other patch for attr-neon* tests?
And that the best / easiest course of action for the moment is to skip
this test on M-profile?


Exactly.
Without my patch, it would fail for armv8.1-m.main targets with
-mfloat-abi=hard because the include file "arm_neon.h" is not compatible
with Cortex-M. The test cases will therefore fail due to excess errors
(about 2500 lines of errors).
With my patch, we are always testing this in armv7-a context where the
FPU used in the test case is compatible.


Kind regards,
Torbjörn



Thanks,

Christophe


--

gcc/testsuite/ChangeLog:

 * gcc.target/arm/pr68674.c: Use effective-target arm_arch_v7a
 and arm_libc_fp_abi.

Signed-off-by: Torbjörn SVENSSON 
---
  gcc/testsuite/gcc.target/arm/pr68674.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr68674.c b/gcc/testsuite/ 
gcc.target/arm/pr68674.c

index 0b3237458fe..3fd562d0518 100644
--- a/gcc/testsuite/gcc.target/arm/pr68674.c
+++ b/gcc/testsuite/gcc.target/arm/pr68674.c
@@ -1,9 +1,10 @@
  /* PR target/68674 */
  /* { dg-do compile } */
-/* { dg-require-effective-target arm_neon_ok } */
-/* { dg-require-effective-target arm_fp_ok } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
+/* { dg-require-effective-target arm_libc_fp_abi_ok } */
  /* { dg-options "-O2" } */
-/* { dg-add-options arm_fp } */
+/* { dg-add-options arm_arch_v7a } */
+/* { dg-add-options arm_libc_fp_abi } */

  #pragma GCC target ("fpu=vfp")

--
2.25.1

[PING]: [PATCH v3] testsuite: arm: Use effective-target for attr-neon* tests

2024-12-19 Thread Torbjorn SVENSSON


Gentle ping :)

Kind regards,
Torbjörn

On 2024-11-14 17:51, Torbjorn SVENSSON wrote:



On 2024-11-14 16:16, Christophe Lyon wrote:

Hi Torbjörn,

On Sun, 10 Nov 2024 at 10:09, Torbjörn SVENSSON
 wrote:


Changes since v1:

- Changed from arm_neon to arm_arch_v7a for the required effective target.


Changes since v2:

- Added arm_libc_fp_abi as a required effective target.
- Removed arm_neon and arm_vfp from effective target.


With v3, the tests are now tested in armv7-a context in either hard or
softfp mode, depending on how libc was built.


I can see these tests have already quite a bit of history regarding
their dg-effective-target :-)

In your initial email, you said the tests fail for m55hard and
m85hard. So does this mean they pass for m7hard etc ? (I mean without
this patch).


Without the patch, the test cases are listed as unsupported due to
conflicting switches (in this case -mcpu=cortex-m7 and -march=X where X
depends on the test case...). The only target that passes the required
effective-target is armv8.m-main, but it is incompatible with the
include of arm_neon.h.


With the patch, all the tests are executed in armv7-a context, so then 
they all pass.


As Richard said, there seems to be an underlying issue we can fix 
separately.



Ok for trunk and releases/gcc-14?

AFAIU the patch does what was suggested: skip those tests on M-profile.

Does the arm_arch_v7a part also work on gcc-14 given it does not have
the -cpu=unset feature?


I only have a really old build of GCC14 available right now but it 
appears fine. Also, without the unset-feature, if the tested target is 
not compatible with -march=armv7-a, then there would be a warning 
printed and that would be enough to say that the effective-target is not 
fulfilled. The drawback is that there might be a few more combinations
that will now list the test as unsupported rather than fail/pass.


Kind regards,
Torbjörn



Thanks,

Christophe



--

Force armv7-a as the tests require a neon compatible architecture.

gcc/testsuite/ChangeLog:

 * gcc.target/arm/attr-neon-builtin-fail.c: Use effective-target
 arm_arch_v7a.
 * gcc.target/arm/attr-neon-builtin-fail2.c: Likewise.
 * gcc.target/arm/attr-neon-fp16.c: Likewise.
 * gcc.target/arm/attr-neon2.c: Likewise.

Signed-off-by: Torbjörn SVENSSON 
---
  gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c  | 7 ---
  gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail2.c | 6 --
  gcc/testsuite/gcc.target/arm/attr-neon-fp16.c  | 6 --
  gcc/testsuite/gcc.target/arm/attr-neon2.c  | 7 ---
  4 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c b/ 
gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c

index fb6e0b9cd66..143ad9c4908 100644
--- a/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c
+++ b/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c
@@ -1,9 +1,10 @@
  /* Check that calling a neon builtin from a function compiled with 
vfp fails.  */

  /* { dg-do compile } */
-/* { dg-require-effective-target arm_fp_ok } */
-/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
+/* { dg-require-effective-target arm_libc_fp_abi_ok } */
  /* { dg-options "-O2" } */
-/* { dg-add-options arm_fp } */
+/* { dg-add-options arm_arch_v7a } */
+/* { dg-add-options arm_libc_fp_abi } */

  #include 

diff --git a/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail2.c 
b/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail2.c

index 9cb5a2ebb90..39689b7c3c7 100644
--- a/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail2.c
+++ b/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail2.c
@@ -1,8 +1,10 @@
  /* Check that calling a neon builtin from a function compiled with 
vfp fails.  */

  /* { dg-do compile } */
-/* { dg-require-effective-target arm_vfp_ok } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
+/* { dg-require-effective-target arm_libc_fp_abi_ok } */
  /* { dg-options "-O2" } */
-/* { dg-add-options arm_vfp } */
+/* { dg-add-options arm_arch_v7a } */
+/* { dg-add-options arm_libc_fp_abi } */

  extern __simd64_int8_t a, b;

diff --git a/gcc/testsuite/gcc.target/arm/attr-neon-fp16.c b/gcc/ 
testsuite/gcc.target/arm/attr-neon-fp16.c

index d7b75645bc4..9bc6ce635e2 100644
--- a/gcc/testsuite/gcc.target/arm/attr-neon-fp16.c
+++ b/gcc/testsuite/gcc.target/arm/attr-neon-fp16.c
@@ -1,8 +1,10 @@
  /* { dg-do compile } */
  /* { dg-skip-if "-mpure-code supports M-profile only and without 
Neon" { *-*-* } { "-mpure-code" } } */

-/* { dg-require-effective-target arm_fp_ok } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
+/* { dg-require-effective-target arm_libc_fp_abi_ok } */
  /* { dg-options "-mfp16-format=ieee" } */
-/* { dg-add-options arm_fp } */
+/* { dg-add-options arm_arch_v7a } */
+/* { dg-add-options arm_libc_fp_abi } */

  #include "arm_neon.h"

diff --git a/gcc/tests

Re: [PATCH] COBOL 1/8 hdr: header files

2024-12-19 Thread James K. Lowden

On Mon, 16 Dec 2024 23:36:37 + (UTC)
Joseph Myers  wrote:

> > +extern "C"  _Float128 __gg__float128_from_qualified_field
> 
> I'm not entirely sure whether this is host or target code (you always
> need to be clear about which is which in GCC), but in any case, both
> hosts and targets without __int128 or _Float128 are supported in GCC.

In preparing my comprehensive TODO list, these points need
clarification (for us both, I think).   

We are ignoring 32-bit architectures and rely on 128-bit numeric support
to meet ISO COBOL requirements.  I know there's a way to enumerate
supported targets but don't know how.  As of now, any missing support
is reported by the compiler when building gcobol.  

Is there an architecture-feature database within gcc that lists which
ones support _Float128?  

> In general, target code - including headers - should not go under
> gcc/ at all.  And host code shouldn't be using __* identifiers as
> those are reserved.

The above function is implemented in the runtime library.  It is called
from generated code, and from within the library.  We have many such
functions.  They have leading underscores because they're not
intended to be called by any user; that is, they're part of the
implementation. It's my understanding we *are* the implementation to
which such names are reserved. 

> whether this is host or target code 

I think "target" must be the answer? The function is not used to build
gcobol.  The built compiler emits code that calls that function, which
it requires be supplied by libgcobol.  

Does this answer your concerns?  I have it filed under "not a
problem" unless you tell me otherwise.  

--jkl
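
On the question of detecting 128-bit support, one common approach is to test a predefined macro at compile time and provide a fallback.  The sketch below assumes GCC's (and Clang's) predefined __SIZEOF_INT128__ macro, which is only defined on targets that provide __int128; mul_hi64 is a hypothetical example function, not part of gcobol or libgcobol.

```cpp
#include <cstdint>

// Return the upper 64 bits of the 128-bit product A * B.
inline std::uint64_t mul_hi64(std::uint64_t a, std::uint64_t b)
{
#ifdef __SIZEOF_INT128__
  // The target provides __int128, so the compiler can do the work.
  return static_cast<std::uint64_t>(
      (static_cast<unsigned __int128>(a) * b) >> 64);
#else
  // Portable fallback: schoolbook multiplication on 32-bit halves.
  const std::uint64_t a_lo = a & 0xffffffffu, a_hi = a >> 32;
  const std::uint64_t b_lo = b & 0xffffffffu, b_hi = b >> 32;
  const std::uint64_t lo_lo = a_lo * b_lo;
  const std::uint64_t hi_lo = a_hi * b_lo;
  const std::uint64_t lo_hi = a_lo * b_hi;
  const std::uint64_t hi_hi = a_hi * b_hi;
  // This sum cannot overflow 64 bits.
  const std::uint64_t cross
    = (lo_lo >> 32) + (hi_lo & 0xffffffffu) + lo_hi;
  return hi_hi + (hi_lo >> 32) + (cross >> 32);
#endif
}
```

Both branches compute the same result, so code written this way keeps building on hosts or targets that lack the 128-bit type.  Whether that is practical for a whole runtime library is a separate question.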

Re: [PATCH] COBOL 1/8 hdr: header files

2024-12-19 Thread Andrew Pinski

On Thu, Dec 19, 2024 at 11:31 AM James K. Lowden
 wrote:
>
> On Mon, 16 Dec 2024 23:36:37 + (UTC)
> Joseph Myers  wrote:
>
> > > +extern "C"  _Float128 __gg__float128_from_qualified_field
> >
> > I'm not entirely sure whether this is host or target code (you always
> > need to be clear about which is which in GCC), but in any case, both
> > hosts and targets without __int128 or _Float128 are supported in GCC.
>
> In preparing my comprehensive TODO list, these points need
> clarification (for us both, I think).
>
> We are ignoring 32-bit architectures and rely on 128-bit numeric support
> to meet ISO COBOL requirements.  I know there's a way to enumerate
> supported targets but don't know how.  As of now, any missing support
> is reported by the compiler when building gcobol.

Maybe it is better to just use _BitInt instead of __int128. Yes the
number of targets that support _BitInt for C is less than __int128 but
in the future _BitInt will be more supported than __int128 especially
on 32bit targets.  E.g. _BitInt(128) is supported on 32bit x86 while
__int128 is not.

Thanks,
Andrew Pinski

>
> Is there an architecture-feature database within gcc that lists which
> ones support _Float128?
>
> > In general, target code - including headers - should not go under
> > gcc/ at all.  And host code shouldn't be using __* identifiers as
> > those are reserved.
>
> The above function is implemented in the runtime library.  It is called
> from generated code, and from within the library.  We have many such
> functions.  They have leading underscores because they're not
> intended to be called by any user; that is, they're part of the
> implementation. It's my understanding we *are* the implementation to
> which such names are reserved.
>
> > whether this is host or target code
>
> I think "target" must be the answer? The function is not used to build
> gcobol.  The built compiler emits code that calls that function, which
> it requires be supplied by libgcobol.
>
> Does this answer your concerns?  I have it filed under "not a
> problem" unless you tell me otherwise.
>
> --jkl

Re: The COBOL front end, in 8 notes

2024-12-19 Thread Sam James

"James K. Lowden"  writes:

> On Thu, 12 Dec 2024 12:56:36 -0500
> "James K. Lowden"  wrote:
>
>> The following 8 patches constitute the 80 files needed to build and
>> document the COBOL front end.
> [...]
> * does not build on Darwin/macOS [Iain]
> [...]
>  - 32-bit architectures are not a consideration.

As Joseph said then, if that's the case, configure work will be needed
to not regress --enable-languages=all on such platforms.

>
> * building on the compile farm? [Iain]
>  - No, don't know how yet.  Willing.
>
> * does your testing include bootstrap builds? [David]
>  - No, we normally use
> --disable-bootstrap
>   --disable-multilib

Please do at least one bootstrap build before
resubmitting. Ideally have a CI job which runs at least nightly for
bootstrapping.

>
> * How would it be regression tested? [Andi]
>  - need to discuss licensing and feasibility
>

(I really believe this is a must. Not least because middle-end or
backend changes could regress COBOL and we want to detect that.)

> * ideal would be a branch with just the 8 patches [Iain]
>  - I test the patches on the cobol-patched branch, but I don't normally
> push them.

Iain is asking for you to push them to a branch temporarily to make it
easier to fetch and review, given the submission issue here (with not
using git-send-email or similar).

> [...]
> == Infeasible ==
>
> * Please use `git send-email` with threading. [Sam]
>  - dev machines have no email
>

You can use `git format-patch` to create send-email-able patches and
then run `git send-email` on another machine. Or you can attach the
output of `git format-patch` in your usual mail client.

You are not required to use `git send-email` *itself*, but you do need
to submit patches in a well-formed way.

Note that git send-email doesn't require sendmail or something running
locally, it can be given an SMTP server to speak to.

[PATCH] c++, v2: Fix up maybe_init_list_as_array for RAW_DATA_CST [PR118124]

2024-12-19 Thread Jakub Jelinek

On Thu, Dec 19, 2024 at 11:44:54AM -0500, Jason Merrill wrote:
> > --- gcc/cp/call.cc.jj   2024-12-19 16:10:12.977071898 +0100
> > +++ gcc/cp/call.cc  2024-12-19 16:55:40.953546502 +0100
> > @@ -4386,7 +4386,13 @@ maybe_init_list_as_array (tree elttype,
> > if (!is_xible (INIT_EXPR, elttype, copy_argtypes))
> >   return NULL_TREE;
> > -  tree arr = build_array_of_n_type (init_elttype, CONSTRUCTOR_NELTS 
> > (init));
> > +  unsigned int len = CONSTRUCTOR_NELTS (init);
> > +  if (INTEGRAL_TYPE_P (init_elttype))
> > +for (constructor_elt &e: CONSTRUCTOR_ELTS (init))
> > +  if (TREE_CODE (e.value) == RAW_DATA_CST)
> > +   len += RAW_DATA_LENGTH (e.value) - 1;
> 
> Really seems like we could use a function to ask how many elements a
> CONSTRUCTOR initializes, perhaps as a wrapper around
> categorize_ctor_elements?

I think categorize_ctor_elements is heavy-weight and computes tons of info
we don't need here, but more importantly does something different, it recurses
into each of the elements as well.  True, { { 0, 1 }, { 2, 3 } } would likely
fail braced_init_element_type, but still...  And it would be upset about ctors
with yet to be determined types.
Another question is if the function for this purpose should count
RANGE_EXPRs or not, e.g. I think convert_like_internal will simply not
handle them, when the loop is FOR_EACH_CONSTRUCTOR_VALUE it simply doesn't
consider them at all.  I think we currently reject
A a { 1, 2, [ 2 ... 6 ] = 3 };
And am not really sure a function like that would be useful for other FEs
or middle-end, given that C designated initializers can skip or initialize
elements out of order, so simply counting them isn't good enough, one would
need to find the highest index (implicit or explicit) or something similar.

This version just adds a static function so far used just here.

Also, I'm worried about build_array_of_n_type, which currently takes int
argument.  It is true that vectors, CONSTRUCTOR_ELTS etc. usually count
stuff in unsigned int or sometimes even in int, so without #embed
or RAW_DATA_CST it isn't really possible to have 2GB+ or 4GB+ elt
initializers (and without them it would be a compile time nightmare anyway).
With #embed/RAW_DATA_CST it isn't that hard to cross that boundary though,
so the patch changes it to uhwi.  One still needs to be careful not to
cause breakup of the huge RAW_DATA_CSTs, but at least simple
std::initializer_list from say 16GB #embed should work.
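
The counting rule described above can be modelled outside the compiler.  The sketch below uses a hypothetical CtorElt struct in place of GCC's constructor_elt and tree nodes; it only shows why a raw-data element has to contribute its full length, and why the total is kept in a 64-bit type.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one constructor element.
struct CtorElt {
  bool is_raw_data;          // models a RAW_DATA_CST value
  std::uint64_t raw_length;  // models RAW_DATA_LENGTH; unused otherwise
};

// Count the array elements initialized by the list: an ordinary
// element counts once, a raw-data chunk counts once per byte.
inline std::uint64_t count_initialized(const std::vector<CtorElt>& elts)
{
  std::uint64_t len = 0;
  for (const CtorElt& e : elts)
    len += e.is_raw_data ? e.raw_length : 1;
  return len;
}
```

With a single raw-data chunk of, say, 5000000000 bytes the result no longer fits in an int, which is the situation the change of the build_array_of_n_type argument type is meant to cover.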

2024-12-19  Jakub Jelinek  

PR c++/118124
* cp-tree.h (build_array_of_n_type): Change second argument type
from int to unsigned HOST_WIDE_INT.
* tree.cc (build_array_of_n_type): Likewise.
* call.cc (count_ctor_elements): New function.
(maybe_init_list_as_array): Use it instead of CONSTRUCTOR_NELTS.
(convert_like_internal): Use length from init's type instead of
len when handling the maybe_init_list_as_array case.

* g++.dg/cpp0x/initlist-opt5.C: New test.

--- gcc/cp/cp-tree.h.jj 2024-12-19 18:47:01.98895 +0100
+++ gcc/cp/cp-tree.h2024-12-19 19:12:10.252716265 +0100
@@ -8156,7 +8156,7 @@ extern tree build_aggr_init_expr  (tree,
 extern tree get_target_expr(tree,
 tsubst_flags_t = 
tf_warning_or_error);
 extern tree build_cplus_array_type (tree, tree, int is_dep = -1);
-extern tree build_array_of_n_type  (tree, int);
+extern tree build_array_of_n_type  (tree, unsigned HOST_WIDE_INT);
 extern bool array_of_runtime_bound_p   (tree);
 extern bool vla_type_p (tree);
 extern tree build_array_copy   (tree);
--- gcc/cp/tree.cc.jj   2024-12-19 11:35:59.227312977 +0100
+++ gcc/cp/tree.cc  2024-12-19 19:12:34.291385303 +0100
@@ -1207,7 +1207,7 @@ build_cplus_array_type (tree elt_type, t
 /* Return an ARRAY_TYPE with element type ELT and length N.  */

 tree
-build_array_of_n_type (tree elt, int n)
+build_array_of_n_type (tree elt, unsigned HOST_WIDE_INT n)
 {
   return build_cplus_array_type (elt, build_index_type (size_int (n - 1)));
 }
--- gcc/cp/call.cc.jj   2024-12-19 18:50:52.478315892 +0100
+++ gcc/cp/call.cc  2024-12-19 19:13:15.528817544 +0100
@@ -4325,6 +4325,20 @@ has_non_trivial_temporaries (tree expr)
   return false;
 }

+/* Return number of initialized elements in CTOR.  */
+
+static unsigned HOST_WIDE_INT
+count_ctor_elements (tree ctor)
+{
+  unsigned HOST_WIDE_INT len = 0;
+  for (constructor_elt &e: CONSTRUCTOR_ELTS (ctor))
+if (TREE_CODE (e.value) == RAW_DATA_CST)
+  len += RAW_DATA_LENGTH (e.value);
+else
+  ++len;
+  return len;
+}
+
 /* We're initializing an array of ELTTYPE from INIT.  If it seems useful,
return INIT as an array (of its own type) so the caller can initialize the
target array in a loop.  */
@@ -4386,7 +4400,8 @@ maybe_init_list_as_array (tree elttype,
   if (!is_xible (INIT_EXPR, elttype, cop

[PATCH] libstdc++: Improve u8path deprecation warning [PR114925]

2024-12-19 Thread hexagon-recursion

When I try to use std::filesystem::u8path() I get a deprecation message.
It recommends replacing u8path() with path((const char8_t*)&*source)
The code it recommends is undefined behaviour (See 
https://stackoverflow.com/a/57453713/14516046 and 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114925).

P3364 "Remove Deprecated u8path overloads From C++26" 
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3364r0.pdf recommends 
replacing u8path with:

inline auto myu8path(const char* s) {
  std::u8string u8s(s, s+std::strlen(s));
  return std::filesystem::path(u8s);
}
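
For reference, a self-contained variant of the replacement above might look as follows.  The function name path_from_utf8 and the pre-C++20 fallback branch are additions for illustration and are not taken from P3364; the sketch assumes the language feature-test macro __cpp_char8_t is defined whenever std::u8string is available.

```cpp
#include <cstring>
#include <filesystem>
#include <string>

// Build a path from a NUL-terminated UTF-8 string without casting the
// pointer to const char8_t*.
inline std::filesystem::path path_from_utf8(const char* s)
{
#if defined(__cpp_char8_t)
  // Copy the bytes into a u8string; each char is converted to char8_t.
  std::u8string u8s(s, s + std::strlen(s));
  return std::filesystem::path(u8s);
#else
  // Before C++20 there is no char8_t and u8path is not deprecated.
  return std::filesystem::u8path(s);
#endif
}
```

The copy costs an allocation, but it avoids reading char objects through a char8_t pointer, which is the undefined behaviour the old suggestion ran into.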

libstdc++-v3/ChangeLog:

PR libstdc++/114925
* include/bits/fs_path.h: Improve u8path deprecation warning.
---
 libstdc++-v3/include/bits/fs_path.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/bits/fs_path.h 
b/libstdc++-v3/include/bits/fs_path.h
index 62af6d98bb7..cc41e22aa25 100644
--- a/libstdc++-v3/include/bits/fs_path.h
+++ b/libstdc++-v3/include/bits/fs_path.h
@@ -838,7 +838,7 @@ namespace __detail
   template,
   typename _CharT = __detail::__value_type_is_char_or_char8_t<_Source>>
-_GLIBCXX20_DEPRECATED_SUGGEST("path((const char8_t*)&*source)")
+_GLIBCXX20_DEPRECATED_SUGGEST("path(u8string(source, 
source+strlen(source)))")
 inline path
 u8path(const _Source& __source)
 {
-- 
2.47.1

Re: [PATCH] Add RISC-V/rv64gc as a secondary platform

2024-12-19 Thread Jeff Law





On 12/19/24 3:08 PM, Palmer Dabbelt wrote:


I agree lacking B and V makes us very clearly uncompetitive in the space 
where these sort of things matter (ie, binary compatible distros and 
long term stability type things) -- the gap is just too big to close by 
doing clever things in the hardware.  Maybe just B and V isn't enough, 
it's hard to tell, but lacking them seems pretty clearly uncompetitive.


I'm not sure B is so scary on the SW side of things, it's been mostly 
performance issues we've been fixing.  V is huge, though, and we've 
generally found a bunch of V-related functional codegen bugs.  Without 
reliable hardware to test against (and do distro builds and such) it 
just seems premature to declare that being as stable as the other ports 
on the list.

It wouldn't take much to push me into agreeing to B -- it's not scary in 
any way.  There's just notable systems out there that don't implement B, 
but I wouldn't mind leaving them behind for this change.


V has real performance concerns.  I haven't tested it performance-wise 
on the BPI recently, but when I last did the general rule of thumb was 
the more vector you did, the worse it performed *especially* for FP.


jeff

[PATCH] libstdc++: add atomic_ref::address() (P2835R7)

2024-12-19 Thread Giuseppe D'Angelo


Hello,

The attached patch builds on top of the previous one, this time adding 
support for C++26's std::atomic_ref::address(). Tested on x86-64 Linux.


Thank you,
--
Giuseppe D'Angelo
From 0cc16c6e0365308255e02328b0d1f52c0624f8ac Mon Sep 17 00:00:00 2001
From: Giuseppe D'Angelo 
Date: Thu, 19 Dec 2024 18:39:20 +0100
Subject: [PATCH] libstdc++: add atomic_ref::address() (P2835R7)

This commit adds support for the address() getter to atomic_ref,
added by P2835R7 for C++26. The implementation is straightforward, just
return the value of the existing data member.

Technically speaking, P2835R7 has been rebased on top of P3309R3
(constexpr atomic and atomic_ref), marking address() as constexpr.
However P3309R3 has not been implemented yet in libstdc++ -- I'd wager
that adding constexpr to address() can be done at the same time as the
rest of the API. (Also, this particular function doesn't require
anything special in terms of implementation to be marked `constexpr`,
but adding `constexpr` on our own is actually forbidden by
[constexpr.functions], so I've decided not to do it.)
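
A usage sketch, assuming the feature-test value 202411L introduced by this patch: referenced_object is a hypothetical helper name, and the fallback branch simply returns the pointer that address() is expected to report, so the snippet also builds where the new member is not available.

```cpp
#include <atomic>

// Return the address of the object an atomic_ref over VALUE refers to.
inline const int* referenced_object(int& value)
{
#if defined(__cpp_lib_atomic_ref) && __cpp_lib_atomic_ref >= 202411L
  // C++26 with P2835R7: ask the atomic_ref directly.
  std::atomic_ref<int> ref(value);
  return ref.address();
#else
  // address() is not available; this is the pointer it would report.
  return &value;
#endif
}
```

In either configuration the returned pointer compares equal to the address of the int passed in.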

libstdc++-v3/ChangeLog:

	* include/bits/atomic_base.h: Add address() to the various
	specializations of __atomic_ref_base.
	* include/bits/version.def: Bump the feature-testing macro.
	* include/bits/version.h: Regenerate.
	* testsuite/29_atomics/atomic_ref/address.cc: New test.

Signed-off-by: Giuseppe D'Angelo 
---
 libstdc++-v3/include/bits/atomic_base.h   | 16 ++
 libstdc++-v3/include/bits/version.def |  4 ++
 libstdc++-v3/include/bits/version.h   |  7 ++-
 .../29_atomics/atomic_ref/address.cc  | 55 +++
 4 files changed, 81 insertions(+), 1 deletion(-)
 create mode 100644 libstdc++-v3/testsuite/29_atomics/atomic_ref/address.cc

diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
index df642716ce8..0c57f7b59ae 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -1556,6 +1556,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // TODO add const volatile overload
 #endif // __glibcxx_atomic_wait
 
+#if __glibcxx_atomic_ref >= 202411L // C++26
+  _Tp* address() const noexcept { return _M_ptr; }
+#endif
+
 protected:
   _Tp* _M_ptr;
 };
@@ -1690,6 +1694,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // TODO add const volatile overload
 #endif // __glibcxx_atomic_wait
 
+#if __glibcxx_atomic_ref >= 202411L // C++26
+  _Tp* address() const noexcept { return _M_ptr; }
+#endif
+
 protected:
   _Tp* _M_ptr;
 };
@@ -1883,6 +1891,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // TODO add const volatile overload
 #endif // __glibcxx_atomic_wait
 
+#if __glibcxx_atomic_ref >= 202411L // C++26
+  _Fp* address() const noexcept { return _M_ptr; }
+#endif
+
 protected:
   _Fp* _M_ptr;
 };
@@ -2032,6 +2044,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // TODO add const volatile overload
 #endif // __glibcxx_atomic_wait
 
+#if __glibcxx_atomic_ref >= 202411L // C++26
+  _Tp* address() const noexcept { return _M_ptr; }
+#endif
+
 protected:
   static constexpr ptrdiff_t
   _S_type_size(ptrdiff_t __d) noexcept
diff --git a/libstdc++-v3/include/bits/version.def b/libstdc++-v3/include/bits/version.def
index 62b8252e02d..08ca01803c0 100644
--- a/libstdc++-v3/include/bits/version.def
+++ b/libstdc++-v3/include/bits/version.def
@@ -754,6 +754,10 @@ ftms = {
 
 ftms = {
   name = atomic_ref;
+  values = {
+v = 202411;
+cxxmin = 26;
+  };
   values = {
 v = 201806;
 cxxmin = 20;
diff --git a/libstdc++-v3/include/bits/version.h b/libstdc++-v3/include/bits/version.h
index 16cdae920a1..714235a599c 100644
--- a/libstdc++-v3/include/bits/version.h
+++ b/libstdc++-v3/include/bits/version.h
@@ -841,7 +841,12 @@
 #undef __glibcxx_want_atomic_lock_free_type_aliases
 
 #if !defined(__cpp_lib_atomic_ref)
-# if (__cplusplus >= 202002L)
+# if (__cplusplus >  202302L)
+#  define __glibcxx_atomic_ref 202411L
+#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_atomic_ref)
+#   define __cpp_lib_atomic_ref 202411L
+#  endif
+# elif (__cplusplus >= 202002L)
 #  define __glibcxx_atomic_ref 201806L
 #  if defined(__glibcxx_want_all) || defined(__glibcxx_want_atomic_ref)
 #   define __cpp_lib_atomic_ref 201806L
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_ref/address.cc b/libstdc++-v3/testsuite/29_atomics/atomic_ref/address.cc
new file mode 100644
index 000..24cd6a2d877
--- /dev/null
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_ref/address.cc
@@ -0,0 +1,55 @@
+// { dg-do run { target c++26 } }
+
+#include 
+#include 
+#include 
+
+#if !defined(__cpp_lib_atomic_ref)
+#error "__cpp_lib_atomic_ref should have been defined"
+#elif __cpp_lib_atomic_ref < 202411
+#error "__cpp_lib_atomic_ref should have been defined to >= 202411"
+#endif
+
+struct X
+{
+  X() = default;
+  X(int i) : i(i) { }
+  bool operator==(int rhs) const { return i == rhs; }
+  int i;
+};
+
+template

[PATCH] c++, v2: Fix ICEs with large initializer lists or ones including #embed [PR118124]

2024-12-19 Thread Jakub Jelinek

On Thu, Dec 19, 2024 at 11:52:33AM -0500, Jason Merrill wrote:
> Please add this paragraph as a comment.

Ok.

> > +   if (!TYPE_UNSIGNED (elt_type)
> > +   && (complain & tf_warning)
> 
> We shouldn't check tf_warning here.

Oops, you're right.  I saw the complain & tf_warning early exit at the
start of check_narrowing, but that is for C++98 only and I think C++98
code shouldn't get into ck_list handling.

> > +   && (TYPE_UNSIGNED (TREE_TYPE (val))
> > +   || (TYPE_PRECISION (TREE_TYPE (val))
> > +   > CHAR_BIT)))
> > + for (int i = 0; i < RAW_DATA_LENGTH (val); ++i)
> > +   if (RAW_DATA_SCHAR_ELT (val, i) < 0)
> > + {
> 
> Instead, check tf_warning_or_error here, and return error_mark_node if it's
> not set.

check_narrowing actually tests just complain & tf_error:
if not cxx98 nor !CONSTANT_CLASS_P (init) (RAW_DATA_CST is necessarily
constant):
  else if (complain & tf_error)
{
  int savederrorcount = errorcount;
  permerror_opt (loc, OPT_Wnarrowing,
 "narrowing conversion of %qE from %qH to %qI",
 init, ftype, type);
  if (errorcount == savederrorcount)
ok = true;
}
}

  return ok;

So I went with complain & tf_error check and return error_mark_node
if that is 0.
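
To make the condition concrete, here is a minimal sketch in plain C++ that
models the check outside the compiler: when raw bytes are converted to a
signed 8-bit element type, any byte above 127 is a narrowing conversion.
The helper name is hypothetical and no GCC internals are used.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical helper: index of the first byte whose value does not fit in
// a signed 8-bit element (that is, the first byte above 127), if any.
inline std::optional<std::size_t>
first_narrowing_byte(const unsigned char* data, std::size_t length)
{
  for (std::size_t i = 0; i < length; ++i)
    if (data[i] > 127)
      return i;
  return std::nullopt;
}

// Small usage example.
inline void first_narrowing_byte_example()
{
  const unsigned char ok[] = {0, 1, 127};
  const unsigned char bad[] = {1, 127, 128, 255};
  assert(!first_narrowing_byte(ok, sizeof ok).has_value());
  assert(first_narrowing_byte(bad, sizeof bad).value() == 2);
}
```

In the patch, a hit corresponds to the permerror_opt call when
complain & tf_error is set, and to returning error_mark_node otherwise.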

So far lightly tested, ok for trunk this way if it passes bootstrap & testing?

2024-12-19  Jakub Jelinek  

PR c++/118124
* call.cc (convert_like_internal): Handle RAW_DATA_CST in
ck_list handling.  Formatting fixes.

* g++.dg/cpp/embed-15.C: New test.
* g++.dg/cpp/embed-16.C: New test.
* g++.dg/cpp0x/initlist-opt3.C: New test.
* g++.dg/cpp0x/initlist-opt4.C: New test.

--- gcc/cp/call.cc.jj   2024-12-11 17:27:52.481221310 +0100
+++ gcc/cp/call.cc  2024-12-19 18:50:52.478315892 +0100
@@ -8766,8 +8766,8 @@ convert_like_internal (conversion *convs
 
if (tree init = maybe_init_list_as_array (elttype, expr))
  {
-   elttype = cp_build_qualified_type
- (elttype, cp_type_quals (elttype) | TYPE_QUAL_CONST);
+   elttype = cp_build_qualified_type (elttype, cp_type_quals (elttype)
+   | TYPE_QUAL_CONST);
array = build_array_of_n_type (elttype, len);
array = build_vec_init_expr (array, init, complain);
array = get_target_expr (array);
@@ -8775,13 +8775,83 @@ convert_like_internal (conversion *convs
  }
else if (len)
  {
-   tree val; unsigned ix;
-
+   tree val;
+   unsigned ix;
tree new_ctor = build_constructor (init_list_type_node, NULL);
 
/* Convert all the elements.  */
FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (expr), ix, val)
  {
+   if (TREE_CODE (val) == RAW_DATA_CST)
+ {
+   tree elt_type;
+   conversion *next;
+   /* For conversion to initializer_list or
+  initializer_list or initializer_list
+  we can optimize and keep RAW_DATA_CST with adjusted
+  type if we report narrowing errors if needed, for
+  others this converts each element separately.  */
+   if (convs->u.list[ix]->kind == ck_std
+   && (elt_type = convs->u.list[ix]->type)
+   && (TREE_CODE (elt_type) == INTEGER_TYPE
+   || is_byte_access_type (elt_type))
+   && TYPE_PRECISION (elt_type) == CHAR_BIT
+   && (next = next_conversion (convs->u.list[ix]))
+   && next->kind == ck_identity)
+ {
+   if (!TYPE_UNSIGNED (elt_type)
+   && (TYPE_UNSIGNED (TREE_TYPE (val))
+   || (TYPE_PRECISION (TREE_TYPE (val))
+   > CHAR_BIT)))
+ for (int i = 0; i < RAW_DATA_LENGTH (val); ++i)
+   if (RAW_DATA_SCHAR_ELT (val, i) >= 0)
+ continue;
+   else if (complain & tf_error)
+ {
+   location_t loc
+ = cp_expr_loc_or_input_loc (val);
+   int savederrorcount = errorcount;
+   permerror_opt (loc, OPT_Wnarrowing,
+  "narrowing conversion of %qd "
+  "from %qH to %qI",
+  RAW_DATA_UCHAR_ELT (val, i),
+  TREE_TYPE (val),

Re: [Fortran, Patch, PR57598] Fix coarray STOP

2024-12-19 Thread Jerry D


On 12/19/24 4:13 AM, Andre Vehreschild wrote:

Hi all,

attached patch fixes a rather old open issue, that I stumbled upon
while trying to figure out why a test failed on the command line but not
in the testsuite. The implementation of the STOP command in caf_single
did not hand the errorcode over to the OS, as does non-caf STOP and as
it is required by the standard. So I fixed that. I also added reporting
of exceptions to the coarray (ERROR)? STOP routines. For this I have
exported the existing function of the regular gfortran runtime library.
I tried to do this via iexport_proto, but was never able to access the
routine from the caf-library. I always got linker errors.

After fixing caf-STOP the testsuite reported one regression, which I
also fixed in send_by_ref.

Bootstrapped and regtests ok on x86_64-pc-linux-gnu / F41. Ok for
mainline?

Regards,
Andre
--
Andre Vehreschild * Email: vehre ad gcc dot gnu dot org


Yes, this is OK.

Thanks,

Jerry

Re: [PATCH] Add RISC-V/rv64gc as a secondary platform

2024-12-19 Thread Palmer Dabbelt


On Wed, 18 Dec 2024 08:20:46 PST (-0800), jeffreya...@gmail.com wrote:



On 12/17/24 5:11 PM, Palmer Dabbelt wrote:

This came up on IRC this morning and we talked a bit on the patchwork
call this morning.  I'm not really sure what the right answer is here,
but it seems at least reasonable to talk about -- we've got a lot more
testing these days and we've been somewhat reasonable about following
the release stages.  Either way it looks like a mailing list discussion
and this seems like the easiest way to start it.

I figured it'd be best to start with just rv64gc, as that's the target
that is widely used by distros and has hardware to test on.  Hopefully
at some point we'll add a more exciting target, but it seems safer to
start with something small.

Just to be explicit for the wider community.  I'm on board, we're
starting from a relatively conservative place, but that seems like the
right thing to do.


Ya, thanks :)


I can easily see pushing this towards rv64gc with bitmanip+vector in the
future.


I agree lacking B and V makes us very clearly uncompetitive in the space 
where these sort of things matter (ie, binary compatible distros and 
long term stability type things) -- the gap is just too big to close by 
doing clever things in the hardware.  Maybe just B and V isn't enough, 
it's hard to tell, but lacking them seems pretty clearly uncompetitive.


I'm not sure B is so scary on the SW side of things, it's been mostly 
performance issues we've been fixing.  V is huge, though, and we've 
generally found a bunch of V-related functional codegen bugs.  Without 
reliable hardware to test against (and do distro builds and such) it 
just seems premature to declare that being as stable as the other ports 
on the list.


So hopefully we'll get there some day.  If people feel we're ready then 
I'm happy to give it a shot -- we certainly need to get there some day, 
I just don't want to declare we're ready too early.





Jeff

RE: [PATCH] COBOL 1/8 hdr: header files

2024-12-19 Thread Robert Dubner

Joseph, I am Bob Dubner, the other half of the development team for the
COBOL front end.  Conceptually, I regard the front end as having a blurry
line down the middle of it; Jim primarily does parsing, I generate the
GENERIC tree.

> -----Original Message-----
> From: Joseph Myers 
> Sent: Thursday, December 19, 2024 15:18
> To: James K. Lowden 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] COBOL 1/8 hdr: header files
> 
> On Wed, 18 Dec 2024, James K. Lowden wrote:
> 
> > On Mon, 16 Dec 2024 23:36:37 + (UTC) Joseph Myers
> >  wrote:
> >
> > > > +extern "C"  _Float128 __gg__float128_from_qualified_field
> > >
> > > I'm not entirely sure whether this is host or target code (you
> > > always need to be clear about which is which in GCC), but in any
> > > case, both hosts and targets without __int128 or _Float128 are
> > > supported in GCC.
> >
> > In preparing my comprehensive TODO list, these points need
> > clarification (for us both, I think).
> >
> > We are ignoring 32-bit architectures and rely on 128-bit numeric
> > support to meet ISO COBOL requirements.  I know there's a way to
> > enumerate supported targets but don't know how.  As of now, any
> > missing support is reported by the compiler when building gcobol.
> 
> Could you give more details of what is required or optional in ISO
> COBOL?  Looking at ISO/IEC 1989:2023 (and searching for "128"), I see,
> for example, in A.3 item 17, "The usages FLOAT-BINARY-32,
> FLOAT-BINARY-64 and FLOAT-BINARY-128 are dependent on the capabilities
> of the processor.".

In the document you mention, we have section "8.3.3.3.2 Fixed-point
numeric literals", which specifies that "...shall allow for fixed-point
numeric literals of 1 through 31 digits in length".  COBOL provides that
fixed-point values can have decimal places, but they are not stored as
floating point.  A data description of "PICTURE 99V999" means that the
data structure can hold five digits, with an implied decimal point at the
'V'.  If the DISPLAY usage is specified, then the value 12.345 is stored
as the characters "12345" (In ASCII, 0x31 through 0x35).  If a binary
USAGE is specified, then the binary value 12345 (0x3039) is stored in
memory.  The number of bytes, and whether or not it is stored as big- or
little-endian, is also determined by the data description.  Yes, all that
is part of the language.  Welcome to COBOL.
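
A minimal sketch of that storage model, for readers more at home in C++
than COBOL. The type and function names are hypothetical, and the
four-byte big-endian layout is only an assumption for illustration; as
noted above, the real size and byte order come from the data description.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical model of a PICTURE 99V999 item: five decimal digits with an
// implied decimal point after the second one.  The value 12.345 is held as
// the scaled integer 12345.
struct Pic99V999 {
  std::uint32_t scaled;  // value * 1000, expected range 0..99999
};

// DISPLAY usage: one character per digit; the decimal point is not stored.
inline std::string to_display(Pic99V999 item)
{
  std::string digits(5, '0');
  std::uint32_t rest = item.scaled;
  for (int pos = 4; pos >= 0; --pos) {
    digits[pos] = static_cast<char>('0' + rest % 10);
    rest /= 10;
  }
  return digits;
}

// Binary usage: the scaled integer itself.  A four-byte big-endian layout
// is assumed here purely for illustration.
inline std::array<unsigned char, 4> to_binary_big_endian(Pic99V999 item)
{
  return {static_cast<unsigned char>(item.scaled >> 24),
          static_cast<unsigned char>((item.scaled >> 16) & 0xFF),
          static_cast<unsigned char>((item.scaled >> 8) & 0xFF),
          static_cast<unsigned char>(item.scaled & 0xFF)};
}
```

For 12.345 this gives the characters "12345" in the DISPLAY case and the
trailing bytes 0x30 0x39 in the binary case.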

My implementation attempts to keep intermediate values small.  So, when at
run-time I am adding two values that both fit into 32-bit integers, I try
to do that.  If they get up to 10 or more digits, I switch to 64-bit
integers; when they get up to 20 or more digits I switch to __int128.
__int128 can hold numbers up to 38 digits, and that's the limit of our
implementation, which meets the requirement that a fixed-point number can
be [at least] 31 digits.
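
The width selection just described can be sketched as a small function.
This is an illustration of the stated thresholds only, not the code in the
front end; the enum and function names are hypothetical.

```cpp
#include <cassert>

// Which integer width an intermediate value is computed in.
enum class IntWidth { Bits32, Bits64, Bits128 };

// Upper limit on digits stated for the implementation.
constexpr int kMaxDigits = 38;

constexpr bool digits_supported(int digits)
{
  return digits >= 1 && digits <= kMaxDigits;
}

// Thresholds as described: fewer than 10 digits stays in 32 bits, 10 to 19
// digits uses 64 bits, and 20 or more digits uses __int128.
constexpr IntWidth width_for_digits(int digits)
{
  return digits < 10 ? IntWidth::Bits32
       : digits < 20 ? IntWidth::Bits64
                     : IntWidth::Bits128;
}

static_assert(width_for_digits(9) == IntWidth::Bits32, "below 10 digits");
static_assert(width_for_digits(10) == IntWidth::Bits64, "10 or more digits");
static_assert(width_for_digits(20) == IntWidth::Bits128, "20 or more digits");
static_assert(digits_supported(31) && !digits_supported(39), "1..38 digits");
```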

The following section, "8.3.3.3.3 Floating-point numeric literals",
requires 1 to 36 digits.  I assume it is no coincidence that 36 digits can
be stored in an IEEE 754 binary128.  The ISO float-short, float-long, and
float-extended correspond precisely with the IEEE binary32, binary64, and
binary128 definitions.  So, I used them.

I have been speaking of what Jim and I call "run-time code", and what I
see here is referenced as the target code.

At compile-time (or on the host), we also do numeric calculations.  The
ISO specification allows for compile-time computations specified in the
source code.  In addition, at times I put initial values for the COBOL
variables into the run-time structures that are the COBOL variables.  In
order to create those CONSTRUCTOR nodes we have to do those calculations
at compile time, hence the use of __int128 and _Float128 in the host code.

In the run-time/host code, I have been using intTI_type_node for __int128,
and unsigned_intTI_type_node for __uint128.  For floating point, I've been
using float32_type_node, float64_type_node, and float128_type_node.

If there are recommendations as to what would work better across other
architectures, I am all ears.

As to how we arrived here:  I am very aware of, and a bit in awe of, GCC's
ability to create hosts on one set of architectures that themselves create
executables for other architectures.  Jim and I, however, have had plenty
to do just getting an Ubuntu/x86_64 version of GCC to create Ubuntu/x86_64
COBOL executables.

> 
> The corresponding C and C++ features are optional - some targets support
> them, some don't, the language doesn't require them to be supported.
> (I'm aware of a C++ proposal to require support for 128-bit integers,
> but I'm not sure of its current status.  If it went in, we'd need all
> architecture maintainers for 8-bit/16-bit/32-bit architectures to define
> the ABI for 128-bit integers on their target, in collaboration with the
> maintainers of any ABI document or other implementations.)
> 
> And having support for such features on the target is in any case
> independent of having it on the host.  You can build GCC to

Re: [patch 2/2, avr] Use new target hook to assemble a variable

2024-12-19 Thread Denis Chertykov

On Thu, 19 Dec 2024 at 21:56, Georg-Johann Lay wrote:
>
> The "io", "io_low", and "address" attributes require to asm output
> the definition of respective symbols in a manner that was not supported
> until the introduction of the new target hook TARGET_ASM_VARIABLE.
>
> The previous implementation of these attributes abused tls_common_section
> which is a noswitch section.  Notice that the middle-end doesn't allow to
> introduce own, custom noswitch sections.
> The tls_comm_section->noswitch.callback allowed for a custom asm output
> regardless of -f[no-]data-sections and -f[no-]common.  However, it would
> not work with checking enabled due to varasm.cc::assemble_variable()'s
>
>/* Emulated TLS had better not get this far.  */
>gcc_checking_assert (targetm.have_tls || !DECL_THREAD_LOCAL_P (decl));
>
> This patch avoids that hack and uses the new TARGET_ASM_VARIABLE target
> hook to output variables with the mentioned attributes.
>
> Ok for trunk?
>

Ok.
Approved.

Denis.

> Johann
>
> --
>
> AVR: target/112952 - Use new TARGET_ASM_VARIABLE for io, io_low, address.
>
> The "io", "io_low", and "address" attributes require to asm output
> the definition of respective symbols in a manner that was not supported
> until the introduction of the new target hook TARGET_ASM_VARIABLE.
>
> The previous implementation of these attributes abused tls_common_section
> which is a noswitch section.  Notice that the middle-end doesn't allow to
> introduce own, custom noswitch sections.
> The tls_comm_section->noswitch.callback allowed for a custom asm output
> regardless of -f[no-]data-sections and -f[no-]common.  However, it would
> not work with checking enabled due to varasm.cc::assemble_variable()'s
>
>/* Emulated TLS had better not get this far.  */
>gcc_checking_assert (targetm.have_tls || !DECL_THREAD_LOCAL_P (decl));
>
> This patch avoids that hack and uses the new TARGET_ASM_VARIABLE target
> hook to output variables with the mentioned attributes.
>
> PR target/112952
> gcc/
> * config/avr/avr.cc (avr_output_addr_attrib): Rename and
> rewrite to avr_asm_variable.
> (avr_asm_init_sections): Leave tls_comm_section alone.
> (avr_encode_section_info) [io, io_low, address]: Don't
> abuse tls_comm_section.
> (TARGET_ASM_VARIABLE): New define to avr_asm_variable.

[PATCH] RISC-V: List valid -mtune options only once

2024-12-19 Thread Christoph Müllner

This patch ensures that the list of valid -mtune options
does not contain entries more than once.
The -mtune option accepts CPU identifiers as well as
tuning identifiers and there are cases where a CPU and
its tuning have the same identifier.

PR116347

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_get_valid_option_values):
Skip adding mtune entries that are already in the list.

Signed-off-by: Christoph Müllner 
---
 gcc/common/config/riscv/riscv-common.cc | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 4c9a72d1180..2f85bb21a4c 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -2437,7 +2437,19 @@ riscv_get_valid_option_values (int option_code,
 
const riscv_cpu_info *cpu_info = &riscv_cpu_tables[0];
for (;cpu_info->name; ++cpu_info)
- v.safe_push (cpu_info->name);
+ {
+   /* Skip duplicates.  */
+   bool skip = false;
+   int i;
+   const char *str;
+   FOR_EACH_VEC_ELT (v, i, str)
+ {
+   if (!strcmp (str, cpu_info->name))
+ skip = true;
+ }
+   if (!skip)
+ v.safe_push (cpu_info->name);
+ }
   }
   break;
 case OPT_mcpu_:
-- 
2.47.1
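
For readers who want the idea without the GCC vec API, the
duplicate-skipping step in the patch above can be sketched in standalone
C++. This is only an illustration: std::vector stands in for the GCC
vector type, the helper name is hypothetical, and unlike the patch the
loop returns at the first match instead of setting a flag.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Appends name unless an equal string is already in the list.
inline void push_unique(std::vector<const char*>& names, const char* name)
{
  for (const char* existing : names)
    if (std::strcmp(existing, name) == 0)
      return;  // already listed, skip the duplicate
  names.push_back(name);
}

// Small usage example with made-up identifiers.
inline void push_unique_example()
{
  std::vector<const char*> names;
  push_unique(names, "tune-a");
  push_unique(names, "cpu-b");
  push_unique(names, "tune-a");  // same identifier used for a CPU entry
  assert(names.size() == 2);
}
```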

Re: The COBOL front end, in 8 notes

2024-12-19 Thread James K. Lowden

On Thu, 12 Dec 2024 12:56:36 -0500
"James K. Lowden"  wrote:

> The following 8 patches constitute the 80 files needed to build and
> document the COBOL front end.

Below is a list of issues with the COBOL front end, listed in
order of priority, most important first.  Each is tagged with who
raised it and a brief status.  It is intended to be comprehensive; if
something is missing, please say. 

Our git repository is maintained at

https://gitlab.cobolworx.com/COBOLworx/gcc-cobol

The next set of patches will be against

commit 7d6dc2130970ccf6555d4e9f977515c6c20f7d2f
Author: Jonathan Wakely 
Date:   Fri Dec 13 16:53:06 2024 +

I think the issue that raised the most concern is the one I think is
most important: diagnostics. It was unclear -- still is, to me --
whether the COBOL front end must or should use the gcc diagnostic
framework.  (In my defense, the system goes unmentioned in gccint.info.)

From where I sit now, having not yet opened the hood, I think we could
have gcc diagnostics incorporated in January.  Probably it's one of
those things that's easy if you know how, and getting to step #1 is the
hard part. 

We regard translatable messages as something that can be deferred.  I
would very much like to see the compiler in the hands of users, and
respond to their actual needs. When someone volunteers to translate the
messages, I would make it a priority to support the effort. 

== Build Issues ==

* Some files were mislabeled in the ChangeLog heading [Marc]
 - Hopefully all fixed

* gcc/cobol/config-lang.in was in both bld and cfg patches. [Marc]
 - We Regret the Error
 
* regenerate gcc/cobol/lang.opt.urls [David]
 - David's patch will be included verbatim as patch #9
 
* Makefile variables DESTDIR, YACC, LEX, udfdir [Joseph]
 - IIUC, YACC = BISON and LEX = FLEX
 - unclear what default definition of udfdir should be
 
* install directory for gcobc script
 - will try to correct
 
* uninstall shouldn't remove anything from the build directory [Joseph]
 - understood
 
* If m4 is needed for something other than regenerating configure
scripts, or if the requirements on m4 for COBOL are stricter
 - unsure what to do

* a version of gm4 that recognises ?gnu [Marc]
 - Not as far as I know

* trivial fix to placate older C++ compilers [David]
 - applied, thanks
 
* Please make sure to do all regeneration with *unmodified* versions
[Joseph]
 - I don't understand. 

* does not build on Darwin/macOS [Iain]
 - We have built only on Linux, on aarch64 and x86_64 (per arch(1)).
 - We rely on support for 128-bit integer and floating point to meet
ISO COBOL requirements.  
 - We want to build on any 64-bit Posix OS, BE and LE.
 - 32-bit architectures are not a consideration.

* building on the compile farm? [Iain]
 - No, don't know how yet.  Willing.

* does your testing include bootstrap builds? [David]
 - No, we normally use
--disable-bootstrap
--disable-multilib

* How would it be regression tested? [Andi]
 - need to discuss licensing and feasibility
 
* ideal would be a branch with just the 8 patches [Iain]
 - I test the patches on the cobol-patched branch, but I don't normally
push them.

== Front-end Issues  ==

* Bison dependency [Iain]
 - we require Bison 3.5.1, released nearly 6 years ago.  As a point of
comparison, it predates GCC 8.4.

* Bison dependency needs to be documented [Joseph]
 - will do

* there isn't any HTML documentation [David, Joseph]
 - patches include the changes required to update_web_docs_git for mdoc
files
 - will post HTML and PDF versions for reference
 - can include generated HTML on request
 
* check asprintf [Joseph]
 - The only place I find where the returned value of asprintf is not
checked is symbols_dump(), which is used only to debug the front
end.

* symbol versioned on targets [Jakub]
 - We will endeavor to use symbol versioning
 - We have no experience with it

* Leading underscores [Joseph]
 - pending discussion

* Static buffers with a PATH_MAX size will probably break the build on
Hurd host. [Joseph]
 - Hurd probably not relevant to COBOL
 - PATH_MAX is Posix. There is no perfect solution. 
 - If any provided filename is too long, the front end should report it
as an error. That's what tar(1) does. 

== Diagnostics == 

* diagnostics [David, Joseph]
 - We intended to move to using the gcc diagnostic framework before
submitting the patches.  It's still on the front burner. 
 - Is it OK to apply David's SARIF patch to our branch, subsuming it
into our patches? 
 - prefer explicit locations, check
 - Regarding translation support, we would prefer to add that soon
after the front end is accepted. 

== Infeasible ==

* Please use `git send-email` with threading. [Sam]
 - dev machines have no email
 
* please make sure that every function has such a comment [Josef]
 - There are 2074 functions, 1295 of which are static. I estimate the
effort would require 86 man-days. 
 - IMO this is not the best way to write documentation, n

Re: [PATCH] COBOL 1/8 hdr: header files

2024-12-19 Thread Joseph Myers

On Wed, 18 Dec 2024, James K. Lowden wrote:

> On Mon, 16 Dec 2024 23:36:37 + (UTC)
> Joseph Myers  wrote:
> 
> > > +extern "C"  _Float128 __gg__float128_from_qualified_field
> > 
> > I'm not entirely sure whether this is host or target code (you always
> > need to be clear about which is which in GCC), but in any case, both
> > hosts and targets without __int128 or _Float128 are supported in GCC.
> 
> In preparing my comprehensive TODO list, these points need
> clarification (for us both, I think).   
> 
> We are ignoring 32-bit architectures and rely on 128-bit numeric support
> to meet ISO COBOL requirements.  I know there's a way to enumerate
> supported targets but don't know how.  As of now, any missing support
> is reported by the compiler when building gcobol.  

Could you give more details of what is required or optional in ISO COBOL?  
Looking at ISO/IEC 1989:2023 (and searching for "128"), I see, for 
example, in A.3 item 17, "The usages FLOAT-BINARY-32, FLOAT-BINARY-64 and 
FLOAT-BINARY-128 are dependent on the capabilities of the processor.".

The corresponding C and C++ features are optional - some targets support 
them, some don't, the language doesn't require them to be supported.  
(I'm aware of a C++ proposal to require support for 128-bit integers, but 
I'm not sure of its current status.  If it went in, we'd need all 
architecture maintainers for 8-bit/16-bit/32-bit architectures to define 
the ABI for 128-bit integers on their target, in collaboration with the 
maintainers of any ABI document or other implementations.)

And having support for such features on the target is in any case 
independent of having it on the host.  You can build GCC to run on 32-bit 
Arm (no __int128 or _Float128) as the host, and generate code for AArch64 
(has __int128 and _Float128) as the target.  It would be odd to require a 
64-bit host for a particular language (if you need arithmetic within the 
compiler itself wider than natively supported on the host, we have both 
GCC's wide_int and GMP available; likewise, GCC's real* and MPFR for wider 
floating-point support).
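
To illustrate the point that wide arithmetic does not need a native
128-bit type on the host, here is a minimal sketch of a 128-bit unsigned
addition built from two 64-bit halves. It is not GCC's wide_int and not
GMP; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical two-limb unsigned 128-bit value.
struct U128 {
  std::uint64_t hi;
  std::uint64_t lo;
};

// Adds two 128-bit values, propagating the carry from the low half.
// Overflow out of the high half wraps, as for built-in unsigned types.
inline U128 add(U128 a, U128 b)
{
  U128 r;
  r.lo = a.lo + b.lo;
  const std::uint64_t carry = r.lo < a.lo ? 1 : 0;
  r.hi = a.hi + b.hi + carry;
  return r;
}

// Small usage example: (2^64 - 1) + 1 == 2^64.
inline void add_example()
{
  const U128 sum = add(U128{0, UINT64_MAX}, U128{0, 1});
  assert(sum.hi == 1 && sum.lo == 0);
}
```

Inside the compiler itself, the existing wide_int and GMP facilities
mentioned above would be the natural choice rather than a hand-rolled type.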

If you require __int128 on the target, the toplevel / libgcobol configure 
code will need to handle building libgcobol only for the subset of 
multilibs for the target that have __int128, since lots of targets have 
both 32-bit multilibs (no __int128) and 64-bit multilibs (with __int128).
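
On the source side, library code can at least detect per multilib whether
the type exists. A minimal sketch, assuming a GCC-compatible compiler that
predefines __SIZEOF_INT128__ when __int128 is available; the constant and
function names are hypothetical, and this does not replace the configure
logic described above.

```cpp
#include <cassert>

// True when the compiler targets a configuration with __int128.
constexpr bool target_has_int128 =
#ifdef __SIZEOF_INT128__
  true;
#else
  false;
#endif

// Size of __int128 in bytes, or 0 when the type is unavailable.
inline int int128_size_in_bytes()
{
#ifdef __SIZEOF_INT128__
  return static_cast<int>(sizeof(__int128));
#else
  return 0;
#endif
}
```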

> Is there an architecture-feature database within gcc that lists which
> ones support _Float128?  

_Float128 is generally TFmode.  Look at the TARGET_SCALAR_MODE_SUPPORTED_P 
hooks to see which support TFmode.  The default hook supports it if it's 
used for long double (TARGET_C_MODE_FOR_FLOATING_TYPE hook).

Although most 64-bit targets do support _Float128, that isn't universally 
the case.  For example, powerpc64 big-endian doesn't.

> > In general, target code - including headers - should not go under
> > gcc/ at all.  And host code shouldn't be using __* identifiers as
> > those are reserved.
> 
> The above function is implemented in the runtime library.  It is called
> from generated code, and from within the library.  We have many such
> functions.  They have leading underscores because they're not
> intended to be called by any user; that is, they're part of the
> implementation. It's my understanding we *are* the implementation to
> which such names are reserved. 
> 
> > whether this is host or target code 
> 
> I think "target" must be the answer? The function is not used to build
> gcobol.  The built compiler emits code that calls that function, which
> it requires be supplied by libgcobol.  

OK, so this header should go in the libgcobol/ directory, not in 
gcc/cobol/ (which is where this patch version has it).  The same for any 
other headers declaring functions in libgcobol.

If there's any header that needs to be included in both the compiler and 
the library for some reason (e.g. if you need a header defining constants 
that are used by the library, and the compiler also needs to know when 
generating code), we'll need to look at that in more detail.  But function 
declarations certainly should only be included in one of those two places: 
the compiler's headers should declare functions that are part of the 
compiler, the library's headers should declare functions that are part of 
the library.  And structure declarations can't readily be shared either 
simply because the host and target can have different types.

-- 
Joseph S. Myers
josmy...@redhat.com

Re: [PATCH] COBOL 1/8 hdr: header files

2024-12-19 Thread Joseph Myers

On Thu, 19 Dec 2024, Andrew Pinski wrote:

> Maybe it is better to just use _BitInt instead of __int128. Yes the
> number of targets that support _BitInt for C is less than __int128 but
> in the future _BitInt will be more supported than __int128 especially
> on 32bit targets.  E.g. _BitInt(128) is supported on 32bit x86 while
> __int128 is not.

Given the use of extern "C" in this header code, I think this target 
library may be built as C++ (where we don't support _BitInt).

I'm not sure of the status of C++ P3140 (requiring std::int_least128_t), 
but if it goes in then we'll need lots of less-than-64-bit target 
maintainers (who mostly haven't defined their ABIs for _BitInt yet) to 
(work with ABI maintainers / other implementations to) define ABIs for 
__int128 as well (which applies even if those targets don't support COBOL 
but do still support C++).

-- 
Joseph S. Myers
josmy...@redhat.com

Re: The COBOL front end, in 8 notes

2024-12-19 Thread Joseph Myers

On Wed, 18 Dec 2024, James K. Lowden wrote:

> I think the issue that raised the most concern is the one I think is
> most important: diagnostics. It was unclear -- still is, to me --
> whether the COBOL front end must or should use the gcc diagnostic
> framework.  (In my defense, the system goes unmentioned in gccint.info.)

It definitely *should*.  By doing so, you get access for free to so many 
features, including highlighting parts of the message coming from certain 
formats, automatically printing the part of the source code referred to in 
the diagnostic (caret diagnostics), links to documentation for relevant 
warning options (in terminals that support them), SARIF output, Unicode 
quotes, translation, the ability to incrementally add fix-it hints in your 
front end for any cases where they are useful, consistency of diagnostic 
appearance with those diagnostics coming from the GCC middle-end, 

As for "must" - well, Ada doesn't, but for any front end implemented in 
C++ and not shared with a non-GCC back end, it would seem rather 
unfortunate to do something different and inconsistent with the rest of 
GCC.

-- 
Joseph S. Myers
josmy...@redhat.com

[PATCH] Fortran: potential aliasing of complex pointer inquiry references [PR118120]

2024-12-19 Thread Harald Anlauf

Dear all,

the check for potential aliasing of lhs and rhs currently shortcuts
if the types differ.  This is a problem if one is of type complex
and the other is of type real (and of the same kind parameter value),
as this ignores that F2008 inquiry references (%RE, %IM) could be
involved.  The attached patch just addresses this shortcut.

This may not be a complete solution, see discussion in the PR,
but is a lightweight solution (for the time being).
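
As an analogy in C++ (not gfortran code), the following sketch shows why
the aliasing matters: the real parts being written overlap the complex
values being read, so the right-hand side must be evaluated into a
temporary first. Function names are hypothetical; the data mirror the
shape of the new test case in reduced form.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Overlap-safe: evaluate every right-hand side first, then store.
// Mirrors  result = abs(temp)  with result => data(2:)%re.
inline void shift_abs_with_temporary(std::vector<std::complex<float>>& data)
{
  if (data.size() < 2)
    return;
  std::vector<float> tmp(data.size() - 1);
  for (std::size_t i = 0; i + 1 < data.size(); ++i)
    tmp[i] = std::abs(data[i]);
  for (std::size_t i = 0; i + 1 < data.size(); ++i)
    data[i + 1].real(tmp[i]);
}

// Naive: later reads observe earlier writes, which is the wrong result
// when the two views alias.
inline void shift_abs_in_place(std::vector<std::complex<float>>& data)
{
  for (std::size_t i = 0; i + 1 < data.size(); ++i)
    data[i + 1].real(std::abs(data[i]));
}
```

Starting from (-1,0), (-2,0), (-3,0), (-4,0), the version with a temporary
leaves real parts -1, 1, 2, 3, while the in-place version leaves
-1, 1, 1, 1.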

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From c5029d09151292ec2ed1e878d49afc3480476588 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 19 Dec 2024 22:22:52 +0100
Subject: [PATCH] Fortran: potential aliasing of complex pointer inquiry
 references [PR118120]

	PR fortran/118120

gcc/fortran/ChangeLog:

	* trans-array.cc (symbols_could_alias): If one symbol refers to a
	complex type and the other to a real type of the same kind, do not
	a priori exclude the possibility of aliasing.

gcc/testsuite/ChangeLog:

	* gfortran.dg/aliasing_complex_pointer.f90: New test.
---
 gcc/fortran/trans-array.cc| 17 +++-
 .../gfortran.dg/aliasing_complex_pointer.f90  | 27 +++
 2 files changed, 38 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/aliasing_complex_pointer.f90

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 82a2ae1f747..52813857353 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -5344,15 +5344,20 @@ static bool
 symbols_could_alias (gfc_symbol *lsym, gfc_symbol *rsym, bool lsym_pointer,
 		 bool lsym_target, bool rsym_pointer, bool rsym_target)
 {
-  /* Aliasing isn't possible if the symbols have different base types.  */
-  if (gfc_compare_types (&lsym->ts, &rsym->ts) == 0)
-return 0;
+  /* Aliasing isn't possible if the symbols have different base types,
+ except for complex types where an inquiry reference (%RE, %IM) could
+ alias with a real type with the same kind parameter.  */
+  if (!gfc_compare_types (&lsym->ts, &rsym->ts)
+  && !(((lsym->ts.type == BT_COMPLEX && rsym->ts.type == BT_REAL)
+	|| (lsym->ts.type == BT_REAL && rsym->ts.type == BT_COMPLEX))
+	   && lsym->ts.kind == rsym->ts.kind))
+return false;

   /* Pointers can point to other pointers and target objects.  */

   if ((lsym_pointer && (rsym_pointer || rsym_target))
   || (rsym_pointer && (lsym_pointer || lsym_target)))
-return 1;
+return true;

   /* Special case: Argument association, cf. F90 12.4.1.6, F2003 12.4.1.7
  and F2008 12.5.2.13 items 3b and 4b. The pointer case (a) is already
@@ -5363,9 +5368,9 @@ symbols_could_alias (gfc_symbol *lsym, gfc_symbol *rsym, bool lsym_pointer,
 	  || (rsym->attr.dummy && !rsym->attr.contiguous
 	  && (!rsym->attr.dimension
 		  || rsym->as->type == AS_ASSUMED_SHAPE
-return 1;
+return true;

-  return 0;
+  return false;
 }


diff --git a/gcc/testsuite/gfortran.dg/aliasing_complex_pointer.f90 b/gcc/testsuite/gfortran.dg/aliasing_complex_pointer.f90
new file mode 100644
index 000..0ce4e6a8578
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/aliasing_complex_pointer.f90
@@ -0,0 +1,27 @@
+! { dg-do run }
+! PR fortran/118120 - potential aliasing of complex pointer inquiry references
+!
+! Contributed by Slava Zakharin < szakharin at nvidia dot com >
+
+program main
+  implicit none
+  integer :: k
+  complex, target :: data(21)
+  do k=1,21
+ data(k) = cmplx(-k,0.0)
+  end do
+  call test(1, 1, data)
+! print *, data
+  if ( data(1)  /= -1.)   stop 1
+  if (any (data(2:)% re /= [(k,k=1,20)])) stop 2
+contains
+  subroutine test(i, j, data)
+integer :: i, j
+complex, target  :: data(21)
+real, pointer:: result(:,:,:,:)
+complex, pointer :: temp(:,:)
+result(i:i,j:j,1:4,1:5) => data(2:)%re
+temp(1:4,1:5)   => data(1:20)
+result(i,j,:,:) = abs(temp)
+  end subroutine test
+end program main
--
2.35.3

Re: [PATCH] COBOL 1/8 hdr: header files

2024-12-19 Thread Jakub Jelinek

On Thu, Dec 19, 2024 at 09:22:04PM +, Joseph Myers wrote:
> On Thu, 19 Dec 2024, Andrew Pinski wrote:
> 
> > Maybe it is better to just use _BitInt instead of __int128. Yes the
> > number of targets that support _BitInt for C is less than __int128 but
> > in the future _BitInt will be more supported than __int128 especially
> > on 32bit targets.  E.g. _BitInt(128) is supported on 32bit x86 while
> > __int128 is not.
> 
> Given the use of extern "C" in this header code, I think this target 
> library may be built as C++ (where we don't support _BitInt).

If it is for the library, I guess one workaround could be to represent it
as a struct with the right size/alignment and use a separate C source file
for the actual arithmetic on it.
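
A minimal sketch of that idea, assuming a 16-byte size and alignment and
two's complement wrap-around addition.  The type name int128_sketch and
the helper int128_sketch_add are hypothetical and not part of the COBOL
patch; a header shared with C++ would need a different alignment spelling
than C11 _Alignas.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit integer carried as two 64-bit halves.  */
typedef struct
{
  _Alignas (16) uint64_t lo;	/* low 64 bits */
  uint64_t hi;			/* high 64 bits */
} int128_sketch;

static_assert (sizeof (int128_sketch) == 16, "expected a 16-byte struct");
static_assert (_Alignof (int128_sketch) == 16, "expected 16-byte alignment");

/* Wrapping 128-bit addition; this would live in the separate C source.  */
int128_sketch
int128_sketch_add (int128_sketch a, int128_sketch b)
{
  int128_sketch r;
  r.lo = a.lo + b.lo;
  /* Unsigned addition wraps, so r.lo < a.lo exactly when a carry occurred.  */
  r.hi = a.hi + b.hi + (r.lo < a.lo);
  return r;
}
```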

Jakub

Re: [PATCH] Fortran: potential aliasing of complex pointer inquiry references [PR118120]

2024-12-19 Thread Steve Kargl

I'm ok with your patch.  It seems to also catch PR113928.
You may want to give others a chance to chime in.

-- 
steve

On Thu, Dec 19, 2024 at 09:34:38PM +, Harald Anlauf wrote:
> 
> the check for potential aliasing of lhs and rhs currently shortcuts
> if the types differ.  This is a problem if one is of type complex
> and the other is of type real (and of the same kind parameter value),
> as this ignores that F2008 inquiry references (%RE, %IM) could be
> involved.  The attached patch just addresses this shortcut.
> 
> This may not be a complete solution, see discussion in the PR,
> but is a lightweight solution (for the time being).
> 
> Regtested on x86_64-pc-linux-gnu.  OK for mainline?
> 
> Thanks,
> Harald
> 

> From c5029d09151292ec2ed1e878d49afc3480476588 Mon Sep 17 00:00:00 2001
> From: Harald Anlauf 
> Date: Thu, 19 Dec 2024 22:22:52 +0100
> Subject: [PATCH] Fortran: potential aliasing of complex pointer inquiry
>  references [PR118120]
> 
>   PR fortran/118120
> 
> gcc/fortran/ChangeLog:
> 
>   * trans-array.cc (symbols_could_alias): If one symbol refers to a
>   complex type and the other to a real type of the same kind, do not
>   a priori exclude the possibility of aliasing.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gfortran.dg/aliasing_complex_pointer.f90: New test.
> ---
>  gcc/fortran/trans-array.cc| 17 +++-
>  .../gfortran.dg/aliasing_complex_pointer.f90  | 27 +++
>  2 files changed, 38 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gfortran.dg/aliasing_complex_pointer.f90
> 
> diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
> index 82a2ae1f747..52813857353 100644
> --- a/gcc/fortran/trans-array.cc
> +++ b/gcc/fortran/trans-array.cc
> @@ -5344,15 +5344,20 @@ static bool
>  symbols_could_alias (gfc_symbol *lsym, gfc_symbol *rsym, bool lsym_pointer,
>bool lsym_target, bool rsym_pointer, bool rsym_target)
>  {
> -  /* Aliasing isn't possible if the symbols have different base types.  */
> -  if (gfc_compare_types (&lsym->ts, &rsym->ts) == 0)
> -return 0;
> +  /* Aliasing isn't possible if the symbols have different base types,
> + except for complex types where an inquiry reference (%RE, %IM) could
> + alias with a real type with the same kind parameter.  */
> +  if (!gfc_compare_types (&lsym->ts, &rsym->ts)
> +  && !(((lsym->ts.type == BT_COMPLEX && rsym->ts.type == BT_REAL)
> + || (lsym->ts.type == BT_REAL && rsym->ts.type == BT_COMPLEX))
> +&& lsym->ts.kind == rsym->ts.kind))
> +return false;
> 
>/* Pointers can point to other pointers and target objects.  */
> 
>if ((lsym_pointer && (rsym_pointer || rsym_target))
>|| (rsym_pointer && (lsym_pointer || lsym_target)))
> -return 1;
> +return true;
> 
>/* Special case: Argument association, cf. F90 12.4.1.6, F2003 12.4.1.7
>   and F2008 12.5.2.13 items 3b and 4b. The pointer case (a) is already
> @@ -5363,9 +5368,9 @@ symbols_could_alias (gfc_symbol *lsym, gfc_symbol 
> *rsym, bool lsym_pointer,
> || (rsym->attr.dummy && !rsym->attr.contiguous
> && (!rsym->attr.dimension
> || rsym->as->type == AS_ASSUMED_SHAPE
> -return 1;
> +return true;
> 
> -  return 0;
> +  return false;
>  }
> 
> 
> diff --git a/gcc/testsuite/gfortran.dg/aliasing_complex_pointer.f90 
> b/gcc/testsuite/gfortran.dg/aliasing_complex_pointer.f90
> new file mode 100644
> index 000..0ce4e6a8578
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/aliasing_complex_pointer.f90
> @@ -0,0 +1,27 @@
> +! { dg-do run }
> +! PR fortran/118120 - potential aliasing of complex pointer inquiry 
> references
> +!
> +! Contributed by Slava Zakharin < szakharin at nvidia dot com >
> +
> +program main
> +  implicit none
> +  integer :: k
> +  complex, target :: data(21)
> +  do k=1,21
> + data(k) = cmplx(-k,0.0)
> +  end do
> +  call test(1, 1, data)
> +! print *, data
> +  if ( data(1)  /= -1.)   stop 1
> +  if (any (data(2:)% re /= [(k,k=1,20)])) stop 2
> +contains
> +  subroutine test(i, j, data)
> +integer :: i, j
> +complex, target  :: data(21)
> +real, pointer:: result(:,:,:,:)
> +complex, pointer :: temp(:,:)
> +result(i:i,j:j,1:4,1:5) => data(2:)%re
> +temp(1:4,1:5)   => data(1:20)
> +result(i,j,:,:) = abs(temp)
> +  end subroutine test
> +end program main
> --
> 2.35.3
> 


-- 
Steve

[PATCH] c: special-case some "bool" errors with C23 (v2) [PR117629]

2024-12-19 Thread David Malcolm

Here's an updated version of the patch.

Changed in v2:
- distinguish between "bool" and "_Bool" when determining
  standard version
- more test coverage

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
OK for trunk?

This patch attempts to provide better error messages for
code compiled with C23 that hasn't been updated for
"bool", "true", and "false" becoming keywords.

Specifically:

(1) with "typedef int bool;" previously we emitted:

t1.c:7:13: error: two or more data types in declaration specifiers
7 | typedef int bool;
  | ^~~~
t1.c:7:1: warning: useless type name in empty declaration
7 | typedef int bool;
  | ^~~

whereas with this patch we emit:

t1.c:7:13: error: 'bool' cannot be defined via 'typedef'
7 | typedef int bool;
  | ^~~~
t1.c:7:13: note: 'bool' is a keyword with '-std=c23' onwards
t1.c:7:1: warning: useless type name in empty declaration
7 | typedef int bool;
  | ^~~

(2) with "int bool;" previously we emitted:

t2.c:7:5: error: two or more data types in declaration specifiers
7 | int bool;
  | ^~~~
t2.c:7:1: warning: useless type name in empty declaration
7 | int bool;
  | ^~~

whereas with this patch we emit:

t2.c:7:5: error: 'bool' cannot be used here
7 | int bool;
  | ^~~~
t2.c:7:5: note: 'bool' is a keyword with '-std=c23' onwards
t2.c:7:1: warning: useless type name in empty declaration
7 | int bool;
  | ^~~

(3) with "typedef enum { false = 0, true = 1 } _Bool;" previously we
emitted:

t3.c:7:16: error: expected identifier before 'false'
7 | typedef enum { false = 0, true = 1 } _Bool;
  |^
t3.c:7:38: error: expected ';', identifier or '(' before '_Bool'
7 | typedef enum { false = 0, true = 1 } _Bool;
  |  ^
t3.c:7:38: warning: useless type name in empty declaration

whereas with this patch we emit:

t3.c:7:16: error: cannot use keyword 'false' as enumeration constant
7 | typedef enum { false = 0, true = 1 } _Bool;
  |^
t3.c:7:16: note: 'false' is a keyword with '-std=c23' onwards
t3.c:7:38: error: expected ';', identifier or '(' before '_Bool'
7 | typedef enum { false = 0, true = 1 } _Bool;
  |  ^
t3.c:7:38: warning: useless type name in empty declaration
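
For code that has to build both before and after C23, one possible way to
avoid these errors is to guard the legacy definitions by language version.
This is only a sketch and not part of the patch; the function
all_in_percent_range is a made-up example user of bool.

```c
#include <assert.h>

#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
/* C23 and later: bool, true and false are keywords, nothing to declare.  */
#elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
/* C99 to C17: take the definitions from the standard header.  */
#include <stdbool.h>
#else
/* Pre-C99 fallback: the legacy definitions that C23 now rejects.  */
typedef int bool;
enum { false = 0, true = 1 };
#endif

/* Made-up example: true if every value lies in the range 0..100.  */
bool
all_in_percent_range (const int *vals, int n)
{
  int i;

  assert (n >= 0);
  for (i = 0; i < n; i++)
    if (vals[i] < 0 || vals[i] > 100)
      return false;
  return true;
}
```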

gcc/c/ChangeLog:
PR c/117629
* c-decl.cc (declspecs_add_type): Special-case attempts to use
bool as a typedef name or declaration name.
* c-errors.cc (get_std_for_keyword): New.
(add_note_about_new_keyword): New.
* c-parser.cc (report_bad_enum_name): New, split out from...
(c_parser_enum_specifier): ...here, adding handling for RID_FALSE
and RID_TRUE.
* c-tree.h (add_note_about_new_keyword): New decl.

gcc/testsuite/ChangeLog:
PR c/117629
* gcc.dg/auto-type-2.c: Update expected output with _Bool.
* gcc.dg/c23-bool-errors-1.c: New test.
* gcc.dg/c23-bool-errors-2.c: New test.
* gcc.dg/c23-bool-errors-3.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/c/c-decl.cc  | 18 +++-
 gcc/c/c-errors.cc| 36 
 gcc/c/c-parser.cc| 52 +++-
 gcc/c/c-tree.h   |  2 +
 gcc/testsuite/gcc.dg/auto-type-2.c   |  3 +-
 gcc/testsuite/gcc.dg/c23-bool-errors-1.c | 14 +++
 gcc/testsuite/gcc.dg/c23-bool-errors-2.c |  9 
 gcc/testsuite/gcc.dg/c23-bool-errors-3.c | 18 
 8 files changed, 139 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c23-bool-errors-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-bool-errors-2.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-bool-errors-3.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 7abf1921b577..9ad373543f7b 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -12493,8 +12493,22 @@ declspecs_add_type (location_t loc, struct c_declspecs 
*specs,
 "__auto_type".  */
  if (specs->typespec_word != cts_none)
{
- error_at (loc,
-   "two or more data types in declaration specifiers");
+ if (i == RID_BOOL)
+   {
+ auto_diagnostic_group d;
+ if (specs->storage_class == csc_typedef)
+   error_at (loc,
+ "%qs cannot be defined via %<typedef%>",
+ IDENTIFIER_POINTER (type));
+ else
+   error_at (loc,
+ "%qs cannot be used here",
+ IDENTIFIER_POINTER (type));
+ add_note_about_new_keyword (loc, type);
+   }
+ else
+   error_at (loc,
+ "two or more data types in declaration specifiers");
  return specs;
}

RE: [PATCH v1] Match: Refactor the signed SAT_ADD match patterns [NFC]

2024-12-19 Thread Li, Pan2

Kindly ping.

Pan

-----Original Message-----
From: Li, Pan2  
Sent: Tuesday, December 10, 2024 2:28 PM
To: gcc-patches@gcc.gnu.org
Cc: richard.guent...@gmail.com; tamar.christ...@arm.com; juzhe.zh...@rivai.ai; 
kito.ch...@gmail.com; jeffreya...@gmail.com; rdapp@gmail.com; Li, Pan2 

Subject: [PATCH v1] Match: Refactor the signed SAT_ADD match patterns [NFC]

From: Pan Li 

This patch refactors all the signed SAT_ADD patterns,
namely:
* Extract the type check outside.
* Re-arrange the related match pattern forms together.

The test suites below pass with this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.
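
For readers unfamiliar with the forms being matched, the first form
spelled out in the match.pd comments can be written as plain C for
T=int8_t, UT=uint8_t.  This is a sketch only; the function name is made
up, and it assumes the usual wrapping conversion from uint8_t to int8_t
(implementation-defined in ISO C, wrapping in GCC).

```c
#include <assert.h>
#include <stdint.h>

/* The sketch relies on out-of-range conversions to int8_t wrapping.  */
static_assert ((int8_t) 200 == -56, "wrapping conversion assumed");

/* Signed saturating add:
     T sum = (T)((UT)X + (UT)Y)
     SAT_S_ADD = (X ^ sum) & !(X ^ Y) < 0 ? (-(T)(X < 0) ^ MAX) : sum  */
int8_t
sat_s_add_i8 (int8_t x, int8_t y)
{
  int8_t sum = (int8_t) ((uint8_t) x + (uint8_t) y);

  /* Overflow happened iff x and y have the same sign and the sign of
     sum differs from the sign of x.  */
  if (((x ^ sum) & ~(x ^ y)) < 0)
    /* -(x < 0) is 0 or -1, so the XOR yields INT8_MAX or INT8_MIN.  */
    return (int8_t) (-(int8_t) (x < 0) ^ INT8_MAX);

  return sum;
}
```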

gcc/ChangeLog:

* match.pd: Refactor sorts of signed SAT_ADD match patterns.

Signed-off-by: Pan Li 
---
 gcc/match.pd | 140 +--
 1 file changed, 58 insertions(+), 82 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index 55617b21139..dd5302015c7 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3314,90 +3314,66 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 }
 (if (wi::eq_p (trunc_max, int_cst_1) && wi::eq_p (max, int_cst_2)))
 
-/* Signed saturation add, case 1:
-   T sum = (T)((UT)X + (UT)Y)
-   SAT_S_ADD = (X ^ sum) & !(X ^ Y) < 0 ? (-(T)(X < 0) ^ MAX) : sum;
-
-   The T and UT are type pair like T=int8_t, UT=uint8_t.  */
-(match (signed_integer_sat_add @0 @1)
- (cond^ (lt (bit_and:c (bit_xor:c @0 (nop_convert@2 (plus (nop_convert @0)
- (nop_convert @1
-  (bit_not (bit_xor:c @0 @1)))
-   integer_zerop)
-   (bit_xor:c (negate (convert (lt @0 integer_zerop))) max_value)
-   @2)
- (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type
-
-/* Signed saturation add, case 2:
-   T sum = (T)((UT)X + (UT)Y)
-   SAT_S_ADD = (X ^ sum) & !(X ^ Y) >= 0 ? sum : (-(T)(X < 0) ^ MAX);
-
-   The T and UT are type pair like T=int8_t, UT=uint8_t.  */
-(match (signed_integer_sat_add @0 @1)
- (cond^ (ge (bit_and:c (bit_xor @0 (nop_convert@2 (plus (nop_convert @0)
-   (nop_convert @1
-  (bit_not (bit_xor:c @0 @1)))
-   integer_zerop)
-   @2
-   (bit_xor:c (negate (convert (lt @0 integer_zerop))) max_value))
- (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type
-
-/* Signed saturation add, case 3:
-   T sum = (T)((UT)X + (UT)Y)
-   SAT_S_ADD = (X ^ Y) < 0 && (X ^ sum) >= 0 ? (-(T)(X < 0) ^ MAX) : sum;
-
-   The T and UT are type pair like T=int8_t, UT=uint8_t.  */
-(match (signed_integer_sat_add @0 @1)
- (cond^ (bit_and:c (lt (bit_xor @0 (nop_convert@2 (plus (nop_convert @0)
-   (nop_convert @1
-  integer_zerop)
-  (ge (bit_xor:c @0 @1) integer_zerop))
-   (bit_xor:c (nop_convert (negate (nop_convert (convert
- (lt @0 integer_zerop)
-  max_value)
-   @2)
- (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type
-
-/* Signed saturation add, case 4:
-   Z = .ADD_OVERFLOW (X, Y)
-   SAT_S_ADD = IMAGPART_EXPR (Z) != 0 ? (-(T)(X < 0) ^ MAX) : sum;  */
-(match (signed_integer_sat_add @0 @1)
- (cond^ (ne (imagpart (IFN_ADD_OVERFLOW:c@2 @0 @1)) integer_zerop)
-   (bit_xor:c (nop_convert?
-   (negate (nop_convert? (convert (lt @0 integer_zerop)
-  max_value)
-   (realpart @2))
- (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type)
-  && types_match (type, @0, @1
-
-/* Signed saturation add, case 5:
-   T sum = (T)((UT)X + (UT)Y);
-   SAT_S_ADD = (X ^ sum) < 0 & ~((X ^ Y) < 0) ? (-(T)(X < 0) ^ MAX) : sum;
-
-   The T and UT are type pair like T=int8_t, UT=uint8_t.  */
-(match (signed_integer_sat_add @0 @1)
- (cond^ (bit_and:c (lt (bit_xor @0 (nop_convert@2 (plus (nop_convert @0)
+(if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type))
+ (match (signed_integer_sat_add @0 @1)
+  /* T SUM = (T)((UT)X + (UT)Y)
+ SAT_S_ADD = (X ^ SUM) & !(X ^ Y) < 0 ? (-(T)(X < 0) ^ MAX) : SUM  */
+  (cond^ (lt (bit_and:c (bit_xor:c @0 (nop_convert@2 (plus (nop_convert @0)
+  (nop_convert @1
+   (bit_not (bit_xor:c @0 @1)))
+integer_zerop)
+(bit_xor:c (negate (convert (lt @0 integer_zerop))) max_value)
+@2))
+ (match (signed_integer_sat_add @0 @1)
+  /* T SUM = (T)((UT)X + (UT)Y)
+ SAT_S_ADD = (X ^ SUM) & !(X ^ Y) >= 0 ? SUM : (-(T)(X < 0) ^ MAX)  */
+  (cond^ (ge (bit_and:c (bit_xor @0 (nop_convert@2 (plus (nop_convert @0)
 (nop_convert @1
-  integer_zerop)
-  (bit_not (lt (bit_xor:c @0 @1) integer_zerop)))
-   (bit_xor:c (nop_convert (negate (nop_convert (convert
+   (bit_not (bi

[PATCH] Fix compilation error in vmsdbgout_begin_block on VMS targets

2024-12-19 Thread Mark Harmstone

Commit 4ed189854eae ("Add block parameter to begin_block debug hook") changed
the definition of the begin_block function pointer to add another parameter,
but I missed a call in vmsdbgout_begin_block.

Fixes bug #118123.
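
The kind of breakage can be sketched with a cut-down hook table.  All
names below are hypothetical stand-ins (the real hooks take a tree, not a
void pointer): once the function pointer gains a third parameter, every
wrapper that forwards to it has to pass that argument along.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down hook table with the three-parameter hook.  */
struct debug_hooks_sketch
{
  void (*begin_block) (unsigned line, unsigned blocknum, void *block);
};

/* Records what the forwarded-to hook last received.  */
struct begin_block_record
{
  unsigned line;
  unsigned blocknum;
  void *block;
  int count;
};

struct begin_block_record dwarf_seen;

static void
dwarf_begin_block_sketch (unsigned line, unsigned blocknum, void *block)
{
  dwarf_seen.line = line;
  dwarf_seen.blocknum = blocknum;
  dwarf_seen.block = block;
  dwarf_seen.count++;
}

static const struct debug_hooks_sketch dwarf_hooks_sketch
  = { dwarf_begin_block_sketch };

/* Wrapper in the style of vmsdbgout_begin_block: forward all three
   arguments, including the newly added block.  */
void
vms_begin_block_sketch (unsigned line, unsigned blocknum, void *block)
{
  assert (dwarf_hooks_sketch.begin_block != NULL);
  (*dwarf_hooks_sketch.begin_block) (line, blocknum, block);
}
```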

gcc/
* vmsdbgout.cc (vmsdbgout_begin_block): Fix compilation error.
---
 gcc/vmsdbgout.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/vmsdbgout.cc b/gcc/vmsdbgout.cc
index d9e6a8b7b74..204e5695d39 100644
--- a/gcc/vmsdbgout.cc
+++ b/gcc/vmsdbgout.cc
@@ -1231,10 +1231,10 @@ vmsdbgout_end_epilogue (unsigned int line, const char 
*file)
 
 static void
 vmsdbgout_begin_block (unsigned line, unsigned blocknum,
-  tree block ATTRIBUTE_UNUSED)
+  tree block)
 {
   if (write_symbols == VMS_AND_DWARF2_DEBUG)
-(*dwarf2_debug_hooks.begin_block) (line, blocknum);
+(*dwarf2_debug_hooks.begin_block) (line, blocknum, block);
 
   if (debug_info_level > DINFO_LEVEL_TERSE)
 targetm.asm_out.internal_label (asm_out_file, BLOCK_BEGIN_LABEL, blocknum);
-- 
2.45.2

[PATCH v1] RISC-V: Fix the operand alignment for strided load/store pattern [NFC]

2024-12-19 Thread pan2 . li

From: Pan Li 

I noticed the misaligned operand in the strided load/store patterns while
fixing the strided load/store memory alias bug, and would like to align it.

gcc/ChangeLog:

* config/riscv/autovec.md: Align the operand for strided
load/store pattern.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 2529dc77f22..88c0f00e0ea 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2903,7 +2903,7 @@ (define_expand "v3"
 ;; == Strided Load/Store
 ;; =
 (define_expand "mask_len_strided_load_"
-  [(match_operand:V_VLS 0 "register_operand")
+  [(match_operand:V_VLS 0 "register_operand")
(match_operand   1 "pmode_reg_or_0_operand")
(match_operand   2 "pmode_reg_or_0_operand")
(match_operand:  3 "vector_mask_operand")
@@ -2919,7 +2919,7 @@ (define_expand "mask_len_strided_load_"
 (define_expand "mask_len_strided_store_"
   [(match_operand   0 "pmode_reg_or_0_operand")
(match_operand   1 "pmode_reg_or_0_operand")
-   (match_operand:V_VLS 2 "register_operand")
+   (match_operand:V_VLS 2 "register_operand")
(match_operand:  3 "vector_mask_operand")
(match_operand   4 "autovec_length_operand")
(match_operand   5 "const_0_operand")]
-- 
2.43.0

[PATCH v1] RISC-V: Refine strided load/store testcase dump check to tree optimized

2024-12-19 Thread pan2 . li

From: Pan Li 

Like the sat alu related testcases, the dump check of strided load/store
counts how many times the standard name MASK_LEN_STRIDED_LOAD appears in
the rtl dump.  But the output of the rtl expand pass can change with
middle-end changes or debug information.

As a result we need to adjust the dump check counts time and again.  This
patch switches to the tree optimized pass for the standard
name check, which is more stable up to a point.
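
For context, the kind of loop these tests exercise looks roughly like the
following.  This is a sketch under the assumption that
DEF_STRIDED_LD_ST_FORM_1 from strided_ld_st.h expands to a loop of this
shape; the function name here is made up and the header itself is not
shown in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Copy every stride-th element: one strided load and one strided store
   per iteration, which the vectorizer may turn into
   .MASK_LEN_STRIDED_LOAD and .MASK_LEN_STRIDED_STORE.  */
void
strided_copy_f32 (float *restrict out, const float *restrict in,
		  long stride, size_t n)
{
  assert (stride > 0);
  for (size_t i = 0; i < n; i++)
    out[i * stride] = in[i * stride];
}
```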

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f16.c: Take
tree-optimized pass for standard name check, and adjust the times.
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f32.c: Ditto
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f64.c: Ditto
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i16.c: Ditto
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i32.c: Ditto
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i64.c: Ditto
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i8.c: Ditto
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u16.c: Ditto
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u32.c: Ditto
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u64.c: Ditto
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u8.c: Ditto

Signed-off-by: Pan Li 
---
 .../rvv/autovec/strided/strided_ld_st-1-f16.c  | 10 +-
 .../rvv/autovec/strided/strided_ld_st-1-f32.c  | 10 +-
 .../rvv/autovec/strided/strided_ld_st-1-f64.c  | 10 +-
 .../rvv/autovec/strided/strided_ld_st-1-i16.c  | 10 +-
 .../rvv/autovec/strided/strided_ld_st-1-i32.c  | 18 +-
 .../rvv/autovec/strided/strided_ld_st-1-i64.c  | 10 +-
 .../rvv/autovec/strided/strided_ld_st-1-i8.c   | 10 +-
 .../rvv/autovec/strided/strided_ld_st-1-u16.c  | 10 +-
 .../rvv/autovec/strided/strided_ld_st-1-u32.c  | 18 +-
 .../rvv/autovec/strided/strided_ld_st-1-u64.c  | 10 +-
 .../rvv/autovec/strided/strided_ld_st-1-u8.c   | 10 +-
 11 files changed, 63 insertions(+), 63 deletions(-)

diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f16.c
index 4098774ba38..fb0d1d8a449 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f16.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f16.c
@@ -1,24 +1,24 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -mno-vector-strict-align 
-fno-vect-cost-model -fdump-rtl-expand-details" } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -mno-vector-strict-align 
-fno-vect-cost-model -fdump-tree-optimized" } */
 
 #include "strided_ld_st.h"
 
 DEF_STRIDED_LD_ST_FORM_1(_Float16)
 
-/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 4 "expand" { 
target {
+/* { dg-final { scan-tree-dump-times ".MASK_LEN_STRIDED_LOAD " 2 "optimized" { 
target {
  any-opts "-O3"
  no-opts "-mrvv-vector-bits=zvl"
} } } } */
-/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_STORE " 4 "expand" { 
target {
+/* { dg-final { scan-tree-dump-times ".MASK_LEN_STRIDED_STORE " 2 "optimized" 
{ target {
  any-opts "-O3"
  no-opts "-mrvv-vector-bits=zvl"
} } } } */
 
-/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 2 "expand" { 
target {
+/* { dg-final { scan-tree-dump-times ".MASK_LEN_STRIDED_LOAD " 1 "optimized" { 
target {
  any-opts "-O2"
  no-opts "-mrvv-vector-bits=zvl"
} } } } */
-/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_STORE " 2 "expand" { 
target {
+/* { dg-final { scan-tree-dump-times ".MASK_LEN_STRIDED_STORE " 1 "optimized" 
{ target {
  any-opts "-O2"
  no-opts "-mrvv-vector-bits=zvl"
} } } } */
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f32.c
index e1d1063ec8c..48e0a096cf2 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f32.c
@@ -1,24 +1,24 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv64gcv -mabi=lp64d -mno-vector-strict-align 
-fno-vect-cost-model -fdump-rtl-expand-details" } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -mno-vector-strict-align 
-fno-vect-cost-model -fdump-tree-optimized" } */
 
 #include "strided_ld_st.h"
 
 DEF_STRIDED_LD_ST_FORM_1(float)
 
-/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 4 "expand" { 
target {
+/* { dg-final { scan-tree-dump-times ".MASK_LEN_STRIDED_LOAD " 2 "optimized" { 
target {
  any-opts "-O3

[ping][patch] Allow target to choose address-space for artificial rodata lookup tables.

2024-12-19 Thread Georg-Johann Lay


This is a ping for

https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671216.html

Johann


This patch adds a new target hook that allows a target to choose
a non-generic named address space for compiler-generated
lookup tables.

The purpose is that there are cases (namely on avr) where
the generic address space is sub-optimal because it must
put .rodata in RAM.  With this hook it is possible to
choose an address space that's better suited, like the
__flash address space that allocates to .progmem.data which
resides in flash.

The patch passes without regressions on avr.

On x86_64, it bootstraps and tests without regressions.

Ok for trunk?

Johann

p.s.  The feature has been discussed in the lists before,
and in all discussions I failed in getting across why a
different address space is needed.  Some background:

1) On AVR, you cannot just put data in a different section
without also using different instructions to access it.
In general a different section also requires different
address registers and different addressing modes and
different instructions.

2) It is *not* possible to do this during linker relaxation.
You cannot just change register allocation and address registers
in the linker.  You cannot just replace a 16-bit register like
X or Y by a 24-bit address that lives in Z (lower 16 bits) and
in some SFR (upper 8 bits).

3) You cannot just put all static storage read-only data into
a different address space.  For example, it is perfectly fine
for a C/C++ code to define a variable in static storage and
access it in assembly code.  The assembly code must know the
address space of the symbol, or otherwise the code is wrong.

4) From 3) it follows that you can only change the address space
of an object when it is hidden from the user, i.e. the compiler
is building the object and has control over all accesses, and
there's no way the user can get a reference to the object.

To date, there are only 2 lookup tables generated by GCC that
fit these criteria:

A) CSWTCH tables from tree-switch-conversion.cc.

B) crc_table_for_* tables from gimple-crc-optimization.cc.

Though B) may increase the code size by quite a lot.  For example,
size of gcc.dg/torture/crc-2.c will increase by more than 1500%
(and even more when a 24-bit address-space is required).  The
CRC optimization uses some builtin magic, so it's unclear where
and how to introduce a different address space.
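
As an illustration of case A), a switch like the one below is a typical
candidate for switch conversion: the compiler may replace it with a load
from a compiler-generated CSWTCH table, which the user code cannot take
the address of.  Whether the conversion actually happens depends on the
target and optimization level; the function is a made-up example.

```c
#include <assert.h>

/* Maps a month (1..12) to its number of days in a non-leap year.
   Every case returns a constant, so the whole switch can be turned
   into a lookup in a read-only table.  */
int
days_in_month_sketch (int month)
{
  assert (month >= 1 && month <= 12);

  switch (month)
    {
    case 1: return 31;
    case 2: return 28;
    case 3: return 31;
    case 4: return 30;
    case 5: return 31;
    case 6: return 30;
    case 7: return 31;
    case 8: return 31;
    case 9: return 30;
    case 10: return 31;
    case 11: return 30;
    case 12: return 31;
    }
  return 0;
}
```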

--

Allow target to choose address-space for artificial rodata.

gcc/
* coretypes.h (enum artificial_rodata): New enum type.
* doc/tm.texi: Rebuild.
* doc/tm.texi.in (TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA):
New hook.
* target.def (addr_space.for_artificial_rodata): New DEFHOOK.
* targhooks.cc (default_addr_space_convert): New function.
* targhooks.h (default_addr_space_convert): New prototype.
* tree-switch-conversion.cc (build_one_array) :
Set type_quals address-space according to
targetm.addr_space.for_artificial_rodata().

* config/avr/avr.cc (avr_rodata_in_flash_p): Move up.
(TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA): Define to...
(avr_addr_space_for_artificial_rodata): ...this new function.

diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index 7c7736781c8..7dc3eb2016a 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -119,6 +119,25 @@ const avr_addrspace_t avr_addrspace[ADDR_SPACE_COUNT] =
 };
 
 
+#ifdef HAVE_LD_AVR_AVRXMEGA2_FLMAP
+static const bool have_avrxmega2_flmap = true;
+#else
+static const bool have_avrxmega2_flmap = false;
+#endif
+
+#ifdef HAVE_LD_AVR_AVRXMEGA4_FLMAP
+static const bool have_avrxmega4_flmap = true;
+#else
+static const bool have_avrxmega4_flmap = false;
+#endif
+
+#ifdef HAVE_LD_AVR_AVRXMEGA3_RODATA_IN_FLASH
+static const bool have_avrxmega3_rodata_in_flash = true;
+#else
+static const bool have_avrxmega3_rodata_in_flash = false;
+#endif
+
+
 /* Holding RAM addresses of some SFRs used by the compiler and that
are unique over all devices in an architecture like 'avr4'.  */
 
@@ -254,6 +273,31 @@ avr_tolower (char *lo, const char *up)
 }
 
 
+static bool
+avr_rodata_in_flash_p ()
+{
+  switch (avr_arch_index)
+{
+default:
+  break;
+
+case ARCH_AVRTINY:
+  return true;
+
+case ARCH_AVRXMEGA3:
+  return have_avrxmega3_rodata_in_flash;
+
+case ARCH_AVRXMEGA2:
+  return avropt_flmap && have_avrxmega2_flmap && avropt_rodata_in_ram != 1;
+
+case ARCH_AVRXMEGA4:
+  return avropt_flmap && have_avrxmega4_flmap && avropt_rodata_in_ram != 1;
+}
+
+  return false;
+}
+
+
 /* Return chunk of mode MODE of X as an rtx.  N specifies the subreg
byte at which the chunk starts.  N must be an integral multiple
of the mode size.  */
@@ -11123,6 +11167,18 @@ avr_addr_space_diagnose_usage (addr_space_t as, location_t loc)
 }
 
 
+/* Implement `TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA'.  */
+
+static addr_space_t
+avr_addr_space_for_artificial_rodat

Re: [Fortran, Patch, PR57598] Fix coarray STOP

2024-12-19 Thread Andre Vehreschild

Hi Jerry,

thanks for the review. Committed as gcc-15-6383-ga25cc268846

Thanks again,
Andre

On Thu, 19 Dec 2024 11:09:20 -0800
Jerry D  wrote:

> On 12/19/24 4:13 AM, Andre Vehreschild wrote:
> > Hi all,
> >
> > attached patch fixes a rather old open issue, that I stumbled upon
> > while trying to figure, why a test failed on the command line but not
> > in the testsuite. The implementation of the STOP command in caf_single
> > did not hand the errorcode over to the OS, as does non-caf STOP and as
> > it is required by the standard. So I fixed that. I also added reporting
> > of exceptions to the coarray (ERROR)? STOP routines. For this I have
> > exported the existing function of the regular gfortran runtime library.
> > I tried to do this via iexport_proto, but was never able to access the
> > routine from the caf-library. I always got linker errors.
> >
> > After fixing caf-STOP the testsuite reported one regression, which I
> > also fixed in send_by_ref.
> >
> > Bootstrapped and regtests ok on x86_64-pc-linux-gnu / F41. Ok for
> > mainline?
> >
> > Regards,
> > Andre
> > --
> > Andre Vehreschild * Email: vehre ad gcc dot gnu dot org
>
> Yes, this is OK.
>
> Thanks,
>
> Jerry


--
Andre Vehreschild * Email: vehre ad gmx dot de
