Re: [PATCH v3 1/2] RISC-V: Allocate the initial register in the expand phase for the vl of XTheadVector

2025-01-19 Thread Jin Ma
> On 1/17/25 7:37 AM, Jin Ma wrote:
> >>> Since the parameter vl of XTheadVector does not support immediate
> >>> numbers, we need to put it in the register in advance. That generates
> >>> the initial code correctly.
> >>>
> >>>   PR 116593
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>>   * config/riscv/riscv-vector-builtins.cc 
> >>> (function_expander::add_input_operand):
> >>>   Put immediate for vl to GPR for XTheadVector.
> >>
> >> Generally both patches look reasonable to me and the change is less
> >> invasive than going via spec_restriction.
> >>
> >> How was this tested?  The Rivos CI has already picked it up but please
> >> still always specify.  Thanks.
> > 
> > I'm not sure what you mean, but I'll do my best to understand.
> > If you're referring to how to test the logic of this code to ensure it works
> > as intended, it can be quite challenging to explain. Testing this particular
> > code is not straightforward. Even without any specific processing, the
> > predicate "vector_length_operand" still constrains the XTheadVector, and
> > the immediate value for VL will be stored in the register. In this sense,
> > the code itself may seem redundant.
> I suspect Robin is referring to running the GCC testsuite for 
> regressions.  That is standard practice for any patch.
> 
> Essentially build & run make check before and after your patch and 
> verify there are no new failures.  Ideally you'd also verify that your 
> testcase is fixed :-)

I see, thank you very much for dispelling my doubts. As there is currently
no good way to test, perhaps the first patch should be ignored.
Let's look at the second patch, which actually fixes the ICE.

V4:
https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674001.html

Thanks,
Jin

> If you're writing a patch that touches target independent parts of the 
> compiler, then the bar is even higher.  You need to bootstrap & 
> regression test such patches on one of the primary platforms such as 
> x86_64, aarch64.
> 
> Jeff

[PATCH v4] RISC-V: Add a new constraint to ensure that the vl of XTheadVector does not produce a non-zero immediate.

2025-01-19 Thread Jin Ma
Although we have handled the vl of XTheadVector correctly in the
expand phase and predicates, the results show that the work is
still insufficient.

In the curr_insn_transform function, the insn is transformed from:
(insn 69 67 225 12 (set (mem:RVVM8SF (reg/f:DI 218 [ _77 ]) [0  S[128, 128] A32])
(if_then_else:RVVM8SF (unspec:RVVMF4BI [
(const_vector:RVVMF4BI repeat [
(const_int 1 [0x1])
])
(reg:DI 209)
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(reg/v:RVVM8SF 143 [ _xx ])
(mem:RVVM8SF (reg/f:DI 218 [ _77 ]) [0  S[128, 128] A32])))
 (expr_list:REG_DEAD (reg/v:RVVM8SF 143 [ _xx ])
(nil)))
to
(insn 69 284 225 11 (set (mem:RVVM8SF (reg/f:DI 18 s2 [orig:218 _77 ] [218]) [0  S[128, 128] A32])
(if_then_else:RVVM8SF (unspec:RVVMF4BI [
(const_vector:RVVMF4BI repeat [
(const_int 1 [0x1])
])
(const_int 1 [0x1])
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(reg/v:RVVM8SF 104 v8 [orig:143 _xx ] [143])
(mem:RVVM8SF (reg/f:DI 18 s2 [orig:218 _77 ] [218]) [0  S[128, 128] A32])))
 (nil))

The log for the reload pass shows "Changing pseudo 209 in operand 3 of insn 69
on equiv 0x1".
It converts the vl operand in the insn from the expected register (reg:DI 209)
to the constant 1 (const_int 1 [0x1]).

This conversion occurs because, although the predicate for the vl operand is
restricted by "vector_length_operand" in the pattern, the constraint is still
"rK", which allows the transformation.

Changing the constraint of the vl operand in the pattern from "rK" to "rJ"
would prevent this conversion, but unfortunately that would conflict with
RVV (RISC-V Vector Extension).

Based on the review's recommendations, the best solution for now is to create
a new constraint to distinguish between RVV and XTheadVector, which is exactly
what this patch does.
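
As a rough illustration of what the new "vl" constraint accepts, here is a
minimal standalone sketch. It assumes that "K" means a 5-bit unsigned
immediate and "J" means integer zero, as the constraint's description string
in the patch suggests; satisfies_K, satisfies_J and satisfies_vl are
hypothetical stand-ins, not the generated GCC functions.

```cpp
#include <cstdint>

// Hypothetical stand-in for constraint "K": a 5-bit unsigned immediate.
constexpr bool satisfies_K(std::int64_t v) { return v >= 0 && v <= 31; }

// Hypothetical stand-in for constraint "J": integer zero.
constexpr bool satisfies_J(std::int64_t v) { return v == 0; }

// Same boolean shape as the define_constraint "vl" in the patch:
// uimm5 for standard RVV, only zero when XTheadVector is enabled.
constexpr bool satisfies_vl(bool target_xtheadvector, std::int64_t v) {
  return (!target_xtheadvector && satisfies_K(v))
         || (target_xtheadvector && satisfies_J(v));
}

// With RVV, the constant 1 is still an acceptable vl operand.
static_assert(satisfies_vl(false, 1), "RVV accepts uimm5");
// With XTheadVector, the constant 1 is rejected, so the operand must stay
// in a register instead of being replaced by its equivalent constant.
static_assert(!satisfies_vl(true, 1), "XTheadVector rejects non-zero");
static_assert(satisfies_vl(true, 0), "XTheadVector accepts zero");
```

Under these assumptions, the "r" alternative is the only way to satisfy "rvl"
for a non-zero vl on XTheadVector, which is the behaviour the patch is after.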

PR 116593

gcc/ChangeLog:

* config/riscv/constraints.md (vl): New.
* config/riscv/thead-vector.md: Replace rK with rvl.
* config/riscv/vector.md: Likewise.

gcc/testsuite/ChangeLog:

* g++.target/riscv/xtheadvector/pr116593.C: New test.
* g++.target/riscv/xtheadvector/xtheadvector.exp: New test.

Reported-by: nihui 
---
 gcc/config/riscv/constraints.md   |   6 +
 gcc/config/riscv/thead-vector.md  |  18 +-
 gcc/config/riscv/vector.md| 476 +-
 .../g++.target/riscv/xtheadvector/pr116593.C  |  47 ++
 .../riscv/xtheadvector/xtheadvector.exp   |  34 ++
 5 files changed, 334 insertions(+), 247 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/riscv/xtheadvector/pr116593.C
 create mode 100644 gcc/testsuite/g++.target/riscv/xtheadvector/xtheadvector.exp

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index f25975dc0208..ba3c6e6a4c44 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -209,6 +209,12 @@ (define_constraint "vk"
   (and (match_code "const_vector")
(match_test "riscv_vector::const_vec_all_same_in_range_p (op, 0, 31)")))
 
+(define_constraint "vl"
+  "A uimm5 for Vector or zero for XTheadVector."
+  (and (match_code "const_int")
+   (ior (match_test "!TARGET_XTHEADVECTOR && satisfies_constraint_K (op)")
+   (match_test "TARGET_XTHEADVECTOR && satisfies_constraint_J (op)"))))
+
 (define_constraint "Wc0"
   "@internal
  A constraint that matches a vector of immediate all zeros."
diff --git a/gcc/config/riscv/thead-vector.md b/gcc/config/riscv/thead-vector.md
index 5fe9ba08c4eb..5a02debdd207 100644
--- a/gcc/config/riscv/thead-vector.md
+++ b/gcc/config/riscv/thead-vector.md
@@ -108,7 +108,7 @@ (define_insn_and_split "@pred_th_whole_mov"
   [(set (match_operand:V_VLS_VT 0 "reg_or_mem_operand"  "=vr,vr, m")
(unspec:V_VLS_VT
  [(match_operand:V_VLS_VT 1 "reg_or_mem_operand" " vr, m,vr")
-  (match_operand 2 "vector_length_operand"   " rK, rK, rK")
+  (match_operand 2 "vector_length_operand"   "rvl,rvl,rvl")
   (match_operand 3 "const_1_operand" "  i, i, i")
   (reg:SI VL_REGNUM)
   (reg:SI VTYPE_REGNUM)]
@@ -133,7 +133,7 @@ (define_insn_and_split "@pred_th_whole_mov"
   [(set (match_operand:VB 0 "reg_or_mem_operand"  "=vr,vr, m")
(unspec:VB
  [(match_operand:VB 1 "reg_or_mem_operand" " vr, m,vr")
-  (match_operand 2 "vector_length_operand"   " rK, rK, rK")
+  (match_operand 2 "vector_length_operand"   "rvl,rvl,rvl")
   (match_operand 3 "const_1_operand" "  i, i, 

Re: [PATCH] tree-ssa-dce: Punt on allocations with too large constant sizes [PR118224]

2025-01-19 Thread Dimitar Dimitrov
On Fri, Jan 03, 2025 at 10:46:18PM +0100, Jakub Jelinek wrote:
> Hi!
> 
> As suggested by Richi in the PR, the following patch will fail to DCE
> allocation calls if they have constant size which is too large (over
> PTRDIFF_MAX), or for the case of calloc, if either of the arguments
> is too large (in that case in theory the call could succeed if the other
> argument is variable zero but who cares) or if both are constant and
> their product overflows or is above PTRDIFF_MAX.
> 
> This will make some pedantic conformance tests happy, though if one
> hides the size one will still need to use -fno-malloc-dce or obfuscate even
> the malloc etc. uses.  If the size is constant and too large, it isn't worth
> trying to optimize it.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2025-01-03  Jakub Jelinek  
> 
>   PR tree-optimization/118224
>   * tree-ssa-dce.cc (is_removable_allocation_p): Don't return true
>   for allocations with constant size argument larger than PTRDIFF_MAX
>   or for calloc with one of the arguments constant larger than
>   PTRDIFF_MAX or their product known constant above PTRDIFF_MAX.
>   Fix comment typos, furhter -> further and then -> than.
>   * lto-section-in.cc (lto_free_function_in_decl_state_for_node):
>   Fix comment typo, furhter -> further.
> 
>   * gcc.dg/pr118224.c: New test.
>   * c-c++-common/ubsan/vla-1.c (bar): Use noipa attribute instead
>   of noinline, noclone.
> 
> --- gcc/tree-ssa-dce.cc.jj2025-01-02 11:23:26.890372040 +0100
> +++ gcc/tree-ssa-dce.cc   2025-01-03 20:01:33.074172650 +0100
> @@ -243,38 +243,101 @@ mark_operand_necessary (tree op)
>  
>  /* Return true if STMT is a call to allocation function that can be
> optimized out if the memory block is never used for anything else
> -   then NULL pointer check or free.
> -   If NON_NULL_CHECK is false, we can furhter assume that return value
> -   is never checked to be non-NULL. */
> +   than NULL pointer check or free.
> +   If NON_NULL_CHECK is false, we can further assume that return value
> +   is never checked to be non-NULL.
> +   Don't return true if it is called with constant size (or sizes for calloc)
> +   and the size is excessively large (larger than PTRDIFF_MAX, for calloc
> +   either argument larger than PTRDIFF_MAX or both constant and their product
> +   larger than PTRDIFF_MAX).  */
>  
>  static bool
>  is_removable_allocation_p (gcall *stmt, bool non_null_check)
>  {
> -  tree callee = gimple_call_fndecl (stmt);
> +  int arg = -1;
> +  tree callee = gimple_call_fndecl (stmt), a1, a2;
>if (callee != NULL_TREE
>&& fndecl_built_in_p (callee, BUILT_IN_NORMAL))
>  switch (DECL_FUNCTION_CODE (callee))
>{
>case BUILT_IN_MALLOC:
> + arg = 1;
> + goto do_malloc;
>case BUILT_IN_ALIGNED_ALLOC:
> + arg = 2;
> + goto do_malloc;
>case BUILT_IN_CALLOC:
> + arg = 3;
> + goto do_malloc;
>CASE_BUILT_IN_ALLOCA:
> + arg = 1;
> + goto do_malloc;
>case BUILT_IN_STRDUP:
>case BUILT_IN_STRNDUP:
> - return non_null_check ? flag_malloc_dce > 1 : flag_malloc_dce;
> + arg = 0;
> + /* FALLTHRU */
> +  do_malloc:
> + if (non_null_check)
> +   {
> + if (flag_malloc_dce <= 1)
> +   return false;
> +   }
> + else if (!flag_malloc_dce)
> +   return false;
> + break;
>  
>case BUILT_IN_GOMP_ALLOC:
> - return true;
> + arg = 2;
> + break;
>  
>default:;
>}
>  
> -  if (callee != NULL_TREE
> +  if (arg == -1
> +  && callee != NULL_TREE
>&& flag_allocation_dce
>&& gimple_call_from_new_or_delete (stmt)
>&& DECL_IS_REPLACEABLE_OPERATOR_NEW_P (callee))
> -return true;
> -  return false;
> +arg = 1;
> +
> +  switch (arg)
> +{
> +case -1:
> +  return false;
> +case 0:
> +  return true;
> +case 1:
> +case 2:
> +  if (gimple_call_num_args (stmt) < (unsigned) arg)
> + return false;
> +  a1 = gimple_call_arg (stmt, arg - 1);
> +  if (tree_fits_uhwi_p (a1)
> +   && (tree_to_uhwi (a1)
> +   > tree_to_uhwi (TYPE_MAX_VALUE (ptrdiff_type_node))))
> + return false;
> +  return true;
> +case 3:
> +  if (gimple_call_num_args (stmt) < 2)
> + return false;
> +  a1 = gimple_call_arg (stmt, 0);
> +  a2 = gimple_call_arg (stmt, 1);
> +  if (tree_fits_uhwi_p (a1)
> +   && (tree_to_uhwi (a1)
> +   > tree_to_uhwi (TYPE_MAX_VALUE (ptrdiff_type_node))))
> + return false;
> +  if (tree_fits_uhwi_p (a2)
> +   && (tree_to_uhwi (a2)
> +   > tree_to_uhwi (TYPE_MAX_VALUE (ptrdiff_type_node))))
> + return false;
> +  if (TREE_CODE (a1) == INTEGER_CST
> +   && TREE_CODE (a2) == INTEGER_CST
> +   && (wi::to_widest (a1) + wi::to_widest (a2)

Hi Jakub,

Shouldn't the two "calloc" arguments be multiplied here, instead of

Re: [PATCH] tree-ssa-dce: Punt on allocations with too large constant sizes [PR118224]

2025-01-19 Thread Jakub Jelinek
On Sun, Jan 19, 2025 at 02:27:32PM +0200, Dimitar Dimitrov wrote:
> Shouldn't the two "calloc" arguments be multiplied here, instead of
> added?

Yes, will fix tomorrow.

Jakub
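
For readers following the calloc discussion above, here is a small standalone
sketch of why the two arguments need to be multiplied rather than added. It
does not use GCC's wide-int API; calloc_size_too_large and kPtrdiffMax are
hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Largest object size the check is meant to tolerate.
constexpr std::uint64_t kPtrdiffMax = static_cast<std::uint64_t>(PTRDIFF_MAX);

// Returns true when calloc (nmemb, size) with these constant arguments
// would request more than PTRDIFF_MAX bytes.
constexpr bool calloc_size_too_large(std::uint64_t nmemb, std::uint64_t size) {
  // Either argument alone being too large is already enough.
  if (nmemb > kPtrdiffMax || size > kPtrdiffMax)
    return true;
  // A zero factor gives a zero-sized request.
  if (nmemb == 0 || size == 0)
    return false;
  // nmemb * size > kPtrdiffMax, written as a division so that the
  // multiplication itself cannot overflow.
  return size > kPtrdiffMax / nmemb;
}

// The sum of these two arguments is far below PTRDIFF_MAX, but the product
// exceeds it, so an additive check would miss this case.
static_assert(calloc_size_too_large(kPtrdiffMax / 2 + 1, 2),
              "product above PTRDIFF_MAX");
static_assert(!calloc_size_too_large(kPtrdiffMax, 1),
              "product exactly PTRDIFF_MAX is fine");
```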



Re: [pushed] libstdc++: Delete leftover from Profile Mode removal

2025-01-19 Thread Gerald Pfeifer
On Sun, 29 Dec 2024, Jonathan Wakely wrote:
> On Sun, 29 Dec 2024, 13:55 Gerald Pfeifer wrote:
>> something tells me this is not the full extent of this issue. Something 
>> to dig into when I find more time.
> I think the explanation for this is simple, and not likely to be part of a
> bigger issue. It should be easy to check, just delete all files under
> doc/html/manual and regenerate them. Any files which are not recreated are
> orphaned and should have been removed by previous commits (as this file
> should have been).

Indeed, this is one half of the equation. Is this something you could give 
a try? (My Docbook stack isn't fully in place.)


The other challenge, however, is that even when files disappear from
libstdc++-v3/doc/html/ they will still remain on the web site.

Here is the culprit, from maintainer-scripts/update_web_docs_libstdcxx_git

  # checkout all the HTML files, get down into an interesting directory
  git -C $GITROOT archive master $GETTHIS | tar xf -
  cd $GETTHIS

  # copy the tree to the onlinedocs area, preserve directory structure
  find . -depth -print | cpio -pd $WWWDIR 2>&1 | egrep -v "$FILTER"

This copies existing files over into the overall web site instead of 
properly mirroring it (with rsync or the like). In other words, it's 
incremental, no way to ever remove anything.

Would this work (as we need) if we replace
  find . -depth -print | cpio -pd $WWWDIR 2>&1
by 
  rsync -ahv . $WWWDIR 2>&1
?

Gerald


[PATCH] libstdc++: perfectly forward std::ranges::clamp arguments

2025-01-19 Thread Giuseppe D'Angelo

Hi,

the attached patch fixes PR118185 (mentioned in the earlier thread about 
118160). Tested on x86-64 Linux.


Thanks,
--
Giuseppe D'Angelo
From 9c61058809ac091335a5e73ad8080d8310e9942e Mon Sep 17 00:00:00 2001
From: Giuseppe D'Angelo 
Date: Sun, 19 Jan 2025 16:30:20 +0100
Subject: [PATCH] libstdc++: perfectly forward std::ranges::clamp arguments

As reported in PR118185, std::ranges::clamp does not correctly forward
the projected value to the comparator. Add the missing forward.

libstdc++-v3/ChangeLog:

	PR libstdc++/118185
	PR libstdc++/100249
	* include/bits/ranges_algo.h (__clamp_fn): Correctly forward the
	projected value to the comparator.
	* testsuite/25_algorithms/clamp/118185.cc: New test.

Signed-off-by: Giuseppe D'Angelo 
---
 libstdc++-v3/include/bits/ranges_algo.h   |  4 +-
 .../testsuite/25_algorithms/clamp/118185.cc   | 42 +++
 2 files changed, 44 insertions(+), 2 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/25_algorithms/clamp/118185.cc

diff --git a/libstdc++-v3/include/bits/ranges_algo.h b/libstdc++-v3/include/bits/ranges_algo.h
index e38a330d48c..bb6ab8b6b4b 100644
--- a/libstdc++-v3/include/bits/ranges_algo.h
+++ b/libstdc++-v3/include/bits/ranges_algo.h
@@ -2993,9 +2993,9 @@ namespace ranges
 	 std::__invoke(__proj, __hi),
 	 std::__invoke(__proj, __lo))));
 	auto&& __proj_val = std::__invoke(__proj, __val);
-	if (std::__invoke(__comp, __proj_val, std::__invoke(__proj, __lo)))
+	if (std::__invoke(__comp, std::forward<decltype(__proj_val)>(__proj_val), std::__invoke(__proj, __lo)))
 	  return __lo;
-	else if (std::__invoke(__comp, std::__invoke(__proj, __hi), __proj_val))
+	else if (std::__invoke(__comp, std::__invoke(__proj, __hi), std::forward<decltype(__proj_val)>(__proj_val)))
 	  return __hi;
 	else
 	  return __val;
diff --git a/libstdc++-v3/testsuite/25_algorithms/clamp/118185.cc b/libstdc++-v3/testsuite/25_algorithms/clamp/118185.cc
new file mode 100644
index 000..908eb708c1e
--- /dev/null
+++ b/libstdc++-v3/testsuite/25_algorithms/clamp/118185.cc
@@ -0,0 +1,42 @@
+// { dg-do compile { target c++20 } }
+
+#include 
+#include 
+
+struct Comp
+{
+  constexpr bool operator()(const int&& x, const int&& y) { return x < y; }
+};
+
+struct Proj
+{
+  constexpr const int&& operator()(const int& x) const { return std::move(x); }
+};
+
+static_assert(std::indirect_strict_weak_order>);
+
+static_assert(std::ranges::clamp(+1, 0, 2, Comp{}, Proj{}) == 1);
+static_assert(std::ranges::clamp(-1, 0, 2, Comp{}, Proj{}) == 0);
+static_assert(std::ranges::clamp(10, 0, 2, Comp{}, Proj{}) == 2);
+
+
+// Testcase from PR118185
+
+struct Comp2
+{
+  constexpr bool operator()(const int&& x, const int&& y) const { return x < y; }
+  constexpr bool operator()(const int&& x, int& y) const { return x < y; }
+  constexpr bool operator()(int& x, const int&& y) const { return x < y; }
+  constexpr bool operator()(int& x, int& y) const { return x < y; }
+  constexpr bool operator()(std::same_as auto && x, std::same_as auto && y) const
+  {
+return x < y;
+  }
+};
+
+static_assert(std::indirect_strict_weak_order>);
+
+static_assert(std::ranges::clamp(+1, 0, 2, Comp2{}, Proj{}) == 1);
+static_assert(std::ranges::clamp(-1, 0, 2, Comp2{}, Proj{}) == 0);
+static_assert(std::ranges::clamp(10, 0, 2, Comp2{}, Proj{}) == 2);
+
-- 
2.34.1
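
To show why the forward matters, here is a minimal standalone sketch,
independent of libstdc++ internals. RvalueOnlyLess, MoveProj and less_than_lo
are hypothetical names; the comparator and projection are modelled loosely on
the Comp and Proj types in the test above.

```cpp
#include <utility>

// Comparator that only accepts rvalue references.
struct RvalueOnlyLess {
  constexpr bool operator()(const int&& a, const int&& b) const { return a < b; }
};

// Projection that returns an rvalue reference to its argument.
struct MoveProj {
  constexpr const int&& operator()(const int& x) const { return std::move(x); }
};

constexpr bool less_than_lo(const int& val, const int& lo) {
  MoveProj proj{};
  RvalueOnlyLess comp{};
  // proj_val has type const int&&, but as a named variable it is an lvalue.
  auto&& proj_val = proj(val);
  // comp(proj_val, proj(lo)) would not compile here: an lvalue cannot bind
  // to the comparator's const int&& parameter.
  // Forwarding restores the value category the projection produced.
  return comp(std::forward<decltype(proj_val)>(proj_val), proj(lo));
}

static_assert(less_than_lo(-1, 0), "-1 is below the lower bound");
static_assert(!less_than_lo(1, 0), "1 is not below the lower bound");
```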





[PATCH] testsuite: arm: Use -std=c17 for gcc.target/arm/thumb-bitfld1.c

2025-01-19 Thread Torbjörn SVENSSON
Ok for trunk?

--

Using -std=c17 avoids excess errors like:
.../thumb-bitfld1.c:15:1: warning: old-style function definition [-Wold-style-definition]

gcc/testsuite/ChangeLog:

* gcc.target/arm/thumb-bitfld1.c: Use -std=c17.

Signed-off-by: Torbjörn SVENSSON 
---
 gcc/testsuite/gcc.target/arm/thumb-bitfld1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/thumb-bitfld1.c b/gcc/testsuite/gcc.target/arm/thumb-bitfld1.c
index 37630f1a1f7..3548e097611 100644
--- a/gcc/testsuite/gcc.target/arm/thumb-bitfld1.c
+++ b/gcc/testsuite/gcc.target/arm/thumb-bitfld1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_thumb1_ok } */
-/* { dg-options "-O1 -mthumb" }  */
+/* { dg-options "-O1 -mthumb -std=c17" }  */
 
 struct foo
 {
-- 
2.25.1



[PATCH, RFC] Fortran: do not copy back for parameter actual arguments [PR81978]

2025-01-19 Thread Harald Anlauf

Dear all,

this patch addresses a long-standing difference between gfortran and
other brands: when an array actual argument was passed to a procedure,
and the dummy argument had no intent specified, we would often create
packing and unpacking code.  Only the case of the dummy argument
having intent(in) did avoid the unpacking.

This resulted in the user experience that passing an array having the
PARAMETER attribute, i.e. a named constant that gcc could place in
read-only memory, was generating a segfault at runtime even if the
dummy argument was not modified in the invoked procedure.  It therefore
was often not possible to pass such an argument to a procedure without
explicitly requiring a temporary that might not be needed. (*)

Other brands tested do not crash:

(1) Intel, Nvidia, AMD flang, g95 seem to not put PARAMETER into read-only
memory.  One can write code by lying to the compiler and modify the
array values.

(2) NAG appears to prevent modification of variables with the PARAMETER
attribute.  Code that lies to the compiler seems to have no effect,
but that compiler has a checking option that detects an illegal
assignment in the called procedure.

The proposal is to simply not generate the unpacking / copying-back code
if the actual argument has the PARAMETER attribute.  Non-conforming code
should rather be either detected at compile-time (which we do to a
reasonable extent), or we might add (in the future) new checking code
that detects modification of the dummy similar to case (2) above.
(We do something like this e.g. for do-loop indices passed as actual
arguments).

What do you think of this approach?

BTW: attached patch regtests fine on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

(*) There is a missed-optimization in that we do not simply create
suitable array descriptors when passing to assumed-shape dummies,
which may avoid the packing.

From 387177dbeed5a2c6563d3c2275fee8a4d756d7a5 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Sun, 19 Jan 2025 21:06:56 +0100
Subject: [PATCH] Fortran: do not copy back for parameter actual arguments
 [PR81978]

When an array is packed for passing as an actual argument, and the array
has the PARAMETER attribute (i.e., it is a named constant that can reside
in read-only memory), do not copy back (unpack) from the temporary.

	PR fortran/81978

gcc/fortran/ChangeLog:

	* trans-array.cc (gfc_conv_array_parameter): Do not copy back data
	if actual array parameter has the PARAMETER attribute.
	* trans-expr.cc (gfc_conv_subref_array_arg): Likewise.

gcc/testsuite/ChangeLog:

	* gfortran.dg/pr81978.f90: New test.
---
 gcc/fortran/trans-array.cc|  10 ++-
 gcc/fortran/trans-expr.cc |  11 ++-
 gcc/testsuite/gfortran.dg/pr81978.f90 | 107 ++
 3 files changed, 124 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr81978.f90

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 44b091af2c6..ec627dddffd 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -8925,6 +8925,7 @@ gfc_conv_array_parameter (gfc_se *se, gfc_expr *expr, bool g77,
   bool good_allocatable;
   bool ultimate_ptr_comp;
   bool ultimate_alloc_comp;
+  bool readonly;
   gfc_symbol *sym;
   stmtblock_t block;
   gfc_ref *ref;
@@ -9381,8 +9382,13 @@ gfc_conv_array_parameter (gfc_se *se, gfc_expr *expr, bool g77,
 
   gfc_start_block (&block);
 
-  /* Copy the data back.  */
-  if (fsym == NULL || fsym->attr.intent != INTENT_IN)
+  /* Copy the data back.  If input expr is read-only, e.g. a PARAMETER
+	 array, copying back modified values is undefined behavior.  */
+  readonly = (expr->expr_type == EXPR_VARIABLE
+		  && expr->symtree
+		  && expr->symtree->n.sym->attr.flavor == FL_PARAMETER);
+
+  if ((fsym == NULL || fsym->attr.intent != INTENT_IN) && !readonly)
 	{
 	  if (ctree)
 	{
diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index bef49d32a58..dcf42d53175 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -5200,6 +5200,7 @@ gfc_conv_subref_array_arg (gfc_se *se, gfc_expr * expr, int g77,
   gfc_se work_se;
   gfc_se *parmse;
   bool pass_optional;
+  bool readonly;
 
   pass_optional = fsym && fsym->attr.optional && sym && sym->attr.optional;
 
@@ -5416,8 +5417,14 @@ gfc_conv_subref_array_arg (gfc_se *se, gfc_expr * expr, int g77,
 
   /* Wrap the whole thing up by adding the second loop to the post-block
  and following it by the post-block of the first loop.  In this way,
- if the temporary needs freeing, it is done after use!  */
-  if (intent != INTENT_IN)
+ if the temporary needs freeing, it is done after use!
+ If input expr is read-only, e.g. a PARAMETER array, copying back
+ modified values is undefined behavior.  */
+  readonly = (expr->expr_type == EXPR_VARIABLE
+	  && expr->symtree
+	  && expr->symtree->n.sym->attr.flavor == FL_PARAMETER);

[PATCH] RISC-V: Fix a typo in zce to zcf implication

2025-01-19 Thread Yuriy Kolerov
zce must imply zcf but this rule was corrupted after
refactoring in this commit:

9e12010b5e724277ea44c300630802f464407d8d

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Fix zce to zcf
implication.

Signed-off-by: Yuriy Kolerov 
---
 gcc/common/config/riscv/riscv-common.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 5038f0eb959..b34409adf39 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -213,7 +213,7 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {"zcmp", "zca"},
   {"zcmt", "zca"},
   {"zcmt", "zicsr"},
-  {"zcf", "f",
+  {"zce", "zcf",
[] (const riscv_subset_list *subset_list) -> bool
{
  return subset_list->xlen () == 32 && subset_list->lookup ("f");
-- 
2.34.1
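
The rule restored by the hunk above can be read as "zce implies zcf, but only
on RV32 and only when f is present". A minimal standalone sketch of that
condition follows; SubsetList and zce_implies_zcf are hypothetical
simplifications and not the real riscv_subset_list interface.

```cpp
#include <set>
#include <string>

// Hypothetical, much simplified stand-in for riscv_subset_list.
struct SubsetList {
  int xlen;                    // 32 or 64
  std::set<std::string> exts;  // extensions enabled so far
  bool lookup(const std::string& name) const { return exts.count(name) != 0; }
};

// zce pulls in zcf only when the predicate from the hunk holds:
// xlen is 32 and the f extension is enabled.
inline bool zce_implies_zcf(const SubsetList& s) {
  return s.lookup("zce") && s.xlen == 32 && s.lookup("f");
}
```

With the typo in place, the table entry was keyed on "zcf" implying "f", so
under this reading a plain zce on RV32 with f would not have gained zcf.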



Re: [PATCH] wwwdocs: Clarify DCO name/identity and (anonymous) pseudonym policy

2025-01-19 Thread Mark Wielaard
Hi Gerald,

On Tue, Dec 17, 2024 at 04:40:10PM +0900, Gerald Pfeifer wrote:
> On Mon, 2 Dec 2024, Mark Wielaard wrote:
> > Adjust the DCO text to match the broader community usage and
> > clarifications around the use of real names, known identities and
> > (anonymous) pseudonyms.
> > 
> > These changes clarify what was meant by "real name" and that it is not
> > required to be a "legal name" or any other stronger requirement than a
> > known identity that could be contacted to discuss the contribution as
> > adopted by other communities like the linux kernel, elfutils, cncf and
> > gentoo.
> > 
> > Also explain that the FSF assignment policy might be more appropriate
> > when wanting to contribute using an anonymous pseudonym.
> 
> This looks fine to me personally, though I don't feel I can simply approve 
> this wearing my wwwdocs hat (which is why I haven't responded earlier).
> 
> Jason, you have originally contributed this if I remember correctly; how 
> do we go about such a change?

Please let me know. Updated patch below.

> (One minor bit: The sentence "This will be done for you automatically if 
> you use `git commit -s`" feels a bit off now that there are more details on 
> something else preceding it, i.e., `git commit -s` does not establish
> real names as such. Maybe leave it earlier in the paragraph?)

Yes, good point. I swapped the sentences.

Cheers,

Mark

P.S. There will be a panel discussion organized by the FSF licensing
and compliance manager on copyrights and developer certificates of
origin at Fosdem:
https://fosdem.org/2025/schedule/event/fosdem-2025-5376-managing-copyrights-in-free-software-projects-discussion-panel-with-gnu-maintainers/
I'll try to attend and report back.
From b15a3ee40d159b841119d824234a4f21b85e1dff Mon Sep 17 00:00:00 2001
From: Mark Wielaard 
Date: Mon, 2 Dec 2024 11:16:00 +0100
Subject: [PATCH] wwwdocs: Clarify DCO name/identity and (anonymous) pseudonym
 policy

Adjust the DCO text to match the broader community usage and
clarifications around the use of real names, known identities and
(anonymous) pseudonyms.

These changes clarify what was meant by "real name" and that it is not
required to be a "legal name" or any other stronger requirement than a
known identity that could be contacted to discuss the contribution as
adopted by other communities like the linux kernel, elfutils, cncf and
gentoo.

Also explain that the FSF assignment policy might be more appropriate
when wanting to contribute using an anonymous pseudonym.
---
 htdocs/dco.html | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/htdocs/dco.html b/htdocs/dco.html
index 68fa183b9fc0..5713f003cce3 100644
--- a/htdocs/dco.html
+++ b/htdocs/dco.html
@@ -54,8 +54,21 @@ then you just add a line saying:
 
 Signed-off-by: Random J Developer 

 
-using your real name (sorry, no pseudonyms or anonymous contributions.)  This
-will be done for you automatically if you use `git commit -s`.
+using a known identity (sorry, no anonymous contributions.)  The name
+you use as your identity should not be an anonymous id or false name
+that misrepresents who you are.  This will be done for you
+automatically if you use `git commit -s`.
+
+A known identity can be the committer's real, birth or legal name,
+but can also be an established (online) identity.  It is the name you
+convey to people in the community for them to use to identify you as
+you.  The key concern is that your identification is sufficient enough
+to contact you if an issue were to arise in the future about your
+contribution.  You should not deliberately use a name or email address
+that hides your identity.  When you wish to only contribute under an
+(anonymous) pseudonym, or when you require an explicit employer
+disclaimer, then following the FSF
+assignment process is more appropriate.
 
 Some people also put extra optional tags at the end.  The GCC project does
 not require tags from anyone other than the original author of the patch, but
-- 
2.47.1



Re: [PATCH] RISC-V: Correct the mode that is causing the program to fail for XTheadCondMov

2025-01-19 Thread Xi Ruoyao
On Sun, 2025-01-19 at 20:42 +0800, Jin Ma wrote:

/* snip */

> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c
> new file mode 100644
> index ..33658b863514
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile  { target { rv64 } } } */
> +/* { dg-options "-march=rv64gc_xtheadcondmov -mabi=lp64d -O2" } */
> +
> +__attribute__((noinline, noclone)) long long int

The attributes are useless as nothing is calling the function in the
tests.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH] RISC-V: Fix ICE when prefetching addresses less than 12 bits for zicbop

2025-01-19 Thread Jin Ma
gcc/ChangeLog:

* config/riscv/riscv.md: Change 'r' to 'p'.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/prefetch-zicbop-ice.c: New test.
---
 gcc/config/riscv/riscv.md| 2 +-
 gcc/testsuite/gcc.target/riscv/prefetch-zicbop-ice.c | 9 +
 2 files changed, 10 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/prefetch-zicbop-ice.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index c2a9de610526..86c39628ae8d 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -4370,7 +4370,7 @@ (define_insn "riscv_zero_"
 )
 
 (define_insn "prefetch"
-  [(prefetch (match_operand 0 "address_operand" "r")
+  [(prefetch (match_operand 0 "address_operand" "p")
  (match_operand 1 "imm5_operand" "i")
  (match_operand 2 "const_int_operand" "n"))]
   "TARGET_ZICBOP"
diff --git a/gcc/testsuite/gcc.target/riscv/prefetch-zicbop-ice.c b/gcc/testsuite/gcc.target/riscv/prefetch-zicbop-ice.c
new file mode 100644
index ..31908debb5cc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/prefetch-zicbop-ice.c
@@ -0,0 +1,9 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zicbop -mabi=lp64d" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_zicbop -mabi=ilp32d" { target { rv32 } } } */
+
+void foo ()
+{
+  __builtin_prefetch ((int*)0xff, 1, 0);
+}
-- 
2.25.1



[COMMITTED] Regenerate sparc.opt.urls

2025-01-19 Thread Mark Wielaard
sparc added a -mvis3b option, but the sparc.opt.urls file wasn't
regenerated.

Fixes: d309844d6fe0 ("Fix bootstrap failure on SPARC with -O3 -mcpu=niagara4")

gcc/ChangeLog:

* config/sparc/sparc.opt.urls: Regenerated.
---
 gcc/config/sparc/sparc.opt.urls | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/sparc/sparc.opt.urls b/gcc/config/sparc/sparc.opt.urls
index 2a6ffa258e08..1188f88fdaab 100644
--- a/gcc/config/sparc/sparc.opt.urls
+++ b/gcc/config/sparc/sparc.opt.urls
@@ -36,6 +36,9 @@ UrlSuffix(gcc/SPARC-Options.html#index-mvis2)
 mvis3
 UrlSuffix(gcc/SPARC-Options.html#index-mvis3)
 
+mvis3b
+UrlSuffix(gcc/SPARC-Options.html#index-mvis3b)
+
 mvis4
 UrlSuffix(gcc/SPARC-Options.html#index-mvis4)
 
-- 
2.47.1



RE: [PATCH] COBOL 3/8 gen: GENERIC interface

2025-01-19 Thread Robert Dubner



> -Original Message-
> From: Michael Matz 
> Sent: Wednesday, January 15, 2025 09:50
> To: Robert Dubner 
> Cc: Richard Biener ; jklow...@symas.com; Joseph Myers
> ; gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH] COBOL 3/8 gen: GENERIC interface
> 
> Hello,
> 
> On Mon, 23 Dec 2024, Robert Dubner wrote:
> 
> > > +static tree
> > > +gg_get_larger_type(tree A, tree B)
> > > +  {
> > > +  tree larger = TREE_TYPE(B);
> > > +  if( TYPE_SIZE(TREE_TYPE(A)) > TYPE_SIZE(TREE_TYPE(B)) )
> > > +{
> > > +larger = TREE_TYPE(A);
> > >
> > > that doesn't work - TYPE_SIZE is a pointer to a tree.  You can use
> > > tree_int_cst_compare (TYPE_SIZE(TREE_TYPE(A)),
> > > TYPE_SIZE(TREE_TYPE(B))) == 1 instead.
> >
> > In any case, and given that gg_get_larger_type() is used by me in my
> > code on variables I define, rather than on user code on variables the
> > user defines, I believe it is doing what I need it to do.  For
> > example, when I need to multiply tree A by tree B, I need them to both
> > be of the same type in order to keep the TRUNC_DIV_EXPR from
> > complaining that they aren't the same type.  So, I use
> > gg_get_larger_type(A, B) to return the type of the one with the larger
> > TYPE_SIZE, and then I cast both A and B to that type.
> 
> It may have been lost in the other (very interesting!) explanations about
> Cobol peculiarities, but no, the gg_get_larger_type() as written (and
> cited above) is _not_ doing what you want it to do.  TYPE_SIZE is a
> pointer, hence 'TYPE_SIZE(foo) > TYPE_SIZE(bar)' compares two pointers,
> not two numbers, and so what it returns is not the larger-sized type but
> simply the one whose size description was allocated at higher addresses in
> GCC's memory pool.  That may or may not coincide with those trees
> representing higher numbers.

Well, that's seriously not what I need, there.  

I love your "...may or may not coincide...", which brings us to the
mathematical formulation of Murphy's Law:

"If there is a fifty-fifty chance of something going wrong, nine times out
of ten it will."

So, let's see...

I have rewritten that function as

static tree
gg_get_larger_type(tree A, tree B)
  {
  tree larger = TREE_TYPE(B);
  if(TREE_INT_CST_LOW(TYPE_SIZE(TREE_TYPE(A)))
      > TREE_INT_CST_LOW(TYPE_SIZE(TREE_TYPE(B))) )
    {
    larger = TREE_TYPE(A);
    }
  return larger;
  }

I take heart from this.  The fact that the existing flawed code
nonetheless works in our hundreds of regression tests provides happy
evidence that my upstream efforts to make sure that operations on
variables take place between variables of the same type have (within
Murphy's limits) been successful, since when A and B are the same type,
then TREE_TYPE for both of them returns the same thing, and the comparison
is always between equal values.

But that's a case of being more lucky than good.

Thank you, thank you, thank you!

Bob D.

> 
> 
> Ciao,
> Michael.
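
For readers following along, the difference between comparing node addresses
and comparing the numbers they describe can be sketched in plain C.  This is
only an illustration under stated assumptions: struct size_node, nodes,
larger_by_address and larger_by_value are hypothetical names, not GCC's tree
API, and the two nodes live in one array so that the address comparison is
well defined.

```c
/* Hypothetical stand-in for a size description; not GCC's tree type.  */
struct size_node
{
  unsigned long value;
};

/* Two nodes in one array.  The node at the higher address holds the
   smaller number.  */
struct size_node nodes[2] = { { 64 }, { 32 } };

/* Flawed: compares where the nodes live, not what they hold.  */
int
larger_by_address (const struct size_node *a, const struct size_node *b)
{
  return a > b;
}

/* Intended: compares the numbers the nodes describe.  */
int
larger_by_value (const struct size_node *a, const struct size_node *b)
{
  return a->value > b->value;
}
```

Calling both helpers with &nodes[0] and &nodes[1] gives opposite answers,
which is the "may or may not coincide" situation described above.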


[PATCH] RISC-V: Correct the mode that is causing the program to fail for XTheadCondMov

2025-01-19 Thread Jin Ma
For XTheadCondMov, the bit width of rs2 should always be XLEN-sized, otherwise
the program logic will be wrong.

Reference from
https://github.com/XUANTIE-RV/thead-extension-spec/releases/download/2.3.0/xthead-2023-11-10-2.3.0.pdf

Synopsis
Move if equal zero.

Mnemonic
th.mveqz rd, rs1, rs2

Description
This instruction moves the content of register rs1 into rd if the content of 
rs2 is 0x0.
Otherwise, the value of rd does not change.

Operation
if (reg[rs2] == 0x0)
  reg[rd] := reg[rs1]

gcc/ChangeLog:

* config/riscv/thead.md (*th_cond_mov):
Change GPR2 to X.
(*th_cond_mov): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadcondmov-bug.c: New test.
---
 gcc/config/riscv/thead.md  |  4 ++--
 gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c | 12 
 2 files changed, 14 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c

diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
index 54b9737b4308..d816f3b86dde 100644
--- a/gcc/config/riscv/thead.md
+++ b/gcc/config/riscv/thead.md
@@ -154,11 +154,11 @@ (define_insn "*th_tst3"
 
 ;; XTheadCondMov
 
-(define_insn "*th_cond_mov"
+(define_insn "*th_cond_mov"
   [(set (match_operand:GPR 0 "register_operand" "=r,r")
(if_then_else:GPR
 (match_operator 4 "equality_operator"
-   [(match_operand:GPR2 1 "register_operand" "r,r")
+   [(match_operand:X 1 "register_operand" "r,r")
 (const_int 0)])
 (match_operand:GPR 2 "reg_or_0_operand" "rJ,0")
 (match_operand:GPR 3 "reg_or_0_operand" "0,rJ")))]
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c 
b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c
new file mode 100644
index ..33658b863514
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c
@@ -0,0 +1,12 @@
+/* { dg-do compile  { target { rv64 } } } */
+/* { dg-options "-march=rv64gc_xtheadcondmov -mabi=lp64d -O2" } */
+
+__attribute__((noinline, noclone)) long long int
+foo (long long int x, long long int y)
+{
+  if (((int) x | (int) y) != 0)
+return 6;
+  return x + y;
+}
+
+/* { dg-final { scan-assembler-times {\msext\.w\M} 1 } } */
-- 
2.25.1
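
A rough C model of the problem, for illustration only.  It assumes RV64 and
models th.mveqz exactly as in the Operation text quoted above; model_mveqz,
model_sext_w, foo_without_sext and foo_with_sext are hypothetical helper
names, not GCC or T-Head code.  The point it tries to show is that a 32-bit
condition held in a 64-bit register can have non-zero upper bits, so the
full-width test in the instruction may disagree with the 32-bit comparison
in the source unless the value is sign-extended first.

```c
#include <stdint.h>

/* Model of th.mveqz on RV64: the whole 64-bit rs2 is tested.  */
uint64_t
model_mveqz (uint64_t rd, uint64_t rs1, uint64_t rs2)
{
  return rs2 == 0 ? rs1 : rd;
}

/* Model of sext.w: sign-extend the low 32 bits to 64 bits.  */
uint64_t
model_sext_w (uint64_t v)
{
  uint64_t low = v & UINT64_C (0xffffffff);
  return (low & UINT64_C (0x80000000))
	 ? (low | UINT64_C (0xffffffff00000000)) : low;
}

/* foo from the test case, feeding the raw 64-bit OR to th.mveqz.  */
uint64_t
foo_without_sext (uint64_t x, uint64_t y)
{
  return model_mveqz (6, x + y, x | y);
}

/* foo from the test case, with the condition made XLEN-sized first.  */
uint64_t
foo_with_sext (uint64_t x, uint64_t y)
{
  return model_mveqz (6, x + y, model_sext_w (x | y));
}
```

With x = 0x100000000 and y = 0 the source-level condition
((int) x | (int) y) is zero, so foo should return x + y; only the variant
with the sign extension does so in this model.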



Re: [PATCH] testsuite: Fixes for test case pr117546.c

2025-01-19 Thread Dimitar Dimitrov
On Sat, Jan 18, 2025 at 07:06:16PM +0000, Sam James wrote:
> Dimitar Dimitrov  writes:
> 
> > This test fails on AVR.
> >
> > Debugging the test on x86 host, I noticed that u in function s sometimes
> > has value 16128.  The "t <= 3 * u" expression in the same function
> > results in signed integer overflow for targets with 16-bit int.
> >
> > Fix by requiring int32 effective target.
> 
> Ah, thanks, I should've clocked that, especially because I'd played with
> the values quite a bit.
> 
> >
> > Also add return statement for the main function.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/torture/pr117546.c: Require effective target int32.
> > (main): Add return statement.
> >
> > Ok for trunk?
> >
> > Cc: Sam James 
> > Signed-off-by: Dimitar Dimitrov 
> > ---
> 
> I think it can go in as obvious. Thanks for the fixup.

Pushed as obvious as r15-7031-g34c51485808188.

Thanks,
Dimitar
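
For the record, the arithmetic behind the overflow can be checked with a
small sketch.  It assumes the value 16128 quoted above and does the
multiplication in long (at least 32 bits wide) so the check itself does not
overflow; fits_in_int16 and triple are hypothetical helper names.

```c
#include <stdint.h>

/* Return 1 if V is representable in a 16-bit signed int, else 0.  */
int
fits_in_int16 (long v)
{
  return v >= INT16_MIN && v <= INT16_MAX;
}

/* Compute 3 * u in long, which cannot overflow for 16-bit inputs.  */
long
triple (long u)
{
  return 3 * u;
}
```

triple (16128) is 48384, which is above INT16_MAX (32767), so evaluating
"3 * u" in a 16-bit int overflows for that value.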


Re: [pushed][PR118067][LRA]: Check secondary memory mode for the reg class

2025-01-19 Thread Uros Bizjak
On Fri, Jan 17, 2025 at 10:01 PM Vladimir Makarov  wrote:
>
> This is one more patch to solve
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118067
>
> with different -mcpu used.
>
> The patch was successfully bootstrapped and tested on x86-64, aarch64, and 
> ppc64le.

I have committed a small testsuite adjustment that refines the target
selector and uses the -mtune= option instead of the deprecated -mcpu= to
avoid a compiler warning.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr118067.c (dg-compile): Use target int128.
* gcc.target/i386/pr118067-2.c (dg-compile): Ditto.
(dg-options): Use -mtune= instead of deprecated -mcpu= option.

Thanks,
Uros.
diff --git a/gcc/testsuite/gcc.target/i386/pr118067-2.c 
b/gcc/testsuite/gcc.target/i386/pr118067-2.c
index 831871db0b4..24f6e6f430d 100644
--- a/gcc/testsuite/gcc.target/i386/pr118067-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr118067-2.c
@@ -1,5 +1,5 @@
-/* { dg-do compile { target { ! ia32 } } } */
-/* { dg-options "-O -fno-split-wide-types -mavx512f -mcpu=k8" } */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O -fno-split-wide-types -mavx512f -mtune=k8" } */
 
 typedef unsigned short U __attribute__((__vector_size__(64)));
 typedef int V __attribute__((__vector_size__(64)));
diff --git a/gcc/testsuite/gcc.target/i386/pr118067.c 
b/gcc/testsuite/gcc.target/i386/pr118067.c
index 7a7f072a5d8..ca9f5ddf50e 100644
--- a/gcc/testsuite/gcc.target/i386/pr118067.c
+++ b/gcc/testsuite/gcc.target/i386/pr118067.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-do compile { target int128 } } */
 /* { dg-options "-O -fno-split-wide-types -mavx512f" } */
 
 typedef unsigned short U __attribute__((__vector_size__(64)));


[PATCH] testsuite: Only run test if alarm is available

2025-01-19 Thread Torbjörn SVENSSON
Ok for trunk?

--

Most baremetal toolchains will not have an implementation for alarm and
sigaction as they are target specific.
For arm-none-eabi with newlib, function signatures are exposed, but
there is no implementation and thus the test cases cause an undefined
symbol link error.

gcc/testsuite/ChangeLog:

* gcc.dg/pr78185.c: Remove dg-do and replace with
dg-require-effective-target of signal and alarm.
* gcc.dg/pr116906-1.c: Likewise.
* gcc.dg/pr116906-2.c: Likewise.
* gcc.dg/vect/pr101145inf.c: Use effective-target alarm.
* gcc.dg/vect/pr101145inf_1.c: Likewise.
* lib/target-supports.exp (check_effective_target_alarm): New.

gcc/ChangeLog:

* doc/sourcebuild.texi (Effective-Target Keywords): Document
'alarm'.

Signed-off-by: Torbjörn SVENSSON 
---
 gcc/doc/sourcebuild.texi  |  3 +++
 gcc/testsuite/gcc.dg/pr116906-1.c |  3 ++-
 gcc/testsuite/gcc.dg/pr116906-2.c |  3 ++-
 gcc/testsuite/gcc.dg/pr78185.c|  3 ++-
 gcc/testsuite/gcc.dg/vect/pr101145inf.c   |  1 +
 gcc/testsuite/gcc.dg/vect/pr101145inf_1.c |  1 +
 gcc/testsuite/lib/target-supports.exp | 23 +++
 7 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index b5c1b23e527..98ede70f23c 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2808,6 +2808,9 @@ both scalar and vector modes.
 @subsubsection Environment attributes
 
 @table @code
+@item alarm
+Target supports @code{alarm}.
+
 @item c
 The language for the compiler under test is C.
 
diff --git a/gcc/testsuite/gcc.dg/pr116906-1.c 
b/gcc/testsuite/gcc.dg/pr116906-1.c
index 27b1fdae02b..7187507a60d 100644
--- a/gcc/testsuite/gcc.dg/pr116906-1.c
+++ b/gcc/testsuite/gcc.dg/pr116906-1.c
@@ -1,4 +1,5 @@
-/* { dg-do run { target *-*-linux* *-*-gnu* *-*-uclinux* } } */
+/* { dg-require-effective-target alarm } */
+/* { dg-require-effective-target signal } */
 /* { dg-options "-O2" } */
 
 #include 
diff --git a/gcc/testsuite/gcc.dg/pr116906-2.c 
b/gcc/testsuite/gcc.dg/pr116906-2.c
index 3478771678c..41a352bf837 100644
--- a/gcc/testsuite/gcc.dg/pr116906-2.c
+++ b/gcc/testsuite/gcc.dg/pr116906-2.c
@@ -1,4 +1,5 @@
-/* { dg-do run { target *-*-linux* *-*-gnu* *-*-uclinux* } } */
+/* { dg-require-effective-target alarm } */
+/* { dg-require-effective-target signal } */
 /* { dg-options "-O2 -fno-tree-ch" } */
 
 #include 
diff --git a/gcc/testsuite/gcc.dg/pr78185.c b/gcc/testsuite/gcc.dg/pr78185.c
index d7781b2080f..ada8b1b9f90 100644
--- a/gcc/testsuite/gcc.dg/pr78185.c
+++ b/gcc/testsuite/gcc.dg/pr78185.c
@@ -1,4 +1,5 @@
-/* { dg-do run { target *-*-linux* *-*-gnu* *-*-uclinux* } } */
+/* { dg-require-effective-target alarm } */
+/* { dg-require-effective-target signal } */
 /* { dg-options "-O" } */
 
 #include 
diff --git a/gcc/testsuite/gcc.dg/vect/pr101145inf.c 
b/gcc/testsuite/gcc.dg/vect/pr101145inf.c
index aa598875aa5..70aea94b6e0 100644
--- a/gcc/testsuite/gcc.dg/vect/pr101145inf.c
+++ b/gcc/testsuite/gcc.dg/vect/pr101145inf.c
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target alarm } */
 /* { dg-require-effective-target signal } */
 /* { dg-additional-options "-O3" } */
 #include 
diff --git a/gcc/testsuite/gcc.dg/vect/pr101145inf_1.c 
b/gcc/testsuite/gcc.dg/vect/pr101145inf_1.c
index 0465788c3cc..fe008284e1d 100644
--- a/gcc/testsuite/gcc.dg/vect/pr101145inf_1.c
+++ b/gcc/testsuite/gcc.dg/vect/pr101145inf_1.c
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target alarm } */
 /* { dg-require-effective-target signal } */
 /* { dg-additional-options "-O3" } */
 #include 
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 939ef3a4119..93795a7e27f 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -14255,3 +14255,26 @@ proc add_options_for_nvptx_alloca_ptx { flags } {
 
 return $flags
 }
+
+# Return true if alarm is supported on the target.
+
+proc check_effective_target_alarm { } {
+return [check_no_compiler_messages alarm executable {
+   #include 
+   #include 
+   #include 
+   void do_exit(int i) { exit (0); }
+   int main (void) { 
+ struct sigaction s;
+ sigemptyset (&s.sa_mask);
+ s.sa_handler = exit;
+ s.sa_flags = 0;
+ sigaction (SIGALRM, &s, NULL);
+ alarm (1);
+
+ /* Infinite loop to simulate work...  */
+ while (1);
+ abort ();
+   }
+}]
+}
-- 
2.25.1



Re: [PATCH] testsuite: Only run test if alarm is available

2025-01-19 Thread Andrew Pinski
On Sun, Jan 19, 2025 at 12:17 PM Torbjörn SVENSSON
 wrote:
>
> Ok for trunk?
>
> --
>
> Most baremetal toolchains will not have an implementation for alarm and
> sigaction as they are target specific.
> For arm-none-eabi with newlib, function signatures are exposed, but
> there is no implementation and thus the test cases cause an undefined
> symbol link error.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/pr78185.c: Remove dg-do and replace with
> dg-require-effective-target of signal and alarm.
> * gcc.dg/pr116906-1.c: Likewise.
> * gcc.dg/pr116906-2.c: Likewise.
> * gcc.dg/vect/pr101145inf.c: Use effective-target alarm.
> * gcc.dg/vect/pr101145inf_1.c: Likewise.
> * lib/target-supports.exp (check_effective_target_alarm): New.
>
> gcc/ChangeLog:
>
> * doc/sourcebuild.texi (Effective-Target Keywords): Document
> 'alarm'.
>
> Signed-off-by: Torbjörn SVENSSON 
> ---
>  gcc/doc/sourcebuild.texi  |  3 +++
>  gcc/testsuite/gcc.dg/pr116906-1.c |  3 ++-
>  gcc/testsuite/gcc.dg/pr116906-2.c |  3 ++-
>  gcc/testsuite/gcc.dg/pr78185.c|  3 ++-
>  gcc/testsuite/gcc.dg/vect/pr101145inf.c   |  1 +
>  gcc/testsuite/gcc.dg/vect/pr101145inf_1.c |  1 +
>  gcc/testsuite/lib/target-supports.exp | 23 +++
>  7 files changed, 34 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> index b5c1b23e527..98ede70f23c 100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -2808,6 +2808,9 @@ both scalar and vector modes.
>  @subsubsection Environment attributes
>
>  @table @code
> +@item alarm
> +Target supports @code{alarm}.
> +
>  @item c
>  The language for the compiler under test is C.
>
> diff --git a/gcc/testsuite/gcc.dg/pr116906-1.c 
> b/gcc/testsuite/gcc.dg/pr116906-1.c
> index 27b1fdae02b..7187507a60d 100644
> --- a/gcc/testsuite/gcc.dg/pr116906-1.c
> +++ b/gcc/testsuite/gcc.dg/pr116906-1.c
> @@ -1,4 +1,5 @@
> -/* { dg-do run { target *-*-linux* *-*-gnu* *-*-uclinux* } } */
> +/* { dg-require-effective-target alarm } */
> +/* { dg-require-effective-target signal } */
>  /* { dg-options "-O2" } */
>
>  #include 
> diff --git a/gcc/testsuite/gcc.dg/pr116906-2.c 
> b/gcc/testsuite/gcc.dg/pr116906-2.c
> index 3478771678c..41a352bf837 100644
> --- a/gcc/testsuite/gcc.dg/pr116906-2.c
> +++ b/gcc/testsuite/gcc.dg/pr116906-2.c
> @@ -1,4 +1,5 @@
> -/* { dg-do run { target *-*-linux* *-*-gnu* *-*-uclinux* } } */
> +/* { dg-require-effective-target alarm } */
> +/* { dg-require-effective-target signal } */
>  /* { dg-options "-O2 -fno-tree-ch" } */
>
>  #include 
> diff --git a/gcc/testsuite/gcc.dg/pr78185.c b/gcc/testsuite/gcc.dg/pr78185.c
> index d7781b2080f..ada8b1b9f90 100644
> --- a/gcc/testsuite/gcc.dg/pr78185.c
> +++ b/gcc/testsuite/gcc.dg/pr78185.c
> @@ -1,4 +1,5 @@
> -/* { dg-do run { target *-*-linux* *-*-gnu* *-*-uclinux* } } */
> +/* { dg-require-effective-target alarm } */
> +/* { dg-require-effective-target signal } */
>  /* { dg-options "-O" } */
>
>  #include 
> diff --git a/gcc/testsuite/gcc.dg/vect/pr101145inf.c 
> b/gcc/testsuite/gcc.dg/vect/pr101145inf.c
> index aa598875aa5..70aea94b6e0 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr101145inf.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr101145inf.c
> @@ -1,3 +1,4 @@
> +/* { dg-require-effective-target alarm } */
>  /* { dg-require-effective-target signal } */
>  /* { dg-additional-options "-O3" } */
>  #include 
> diff --git a/gcc/testsuite/gcc.dg/vect/pr101145inf_1.c 
> b/gcc/testsuite/gcc.dg/vect/pr101145inf_1.c
> index 0465788c3cc..fe008284e1d 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr101145inf_1.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr101145inf_1.c
> @@ -1,3 +1,4 @@
> +/* { dg-require-effective-target alarm } */
>  /* { dg-require-effective-target signal } */
>  /* { dg-additional-options "-O3" } */
>  #include 
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index 939ef3a4119..93795a7e27f 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -14255,3 +14255,26 @@ proc add_options_for_nvptx_alloca_ptx { flags } {
>
>  return $flags
>  }
> +
> +# Return true if alarm is supported on the target.
> +
> +proc check_effective_target_alarm { } {

Maybe a small optimization is to return false here if signal is not supported.

That is:
  if ![check_effective_target_signal] {
    return 0
  }

Thanks,
Andrew

> +return [check_no_compiler_messages alarm executable {
> +   #include 
> +   #include 
> +   #include 
> +   void do_exit(int i) { exit (0); }
> +   int main (void) {
> + struct sigaction s;
> + sigemptyset (&s.sa_mask);
> + s.sa_handler = exit;
> + s.sa_flags = 0;
> + sigaction (SIGALRM, &s, NULL);
> + alarm (1);
> +
> + /* Infinite loop to simulate work...  */
> + while (1);
> +  

Re: [PATCH] testsuite: Only run test if alarm is available

2025-01-19 Thread Torbjorn SVENSSON




On 2025-01-19 21:20, Andrew Pinski wrote:

+# Return true if alarm is supported on the target.
+
+proc check_effective_target_alarm { } {


Maybe a small optimization is to return false here if signal is not supported.

That is:
  if ![check_effective_target_signal] {
    return 0
  }


Sure, I'll add that.

Is it okay with this change?
Or should I send a v2 with this?

Kind regards,
Torbjörn





Re: [PATCH] aarch64: Provide initial specifications for Apple CPU cores.

2025-01-19 Thread Iain Sandoe
All:

Thank you all for looking at this - there are a large number of moving parts
and I could easily be making incorrect assumptions.  FWIW the highest
weighting in the inputs I have is given to DDI0487L_a_a-profile and the query
output from the actual desktops.

-

Please note that the Darwin assembler is Apple’s LLVM backend (invoked via
clang -cc1as) and that means that whatever GCC deduces for the feature set and
outputs into the asm in the .arch line has to mean the same to LLVM as it does
to GCC.

For example:
I ran into a problem (that might or might not still exist) where specifying
crc on top of a 8.4 spec dropped the base rev assumed back to 8.x where crc was
introduced (that could be just a bug, but I have to live with it).

So I need to check carefully when adding/subtracting features.
(agreed the two toolchains should do the same thing - but if they don’t then I
need a work-around).

It might be nice, at some point, to have a controlled assembler for GCC - but
at the moment 99.99+% of my downstream are using xcode to provide the
‘binutils’.

=

Kyrill:

>> Some of the content is estimates/best guesses - based on the following
>> public sources of information:
>> * XNU (only for the Apple Implementer ID)
>> * sysctl -a | grep hw on various M1, M2 and machines
>> * AArch64.td from the Apple Open Source repo for LLVM.
>> * What XCode-14 clang passes to cc1.
>> 
> 
> How about the llvm/lib/TargetParser/Host.cpp in upstream LLVM for the part 
> numbers?
> I see it has different values for the M1,M2,M3 ones that you have in your 
> patch.

Looking at a recent version of that I see the host-side values that we use when
doing the native query.  These are obtained from the OS - not the chip directly
(it's a sysctl call).

What I do not see is the manufacturer/chip pairs that you get in /proc/cpuinfo.
(of course I could be blind :) )

>> gcc/ChangeLog:
>> 
>> * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add apple-a12,
>> apple-m1, apple-m2, apple-m3.
>> * config/aarch64/aarch64-tune.md: Regenerate.
> 
> These need entries in the documentation too.

Ack .. I will handle this once we settle things a bit.

=

Andrew:
>> 
>> * Currently, we do not seem to have any way to specify that M2/M3 has support
>>  for FEAT_BTI, but because of missing features is not compliant with the Arm
>>  base rev that implies this.
> 
> Since FEAT_BTI only adds hint instructions, I don't think any part of the
> compiler actually checks for whether the feature is supported.  Whether or not
> to emit FEAT_BTI instructions is controlled by a different compiler option.

I guess the question then is how do we enable it for apple-m2+ and not for the
m1 (or are you saying it does not matter, since the lower revs would just treat
the hint as NOP)?

>> +/* Apple (A12 and M) cores based on Armv8.
>> +   Apple implementer ID from xnu,
>> +   Guesses for part # and suitable scheduler ident, generic_armv8_a for 
>> costs.
>> +   A12 seems mostly 8.3,
>> +   M1 seems to be 8.4 + extras (see comments in option-extensions about 
>> f16fml),
>> +   M2 mostly 8.5 but with missing mandatory features.
>> +   M3 is essentially the same as M2 for the features declared here.  */
>> +AARCH64_CORE("apple-a12", applea12, cortexa53, V8_3A,  (), generic_armv8_a, 
>> 0x61, 0x12, -1)
>> +AARCH64_CORE("apple-m1", applem1, cortexa57, V8_4A,  (F16, SB, SSBS), 
>> generic_armv8_a, 0x61, 0x23, -1)
>> +AARCH64_CORE("apple-m2", applem2, cortexa57, V8_4A,  (I8MM, BF16, F16, SB, 
>> SSBS), generic_armv8_a, 0x61, 0x23, -1)
>> +AARCH64_CORE("apple-m3", applem3, cortexa57, V8_4A,  (I8MM, BF16, F16, SB, 
>> SSBS), generic_armv8_a, 0x61, 0x23, -1)
>> +
> 
> Comparing to LLVM's AArch64Processors.td, this seems to be missing a few 
> things:
> - Crypto extensions (SHA2 and AES, and SHA3 from apple-m1);

I do not see FEAT_SHA2 listed in either the Arm doc, or the output from the 
sysctl.
FEAT_AES: 1
FEAT_SHA3: 1
So I’ve added those to the three entries.

> - New flags I just added (FRINTTS and FLAGM2 from apple-m1);
FEAT_FRINTTS: 1
FEAT_FlagM2: 1
So I've added those.

> - PREDRES (from apple-m1)

I cannot find FEAT_PREDRES …
… however we do have 
FEAT_SPECRES: 0

AFAICT from the Arm doc DDI0487L_a_a-profile this is mandatory for 8.5 and is
the reason I left it at 8.4.

> If that's accurate, then I think you could list apple-m1 as V8_5A (although
> LLVM only specifies V8_4A), and apple-m2 and apple-m3 as V8_6A (same as LLVM).
> The only other difference from the increased architecture version would be to
> enable a few more sysreg names (and our system register gating is an
> inconsistent mess anyway).

I am going to try a bootstrap and test cycle with the changes above (still
based on 8.4 for now) and see how the output looks.

thanks again for looking at this.
Iain




Re: [PATCH v5 1/2] [APX CFCMOV] Support APX CFCMOV in if_convert pass

2025-01-19 Thread Hongyu Wang
Thanks, Richard, for being willing to review this part. It is true that
the try_cmove_arith logic adds quite a lot of special handling for
optimization, so I reduced the logic in emit_mask_load_store to generate
only the simplest load/store, which does not allow the sources to be
swapped.

Hi Jeff, would you help review the ifcvt changes? Thanks in advance!
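
For context, the three source forms listed in the comment on
can_use_mask_load_store in the quoted patch can be written as ordinary C.
This is only a sketch of the source-level patterns, not of what ifcvt emits;
store_if, load_a_if and load_b_if are hypothetical names.  In each function
the pointer is dereferenced only on one path, which is why an unconditional
load or store would not be a safe replacement and a fault-suppressing one is
needed.

```c
#include <stddef.h>

/* "if (test) *x = a; else skip;"  */
void
store_if (int test, int *x, int a)
{
  if (test)
    *x = a;
}

/* "if (test) x = *a; else x = b;"  */
int
load_a_if (int test, const int *a, int b)
{
  int x;
  if (test)
    x = *a;
  else
    x = b;
  return x;
}

/* "if (test) x = a; else x = *b;"  */
int
load_b_if (int test, int a, const int *b)
{
  int x;
  if (test)
    x = a;
  else
    x = *b;
  return x;
}
```

Passing NULL for the pointer is fine whenever the condition selects the
other path, for example load_a_if (0, NULL, 7) returns 7.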

Richard Sandiford wrote on Thu, Jan 16, 2025 at 19:06:

>
> Hongyu Wang  writes:
> > From: Lingling Kong 
> >
> > Hi,
> >
> > Thanks to Richard's review, the v5 patch contains the changes below:
> >
> > 1. Separate the maskload/maskstore emit out from noce_emit_cmove, add
> > a new function emit_mask_load_store in optabs.cc.
> > 2. Follow the operand order of maskload and maskstore optab and takes
> > cond as predicate operand with VOIDmode.
> > 3. Cache may_trap_or_fault_p and correct the logic to ensure only one
> > of cmove source operand can be a may_trap_or_fault memory.
> >
> > Bootstrapped & regtested on x86_64-pc-linux-gnu.
> >
> > OK for trunk?
> >
> > The APX CFCMOV feature implements conditional faulting, which means
> > that all memory faults are suppressed when the condition code
> > evaluates to false during a load or store of a memory operand. This
> > lets a conditional move load or store a memory operand that may trap
> > or fault.
> >
> > The middle end currently does not support a conditional move if it
> > knows that a load from A or B could trap or fault. To enable CFCMOV,
> > we use mask_load and mask_store as a proxy for the backend expander.
> > The predicate of mask_load/mask_store is recognized as a comparison
> > rtx in the initial implementation.
> >
> > A fault-suppressing conditional move for a conditional memory store
> > does not move any arithmetic calculations. For a conditional memory
> > load, only the case with one memory operand that may trap and one
> > operand that neither traps nor is a memory reference is supported.
> >
> > gcc/ChangeLog:
> >
> >   * ifcvt.cc (can_use_mask_load_store): New function to check
> >   whether a conditional faulting load/store can be used.
> >   (noce_try_cmove_arith): Relax the condition for operand
> >   may_trap_or_fault check, expand with mask_load/mask_store optab
> >   when one of the cmove operands may trap or fault.
> >   (noce_process_if_block): Allow trap_or_fault dest for
> >   "if (...) *x = a; else skip" scenario when mask_store optab is
> >   available.
> >   * optabs.h (emit_mask_load_store): New declaration.
> >   * optabs.cc (emit_mask_load_store): New function to emit
> >   conditional move with mask_load/mask_store optab.
>
> Thanks for the update.  This addresses the comments I had about
> the use of the maskload/store optabs in previous versions.
>
> I did make several attempts to review the patch beyond that, but I find
> it very difficult to understand the flow of noce_try_cmove_arith, and
> how all the various special cases fit together.  (Not your fault of
> course.)  So I think someone who knows ifcvt should take it from here.
>
> It would be nice if the internal implementation of emit_mask_load_store
> could share more code with other routines though.
>
> Thanks (and sorry),
> Richard
>
> > ---
> >  gcc/ifcvt.cc  | 110 ++
> >  gcc/optabs.cc | 103 ++
> >  gcc/optabs.h  |   3 ++
> >  3 files changed, 200 insertions(+), 16 deletions(-)
> >
> > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
> > index cb5597bc171..51ac398aee1 100644
> > --- a/gcc/ifcvt.cc
> > +++ b/gcc/ifcvt.cc
> > @@ -778,6 +778,7 @@ static bool noce_try_store_flag_mask (struct 
> > noce_if_info *);
> >  static rtx noce_emit_cmove (struct noce_if_info *, rtx, enum rtx_code, rtx,
> >   rtx, rtx, rtx, rtx = NULL, rtx = NULL);
> >  static bool noce_try_cmove (struct noce_if_info *);
> > +static bool can_use_mask_load_store (struct noce_if_info *);
> >  static bool noce_try_cmove_arith (struct noce_if_info *);
> >  static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn 
> > **);
> >  static bool noce_try_minmax (struct noce_if_info *);
> > @@ -2132,6 +2133,39 @@ noce_emit_bb (rtx last_insn, basic_block bb, bool 
> > simple)
> >return true;
> >  }
> >
> > +/* Return TRUE if backend supports scalar maskload_optab
> > +   or maskstore_optab, who suppresses memory faults when trying to
> > +   load or store a memory operand and the condition code evaluates
> > +   to false.
> > +   Currently the following forms
> > +   "if (test) *x = a; else skip;" --> mask_store
> > +   "if (test) x = *a; else x = b;" --> mask_load
> > +   "if (test) x = a; else x = *b;" --> mask_load
> > +   are supported.  */
> > +
> > +static bool
> > +can_use_mask_load_store (struct noce_if_info *if_info)
> > +{
> > +  rtx b = if_info->b;
> > +  rtx x = if_info->x;
> > +  rtx cond = if_info->cond;
> > +
> > +  if (MEM_P (x))
> > +{
> > +  if (convert_optab_handler (maskstore_optab, GET_MODE (x),
> > +  GET_MODE (cond)) == CODE_F

[PATCH] c++/modules: Check linkage of structured binding decls

2025-01-19 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

When looking at PR c++/118513 I noticed that we don't currently check
the linkage of structured binding declarations in modules.  This patch
adds those checks, and corrects decl_linkage to properly recognise
structured binding declarations as potentially having linkage.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_decomposition_declaration): Check linkage
of structured bindings in modules.
* tree.cc (decl_linkage): Structured bindings don't necessarily
have no linkage.

gcc/testsuite/ChangeLog:

* g++.dg/modules/export-6.C: Add structured binding tests.
* g++.dg/modules/hdr-2.H: Likewise.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/parser.cc| 1 +
 gcc/cp/tree.cc  | 2 ++
 gcc/testsuite/g++.dg/modules/export-6.C | 6 ++
 gcc/testsuite/g++.dg/modules/hdr-2.H| 9 +
 4 files changed, 18 insertions(+)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index ff58a8ec98e..a8ac8af0955 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -16728,6 +16728,7 @@ cp_parser_decomposition_declaration (cp_parser *parser,
  cp_finish_decl (decl, initializer, non_constant_p, NULL_TREE,
  (is_direct_init ? LOOKUP_NORMAL : LOOKUP_IMPLICIT),
  &decomp);
+ check_module_decl_linkage (decl);
}
 }
   else if (decl != error_mark_node)
diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index ed01ca42eaf..36581865a17 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -6003,6 +6003,8 @@ decl_linkage (tree decl)
 {
   if (TREE_CODE (decl) == TYPE_DECL && !TYPE_ANON_P (TREE_TYPE (decl)))
/* This entity has a typedef name for linkage purposes.  */;
+  else if (DECL_DECOMPOSITION_P (decl) && DECL_DECOMP_IS_BASE (decl))
+   /* Namespace-scope structured bindings can have linkage.  */;
   else if (TREE_CODE (decl) == NAMESPACE_DECL && cxx_dialect >= cxx11)
/* An anonymous namespace has internal linkage since C++11.  */
return lk_internal;
diff --git a/gcc/testsuite/g++.dg/modules/export-6.C 
b/gcc/testsuite/g++.dg/modules/export-6.C
index c59944aa688..460cdf08ea9 100644
--- a/gcc/testsuite/g++.dg/modules/export-6.C
+++ b/gcc/testsuite/g++.dg/modules/export-6.C
@@ -3,6 +3,7 @@
 
 export module bad;
 namespace global {}
+struct S { int x; };
 
 export static int x = 123;  // { dg-error "internal linkage" }
 export static void f();  // { dg-error "internal linkage" }
@@ -10,12 +11,17 @@ export static void g() {}  // { dg-error "internal linkage" 
}
 export template  static void t();  // { dg-error "internal 
linkage" }
 export template  static void u() {}  // { dg-error "internal 
linkage" }
 
+#if __cplusplus >= 202002L
+export static auto [d] = S{};  // { dg-error "internal linkage" "" { target 
c++20 } }
+#endif
+
 namespace {
   export int y = 456;  // { dg-error "internal linkage" }
   export void h();  // { dg-error "internal linkage" }
   export void i() {}  // { dg-error "internal linkage" }
   export template  void v(); // { dg-error "internal linkage" }
   export template  void w() {} // { dg-error "internal linkage" }
+  export auto [e] = S{};  // { dg-error "internal linkage" }
 
   export namespace ns {}  // { dg-error "internal linkage" }
   export namespace alias = global;  // { dg-error "internal linkage" }
diff --git a/gcc/testsuite/g++.dg/modules/hdr-2.H 
b/gcc/testsuite/g++.dg/modules/hdr-2.H
index 097546d5667..834c68241b8 100644
--- a/gcc/testsuite/g++.dg/modules/hdr-2.H
+++ b/gcc/testsuite/g++.dg/modules/hdr-2.H
@@ -3,8 +3,11 @@
 // external linkage variables or functions in header units must
 // not have non-inline definitions
 
+struct S { int x; };
+
 int x_err;  // { dg-error "external linkage definition" }
 int y_err = 123;  // { dg-error "external linkage definition" }
+auto [d_err] = S{};  // { dg-error "external linkage definition" }
 void f_err() {}  // { dg-error "external linkage definition" }
 
 struct Err {
@@ -59,6 +62,7 @@ struct Inl {
 // Internal linkage decls are OK
 static int x_internal;
 static int y_internal = 123;
+namespace { auto [d_internal] = S{}; }
 static void f_internal() {}
 
 namespace {
@@ -81,6 +85,11 @@ inline void f_static() {
   thread_local int x_thread_local;
   thread_local int y_thread_local = 123;
 
+#if __cplusplus >= 202002L
+  static auto [d_static] = S{};
+  thread_local auto [d_thread_local] = S{};
+#endif
+
   x_static = y_static;
   x_thread_local = y_thread_local;
 }
-- 
2.47.0
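
For readers without a modules-enabled build, the linkage distinction the new checks rely on can be sketched in ordinary C++17, outside any module. This is an illustration only and not part of the patch; the names g, e and sum_bindings are made up for the example.

```cpp
#include <cassert>

struct S { int x; };

// Namespace-scope structured binding. Like an ordinary namespace-scope
// variable it can have linkage, which is why a header unit would treat it
// as an "external linkage definition".
auto [g] = S{7};

// The same declaration inside an unnamed namespace has internal linkage,
// which is what 'export' has to reject in a named module.
namespace { auto [e] = S{}; }

int sum_bindings() { return g + e; }
```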



Re: [PATCH] match: Change (A + CST0) * CST1 to (A + sign_extend(CST0)) * CST1 [PR116845]

2025-01-19 Thread Richard Biener
On Fri, 17 Jan 2025, Philipp Tomsich wrote:

> Folks,
> 
> we'd appreciate it if someone could take the time to review this fix
> for PR116845.
> 
> Thanks,
> Philipp.
> 
> 
> 
> On Tue, 31 Dec 2024 at 10:03, Konstantinos Eleftheriou
>  wrote:
> >
> > From: kelefth 
> >
> > `(A * B) + (-C) to (B - C/A) * A` fails to match on ILP32 targets due to
> > the upper bits of CST0 being zeros in some cases.
> >
> > This patch adds the following pattern in match.pd:
> > (A + CST0) * CST1 -> (A + CST0') * CST1, where CST1 is a power of 2
> > constant and CST0' is CST0 with the log2(CST1) MS bits sign-extended.
> >
> > This pattern sign-extends the log2(CST1) MS bits of CST0, in case that
> > the bit that follows them is set. These bits will be pushed out by
> > the multiplication with CST1.

How does this not possibly introduce new signed overflow UB?

It also feels odd to set more bits in CST0 as "canonicalization",
IIRC in other context we have code (in fold-const.cc at least) that
reduces the number of set bits with the idea to make constant
generation cheaper.

That said, it looks odd to do this in a match.pd pattern, instead
I'd have expected the places that would benefit to adjust to also
handle this case?
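
As an illustration of the transform under discussion, here is a minimal sketch for a 32-bit type, assuming unsigned wraparound arithmetic. It therefore says nothing about the signed overflow question raised above. The helper names are hypothetical and the code is not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Set the top log2_cst1 bits of cst0 when the bit just below them is set,
// i.e. sign-extend from bit (32 - log2_cst1 - 1).
std::uint32_t extend_top_bits(std::uint32_t cst0, unsigned log2_cst1) {
  if (log2_cst1 == 0 || log2_cst1 >= 32)
    return cst0;
  unsigned rest_bits = 32 - log2_cst1;
  std::uint32_t msb_of_rest = (cst0 >> (rest_bits - 1)) & 1u;
  if (msb_of_rest == 0)
    return cst0;
  std::uint32_t high_mask = ~std::uint32_t{0} << rest_bits;
  return cst0 | high_mask;
}

// (a + cst0) * 2^log2_cst1, computed modulo 2^32.
std::uint32_t scaled(std::uint32_t a, std::uint32_t cst0, unsigned log2_cst1) {
  return (a + cst0) << log2_cst1;
}
```

With CST1 = 4, the constant 0x3FFFFFFF becomes 0xFFFFFFFF, and both give the same product because the two extended bits are shifted out by the multiplication.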

> > Bootstrapped/regtested on x86 and AArch64.
> >
> > PR tree-optimization/116845
> >
> > gcc/ChangeLog:
> >
> > * match.pd: New pattern.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/pr116845.c: New test.
> > ---
> >  gcc/match.pd | 21 -
> >  gcc/testsuite/gcc.target/i386/pr116845.c | 23 +++
> >  2 files changed, 43 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr116845.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 83eca8b2e0a..aa2108726f9 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -4400,7 +4400,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >   /* Calculate @2 / @0 in order to factorize the expression.  */
> >   wide_int div_res = wi::sdiv_trunc (c2, c0);
> >   tree div_cst = wide_int_to_tree (type, div_res); }
> > -   (mult (plus @1 { div_cst; }) @0
> > +   (mult (plus @1 { div_cst; }) @0)))
> > + /* (A + CST0) * CST1 -> (A + CST0') * CST1, where CST1 is a power of 2
> > +constant and CST0' is CST0 with the log2(CST1) MS bits sign-extended.  
> > */
> > + (simplify
> > +   (mult (plus:c @0 INTEGER_CST@1) integer_pow2p@2)
> > + (with { wide_int c1 = wi::to_wide (@1); }
> > + (if (!wi::neg_p (c1))
> > + (with {
> > +   int bits_to_extend = wi::exact_log2 (wi::to_wide (@2));
> > +   uint rest_bits = tree_to_uhwi (TYPE_SIZE (type)) - bits_to_extend;
> > +   /* Get the value of the MSB of the rest_bits.  */
> > +   wide_int bitmask = wi::set_bit_in_zero (rest_bits - 1,
> > +  TYPE_PRECISION (type));
> > +   wide_int bit_value = wi::lrshift (c1 & bitmask, rest_bits - 1); }
> > +   /* If the bit is set, extend c1 with its value.  */
> > +   (if (!wi::cmp (bit_value, 1, TYPE_SIGN (type)))
> > +(with {
> > +  c1 |= wi::mask (rest_bits, true, TYPE_PRECISION (type));
> > +  tree c1_ext = wide_int_to_tree (type, c1); }
> > +(mult (plus @0 { c1_ext; }) @2
> >
> >  #if GIMPLE
> >  /* Canonicalize X + (X << C) into X * (1 + (1 << C)) and
> > diff --git a/gcc/testsuite/gcc.target/i386/pr116845.c 
> > b/gcc/testsuite/gcc.target/i386/pr116845.c
> > new file mode 100644
> > index 000..230e4d5f8b5
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr116845.c
> > @@ -0,0 +1,23 @@
> > +/* PR tree-optimization/116845 */
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-tree-optimized -m32" } */
> > +
> > +int foo(int *a, int j)
> > +{
> > +  int k = j - 1;
> > +  return a[j - 1] == a[k];
> > +}
> > +
> > +int foo2(int *a, int j)
> > +{
> > +  int k = j - 5;
> > +  return a[j - 5] == a[k];
> > +}
> > +
> > +int bar(int *a, int j)
> > +{
> > +  int k = j - 1;
> > +  return (&a[j + 1] - 2) == &a[k];
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "return 1;" 3 "optimized" } } */
> > --
> > 2.47.0
> >
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [COMMITTED] Regenerate sparc.opt.urls

2025-01-19 Thread Eric Botcazou
> sparc added a -mvis3b option, but the sparc.opt.urls file wasn't
> regenerated.
> 
> Fixes: d309844d6fe0 ("Fix bootstrap failure on SPARC with -O3
> -mcpu=niagara4")

Thanks, but how is one supposed to detect this?  Everything worked fine.

-- 
Eric Botcazou




Re: [PATCH v5 1/2] [APX CFCMOV] Support APX CFCMOV in if_convert pass

2025-01-19 Thread Richard Biener
On Mon, 20 Jan 2025, Hongyu Wang wrote:

> Thanks Richard for being willing to review this part.  It is true that
> the try_cmove_arith logic adds quite a lot of special handling for
> optimization, so I reduced the logic in emit_mask_load_store to just
> generate the simplest load/store, which does not allow sources to be
> swapped.
> 
> Hi Jeff, could you help review the ifcvt changes? Thanks
> in advance!

I want to add that it feels a bit risky to do invasive ifcvt.cc changes
at this point.  Is it possible to defer this to stage1?  I am not
expecting a sudden uptick in APX code generation adoption ...

Richard.

> Richard Sandiford  wrote on Thu, 16 Jan 2025 at 19:06:
> 
> >
> > Hongyu Wang  writes:
> > > From: Lingling Kong 
> > >
> > > Hi,
> > >
> > > Thanks to Richard's review, the v5 patch contains the changes below:
> > >
> > > 1. Separate the maskload/maskstore emit out from noce_emit_cmove, add
> > > a new function emit_mask_load_store in optabs.cc.
> > > 2. Follow the operand order of maskload and maskstore optab and take
> > > cond as predicate operand with VOIDmode.
> > > 3. Cache may_trap_or_fault_p and correct the logic to ensure only one
> > > of the cmove source operands can be a may_trap_or_fault memory.
> > >
> > > Bootstrapped & regtested on x86_64-pc-linux-gnu.
> > >
> > > OK for trunk?
> > >
> > > The APX CFCMOV feature implements conditional faulting, which means
> > > that all memory faults are suppressed when the condition code
> > > evaluates to false while loading or storing a memory operand.  We
> > > can therefore now use a conditional move to load or store a memory
> > > operand that may trap or fault.
> > >
> > > In the middle-end, we currently don't support a conditional move if
> > > we know that a load from A or B could trap or fault.  To enable
> > > CFCMOV, we use mask_load and mask_store as a proxy for the backend
> > > expander.  The predicate of mask_load/mask_store is recognized as a
> > > comparison rtx in the initial implementation.
> > >
> > > The fault-suppressing conditional move for a conditional memory store
> > > does not move any arithmetic calculations.  For a conditional memory
> > > load, we currently only support the case where one cmove source is a
> > > memory operand that may trap and the other neither traps nor is a
> > > memory operand.
> > >
> > > gcc/ChangeLog:
> > >
> > >   * ifcvt.cc (can_use_mask_load_store): New function to check
> > >   whether conditional faulting load/store can be used.
> > >   (noce_try_cmove_arith): Relax the condition for operand
> > >   may_trap_or_fault check, expand with mask_load/mask_store optab
> > >   when one of the cmove operands may trap or fault.
> > >   (noce_process_if_block): Allow trap_or_fault dest for the
> > >   "if (...) *x = a; else skip" scenario when mask_store optab is
> > >   available.
> > >   * optabs.h (emit_mask_load_store): New declaration.
> > >   * optabs.cc (emit_mask_load_store): New function to emit
> > >   conditional move with mask_load/mask_store optab.
> >
> > Thanks for the update.  This addresses the comments I had about
> > the use of the maskload/store optabs in previous versions.
> >
> > I did make several attempts to review the patch beyond that, but I find
> > it very difficult to understand the flow of noce_try_cmove_arith, and
> > how all the various special cases fit together.  (Not your fault of
> > course.)  So I think someone who knows ifcvt should take it from here.
> >
> > It would be nice if the internal implementation of emit_mask_load_store
> > could share more code with other routines though.
> >
> > Thanks (and sorry),
> > Richard
> >
> > > ---
> > >  gcc/ifcvt.cc  | 110 ++
> > >  gcc/optabs.cc | 103 ++
> > >  gcc/optabs.h  |   3 ++
> > >  3 files changed, 200 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
> > > index cb5597bc171..51ac398aee1 100644
> > > --- a/gcc/ifcvt.cc
> > > +++ b/gcc/ifcvt.cc
> > > @@ -778,6 +778,7 @@ static bool noce_try_store_flag_mask (struct 
> > > noce_if_info *);
> > >  static rtx noce_emit_cmove (struct noce_if_info *, rtx, enum rtx_code, 
> > > rtx,
> > >   rtx, rtx, rtx, rtx = NULL, rtx = NULL);
> > >  static bool noce_try_cmove (struct noce_if_info *);
> > > +static bool can_use_mask_load_store (struct noce_if_info *);
> > >  static bool noce_try_cmove_arith (struct noce_if_info *);
> > >  static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn 
> > > **);
> > >  static bool noce_try_minmax (struct noce_if_info *);
> > > @@ -2132,6 +2133,39 @@ noce_emit_bb (rtx last_insn, basic_block bb, bool 
> > > simple)
> > >return true;
> > >  }
> > >
> > > +/* Return TRUE if backend supports scalar maskload_optab
> > > +   or maskstore_optab, which suppress memory faults when trying to
> > > +   load or store a memory operand and the condition code evaluates
> > > +   to false.
> > > +   Currently the following forms
> > > +   "if (test) *x = a; else skip;" --> mask_store
> > > +   "if (t

Re: [PATCH] match.pd: Fix indefinite recursion during exp-log transformations [PR118490]

2025-01-19 Thread Richard Biener
On Mon, 20 Jan 2025, Soumya AR wrote:

> This patch fixes the ICE caused when comparing log or exp of a constant with
> another constant.
> 
> The transform is now restricted to cases where the resultant
> log/exp (CST) can be constant folded.

OK.

Richard.

> Signed-off-by: Soumya AR 
> 
> gcc/ChangeLog:
> 
> PR target/118490
> * match.pd: Added ! to verify that log/exp (CST) can be constant folded.
> 
> gcc/testsuite/ChangeLog:
> 
> PR target/118490
> * gcc.dg/pr118490.c: New test.
> ---
> gcc/match.pd | 4 ++--
> gcc/testsuite/gcc.dg/pr | 0
> gcc/testsuite/gcc.dg/pr118490.c | 7 +++
> 3 files changed, 9 insertions(+), 2 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/pr
> create mode 100644 gcc/testsuite/gcc.dg/pr118490.c
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index b6cbb851897..fd1ddf627bf 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -8317,12 +8317,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> /* Simplify logN (x) CMP CST into x CMP expN (CST) */
> (simplify
> (cmp:c (logs:s @0) REAL_CST@1)
> - (cmp @0 (exps @1)))
> + (cmp @0 (exps! @1)))
> 
> /* Simplify expN (x) CMP CST into x CMP logN (CST) */
> (simplify
> (cmp:c (exps:s @0) REAL_CST@1)
> - (cmp @0 (logs @1))
> + (cmp @0 (logs! @1))
> 
> (for logs (LOG LOG2 LOG10 LOG10)
> exps (EXP EXP2 EXP10 POW10)
> diff --git a/gcc/testsuite/gcc.dg/pr b/gcc/testsuite/gcc.dg/pr
> new file mode 100644
> index 000..e69de29bb2d
> diff --git a/gcc/testsuite/gcc.dg/pr118490.c b/gcc/testsuite/gcc.dg/pr118490.c
> new file mode 100644
> index 000..4ae0dacefee
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr118490.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ffast-math -frounding-math -Wlogical-op" } */
> +
> +double exp(double);
> +int foo(int v) {
> + return v && exp(1.) < 2.;
> +}
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


[committed] RISC-V: Add sifive_vector.h

2025-01-19 Thread Kito Cheng
sifive_vector.h is a vendor-specific header that should be included before
using the SiFive vector intrinsics.  For now it just includes riscv_vector.h;
we will separate the implementation by adding a new pragma in the future.

gcc/ChangeLog:

* config.gcc (riscv*): Install sifive_vector.h.
* config/riscv/sifive_vector.h: New.

---

This patch is simple enough, and it fixes a compatibility issue with
clang, so I just went ahead and committed it :)

---
 gcc/config.gcc   |  2 +-
 gcc/config/riscv/sifive_vector.h | 32 
 2 files changed, 33 insertions(+), 1 deletion(-)
 create mode 100644 gcc/config/riscv/sifive_vector.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 87fed823118..371143e4f8d 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -555,7 +555,7 @@ riscv*)
extra_objs="${extra_objs} riscv-vector-builtins.o 
riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o 
sifive-vector-builtins-bases.o"
extra_objs="${extra_objs} thead.o riscv-target-attr.o riscv-zicfilp.o"
d_target_objs="riscv-d.o"
-   extra_headers="riscv_vector.h riscv_crypto.h riscv_bitmanip.h 
riscv_th_vector.h riscv_cmo.h"
+   extra_headers="riscv_vector.h riscv_crypto.h riscv_bitmanip.h 
riscv_th_vector.h riscv_cmo.h sifive_vector.h"
target_gtfiles="$target_gtfiles 
\$(srcdir)/config/riscv/riscv-vector-builtins.cc"
target_gtfiles="$target_gtfiles 
\$(srcdir)/config/riscv/riscv-vector-builtins.h"
;;
diff --git a/gcc/config/riscv/sifive_vector.h b/gcc/config/riscv/sifive_vector.h
new file mode 100644
index 000..02d314e3b4a
--- /dev/null
+++ b/gcc/config/riscv/sifive_vector.h
@@ -0,0 +1,32 @@
+/* SiFive Vector Extension intrinsics include file.
+   Copyright (C) 2025 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#ifndef __SIFIVE_VECTOR_H
+#define __SIFIVE_VECTOR_H
+
+/* TODO: This should have a separate pragma to include only the SiFive
+ vector intrinsics. For now, we are including riscv_vector.h. */
+#include <riscv_vector.h>
+
+#endif // __SIFIVE_VECTOR_H
-- 
2.34.1



[PATCH] match.pd: Fix indefinite recursion during exp-log transformations [PR118490]

2025-01-19 Thread Soumya AR
This patch fixes the ICE caused when comparing log or exp of a constant with
another constant.

The transform is now restricted to cases where the resultant
log/exp (CST) can be constant folded.

Signed-off-by: Soumya AR 

gcc/ChangeLog:

PR target/118490
* match.pd: Added ! to verify that log/exp (CST) can be constant folded.

gcc/testsuite/ChangeLog:

PR target/118490
* gcc.dg/pr118490.c: New test.
---
gcc/match.pd | 4 ++--
gcc/testsuite/gcc.dg/pr | 0
gcc/testsuite/gcc.dg/pr118490.c | 7 +++
3 files changed, 9 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/pr
create mode 100644 gcc/testsuite/gcc.dg/pr118490.c

diff --git a/gcc/match.pd b/gcc/match.pd
index b6cbb851897..fd1ddf627bf 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -8317,12 +8317,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
/* Simplify logN (x) CMP CST into x CMP expN (CST) */
(simplify
(cmp:c (logs:s @0) REAL_CST@1)
- (cmp @0 (exps @1)))
+ (cmp @0 (exps! @1)))

/* Simplify expN (x) CMP CST into x CMP logN (CST) */
(simplify
(cmp:c (exps:s @0) REAL_CST@1)
- (cmp @0 (logs @1))
+ (cmp @0 (logs! @1))

(for logs (LOG LOG2 LOG10 LOG10)
exps (EXP EXP2 EXP10 POW10)
diff --git a/gcc/testsuite/gcc.dg/pr b/gcc/testsuite/gcc.dg/pr
new file mode 100644
index 000..e69de29bb2d
diff --git a/gcc/testsuite/gcc.dg/pr118490.c b/gcc/testsuite/gcc.dg/pr118490.c
new file mode 100644
index 000..4ae0dacefee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr118490.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -frounding-math -Wlogical-op" } */
+
+double exp(double);
+int foo(int v) {
+ return v && exp(1.) < 2.;
+}
-- 
2.43.2
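
The rewrite that these match.pd patterns perform can be sketched numerically. This is a minimal illustration of the algebraic identity only, assuming x > 0 and ignoring rounding; it does not model the `!` operator or constant folding, and the function names are made up.

```cpp
#include <cassert>
#include <cmath>

// Left-hand side of the pattern: logN (x) CMP CST, here with CMP = '<'.
bool less_via_log(double x, double cst) { return std::log(x) < cst; }

// Right-hand side: x CMP expN (CST).  The patch only allows this form when
// expN (CST) itself folds to a constant.
bool less_via_exp(double x, double cst) { return x < std::exp(cst); }
```

For positive x away from the boundary, both functions agree, which is the equivalence the transform depends on.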


Re: [PATCH] RISC-V: Correct the mode that is causing the program to fail for XTheadCondMov

2025-01-19 Thread Jin Ma
> > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c 
> > b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c
> > new file mode 100644
> > index ..33658b863514
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile  { target { rv64 } } } */
> > +/* { dg-options "-march=rv64gc_xtheadcondmov -mabi=lp64d -O2" } */
> > +
> > +__attribute__((noinline, noclone)) long long int
> 
> The attributes are useless as nothing is calling the function in the
> tests.

Yes, my mistake, thanks.

BR
Jin


[PATCH] inline: Purge the abnormal edges as needed in fold_marked_statements [PR118077]

2025-01-19 Thread Andrew Pinski
While fixing PR target/117665, I noticed that fold_marked_statements
would not purge the abnormal edges which could no longer be taken after
folding a call (devirtualization or simplification of a [target] builtin).
Devirtualization could also turn a call that used to need an abnormal edge
into one that no longer needs it, so this was needed for GCC 15.

Bootstrapped and tested on x86_64-linux-gnu

PR tree-optimization/118077
PR tree-optimization/117668

gcc/ChangeLog:

* tree-inline.cc (fold_marked_statements): Purge abnormal edges
as needed.

gcc/testsuite/ChangeLog:

* g++.dg/opt/devirt6.C: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/testsuite/g++.dg/opt/devirt6.C | 23 +++
 gcc/tree-inline.cc | 22 --
 2 files changed, 43 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/opt/devirt6.C

diff --git a/gcc/testsuite/g++.dg/opt/devirt6.C 
b/gcc/testsuite/g++.dg/opt/devirt6.C
new file mode 100644
index 000..5caf9da7891
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/devirt6.C
@@ -0,0 +1,23 @@
+// { dg-do compile { target c++11 } }
+// { dg-options "-O3" }
+
+// PR tree-optimization/118077
+
+// This used to ICE because the devirtualization call
+// of bb inside f1 (which was inlined into f2) became
+// a direct call to c1::bb but the abnormal edge
+// was not removed even though bb was const.
+
+int f() __attribute__((returns_twice));
+struct c1 {
+  virtual int bb(void) const { return 0; }
+  bool f1(int a)
+  {
+return a && !bb();
+  }
+};
+struct c2 final : c1 { void f2(int); };
+void c2::f2(int a) {
+  if (!f1(a))
+f();
+}
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index 11278e5c483..5b3539009a3 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -5430,6 +5430,7 @@ static void
 fold_marked_statements (int first, hash_set *statements)
 {
   auto_bitmap to_purge;
+  auto_bitmap to_purge_abnormal;
 
   auto_vec stack (n_basic_blocks_for_fn (cfun) + 2);
   auto_sbitmap visited (last_basic_block_for_fn (cfun));
@@ -5456,8 +5457,16 @@ fold_marked_statements (int first, hash_set 
*statements)
  continue;
 
gimple *old_stmt = gsi_stmt (gsi);
-   tree old_decl = (is_gimple_call (old_stmt)
-? gimple_call_fndecl (old_stmt) : 0);
+   bool can_make_abnormal_goto = false;
+   tree old_decl = NULL_TREE;
+
+   if (is_gimple_call (old_stmt))
+ {
+   old_decl = gimple_call_fndecl (old_stmt);
+   if (stmt_can_make_abnormal_goto (old_stmt))
+ can_make_abnormal_goto = true;
+ }
+
if (old_decl && fndecl_built_in_p (old_decl))
  {
/* Folding builtins can create multiple instructions,
@@ -5473,6 +5482,8 @@ fold_marked_statements (int first, hash_set 
*statements)
  {
cgraph_update_edges_for_call_stmt (old_stmt,
   old_decl, NULL);
+   if (can_make_abnormal_goto)
+ bitmap_set_bit (to_purge_abnormal, dest->index);
break;
  }
if (gsi_end_p (i2))
@@ -5501,6 +5512,9 @@ fold_marked_statements (int first, hash_set 
*statements)
if (maybe_clean_or_replace_eh_stmt (old_stmt,
new_stmt))
  bitmap_set_bit (to_purge, dest->index);
+   if (can_make_abnormal_goto
+   && !stmt_can_make_abnormal_goto (new_stmt))
+ bitmap_set_bit (to_purge_abnormal, dest->index);
break;
  }
gsi_next (&i2);
@@ -5521,6 +5535,9 @@ fold_marked_statements (int first, hash_set 
*statements)
 
if (maybe_clean_or_replace_eh_stmt (old_stmt, new_stmt))
  bitmap_set_bit (to_purge, dest->index);
+   if (can_make_abnormal_goto
+   && !stmt_can_make_abnormal_goto (new_stmt))
+ bitmap_set_bit (to_purge_abnormal, dest->index);
  }
  }
 
@@ -5542,6 +5559,7 @@ fold_marked_statements (int first, hash_set 
*statements)
 }
 
   gimple_purge_all_dead_eh_edges (to_purge);
+  gimple_purge_all_dead_abnormal_call_edges (to_purge_abnormal);
 }
 
 /* Expand calls to inline functions in the body of FN.  */
-- 
2.43.0
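
For context on where abnormal edges come from, here is a small sketch using the standard setjmp/longjmp pair instead of the testcase's returns_twice function f(). It only illustrates the "returns twice" control flow that the compiler has to model; it does not reproduce the ICE, and run_twice is a made-up name.

```cpp
#include <cassert>
#include <csetjmp>

static std::jmp_buf env;

int run_twice() {
  volatile int passes = 0;   // volatile so the value survives the longjmp
  if (setjmp(env) == 0) {    // first return: the direct call
    passes = 1;
    std::longjmp(env, 1);    // control re-enters at the setjmp call
  }
  return passes + 1;         // reached only after the second return
}
```

A call that might perform such a jump needs an abnormal edge back to the setjmp point. Going by the testcase comment, once a call is folded into a direct call to a const function that cannot jump, the edge is dead and has to be purged.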



Re: [PATCH v2 1/2] LoongArch: Simplify using bstr{ins,pick} instructions for and

2025-01-19 Thread Lulu Cheng

LGTM!

Thanks!

On 2025/1/18 7:33 PM, Xi Ruoyao wrote:

For bstrins, we can merge it into and3 instead of having a
separate define_insn.

For bstrpick, we can use the constraints to ensure the first source
register and the destination register are the same hardware register,
instead of emitting a move manually.

This will simplify the next commit where we'll reassociate bitwise
and left shift for better code generation.

gcc/ChangeLog:

* config/loongarch/constraints.md (Yy): New define_constraint.
* config/loongarch/loongarch.cc (loongarch_print_operand):
For "%M", output the index of bits to be used with
bstrins/bstrpick.
* config/loongarch/predicates.md (ins_zero_bitmask_operand):
Exclude low_bitmask_operand as for low_bitmask_operand it's
always better to use bstrpick instead of bstrins.
(and_operand): New define_predicate.
* config/loongarch/loongarch.md (any_or): New
define_code_iterator.
(bitwise_operand): New define_code_attr.
(*3): New define_insn.
(*and3): New define_insn.
(3): New define_expand.
(and3_extended): Remove, replaced by the 3rd alternative
of *and3.
(bstrins__for_mask): Remove, replaced by the 4th
alternative of *and3.
(*si3_internal): Remove, already covered by
the *3 and *and3 templates.
---
  gcc/config/loongarch/constraints.md |  4 ++
  gcc/config/loongarch/loongarch.cc   | 12 +
  gcc/config/loongarch/loongarch.md   | 77 +++--
  gcc/config/loongarch/predicates.md  |  8 ++-
  4 files changed, 53 insertions(+), 48 deletions(-)

diff --git a/gcc/config/loongarch/constraints.md 
b/gcc/config/loongarch/constraints.md
index 547d9161445..a7c31c2c4e0 100644
--- a/gcc/config/loongarch/constraints.md
+++ b/gcc/config/loongarch/constraints.md
@@ -292,6 +292,10 @@ (define_constraint "Yx"
 "@internal"
 (match_operand 0 "low_bitmask_operand"))
  
+(define_constraint "Yy"

+   "@internal"
+   (match_operand 0 "ins_zero_bitmask_operand"))
+
  (define_constraint "YI"
"@internal
 A replicated vector const in which the replicated value is in the range
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 3a8e1297bd3..1004b65a1ee 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -6142,6 +6142,8 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool 
hi64_part,
 'i'Print i if the operand is not a register.
 'L'  Print the low-part relocation associated with OP.
 'm'Print one less than CONST_INT OP in decimal.
+   'M' Print the indices of the lowest enabled bit and the highest
+   enabled bit in a mask (for bstr* instructions).
 'N'Print the inverse of the integer branch condition for 
comparison OP.
 'Q'  Print R_LARCH_RELAX for TLS IE.
 'r'  Print address 12-31bit relocation associated with OP.
@@ -6268,6 +6270,16 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
output_operand_lossage ("invalid use of '%%%c'", letter);
break;
  
+case 'M':

+  if (CONST_INT_P (op))
+   {
+ HOST_WIDE_INT mask = INTVAL (op);
+ fprintf (file, "%d,%d", floor_log2 (mask), ctz_hwi (mask));
+   }
+  else
+   output_operand_lossage ("invalid use of '%%%c'", letter);
+  break;
+
  case 'N':
loongarch_print_int_branch_condition (file, reverse_condition (code),
letter);
diff --git a/gcc/config/loongarch/loongarch.md 
b/gcc/config/loongarch/loongarch.md
index 1b46e8e4af0..995df1b8875 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -485,7 +485,11 @@ (define_code_iterator any_shift [ashift ashiftrt lshiftrt])
  ;; This code iterator allows the three bitwise instructions to be generated
  ;; from the same template.
  (define_code_iterator any_bitwise [and ior xor])
+(define_code_iterator any_or [ior xor])
  (define_code_iterator neg_bitwise [and ior])
+(define_code_attr bitwise_operand [(and "and_operand")
+  (ior "uns_arith_operand")
+  (xor "uns_arith_operand")])
  
  ;; This code iterator allows unsigned and signed division to be generated

  ;; from the same template.
@@ -1537,23 +1541,37 @@ (define_insn "neg2"
  ;;  
  ;;
  
-(define_insn "3"

-  [(set (match_operand:X 0 "register_operand" "=r,r")
-   (any_bitwise:X (match_operand:X 1 "register_operand" "%r,r")
-  (match_operand:X 2 "uns_arith_operand" "r,K")))]
+(define_insn "*3"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r")
+   (any_or:GPR (match_operand:GPR 1 "register_operand" "%r,r")
+   (match_operand:GPR 2 "uns_arith_operand" "r,K")))]
""
"%i2\t%0,%1,%2"
[(set_attr "type" "logical")
 (set_attr "mode" "")])
  
-(define_insn "*
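
The new '%M' operand modifier prints floor_log2 (mask) and ctz_hwi (mask) separated by a comma. The following standalone sketch computes the same pair with plain loops rather than GCC's internal helpers. It assumes a nonzero mask, and bstr_indices is a hypothetical name, not something from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Return "msb,lsb": the indices of the highest and lowest set bits of MASK,
// in the order the '%M' case prints them.  Returns "" for a zero mask.
std::string bstr_indices(std::uint64_t mask) {
  if (mask == 0)
    return "";
  int lo = 0;
  while (((mask >> lo) & 1u) == 0)
    ++lo;
  int hi = 63;
  while (((mask >> hi) & 1u) == 0)
    --hi;
  return std::to_string(hi) + "," + std::to_string(lo);
}
```

For the mask 0xff0 this yields "11,4", the bit range a bstrpick or bstrins instruction would operate on.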

[PATCH v2] RISC-V: Correct the mode that is causing the program to fail for XTheadCondMov

2025-01-19 Thread Jin Ma
For XTheadCondMov, the bit width of rs2 should always be XLEN-sized, otherwise
the program logic will be wrong.

Reference from
https://github.com/XUANTIE-RV/thead-extension-spec/releases/download/2.3.0/xthead-2023-11-10-2.3.0.pdf

Synopsis
Move if equal zero.

Mnemonic
th.mveqz rd, rs1, rs2

Description
This instruction moves the content of register rs1 into rd if the content of 
rs2 is 0x0.
Otherwise, the value of rd does not change.

Operation
if (reg[rs2] == 0x0)
  reg[rd] := reg[rs1]

gcc/ChangeLog:

* config/riscv/thead.md (*th_cond_mov):
Change GPR2 to X.
(*th_cond_mov): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadcondmov-bug.c: New test.
---
 gcc/config/riscv/thead.md  |  4 ++--
 gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c | 12 
 2 files changed, 14 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c

diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
index 54b9737b4308..d816f3b86dde 100644
--- a/gcc/config/riscv/thead.md
+++ b/gcc/config/riscv/thead.md
@@ -154,11 +154,11 @@ (define_insn "*th_tst3"
 
 ;; XTheadCondMov
 
-(define_insn "*th_cond_mov"
+(define_insn "*th_cond_mov"
   [(set (match_operand:GPR 0 "register_operand" "=r,r")
(if_then_else:GPR
 (match_operator 4 "equality_operator"
-   [(match_operand:GPR2 1 "register_operand" "r,r")
+   [(match_operand:X 1 "register_operand" "r,r")
 (const_int 0)])
 (match_operand:GPR 2 "reg_or_0_operand" "rJ,0")
 (match_operand:GPR 3 "reg_or_0_operand" "0,rJ")))]
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c 
b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c
new file mode 100644
index ..01cec6292919
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-bug.c
@@ -0,0 +1,12 @@
+/* { dg-do compile  { target { rv64 } } } */
+/* { dg-options "-march=rv64gc_xtheadcondmov -mabi=lp64d -O2" } */
+
+long long int
+foo (long long int x, long long int y)
+{
+  if (((int) x | (int) y) != 0)
+return 6;
+  return x + y;
+}
+
+/* { dg-final { scan-assembler-times {\msext\.w\M} 1 } } */
-- 
2.25.1
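
The effect of the mode mismatch can be sketched with a small model of th.mveqz on RV64, following the Operation text quoted from the specification. The function names are made up for illustration. The second helper shows what a 32-bit view of the condition register would decide.

```cpp
#include <cassert>
#include <cstdint>

// Model of "th.mveqz rd, rs1, rs2": the full XLEN-wide rs2 is tested.
std::uint64_t th_mveqz(std::uint64_t rd, std::uint64_t rs1, std::uint64_t rs2) {
  return rs2 == 0 ? rs1 : rd;
}

// What a 32-bit (SImode) comparison of the same register would conclude.
bool low32_is_zero(std::uint64_t rs2) {
  return static_cast<std::uint32_t>(rs2) == 0;
}
```

With rs2 = 0x100000000 the low 32 bits are zero but the instruction does not move, so a pattern that treats the condition as a 32-bit value would disagree with the hardware.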



[PATCH] LoongArch: Correct the mode for mask{eq,ne}z

2025-01-19 Thread Xi Ruoyao
For mask{eq,ne}z, rk is always compared with 0 in the full width, thus
the mode for rk should be X.

I found the issue while reviewing a patch fixing a similar issue for RISC-V
XTheadCondMov [1], but interestingly I cannot find a test case that really
blows up on LoongArch.  But as the issue is obvious enough, let's fix
it anyway so it won't blow up in the future.

[1]: https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674004.html

gcc/ChangeLog:

* config/loongarch/loongarch.md
(*sel_using_): Rename to ...
(*sel_using_): ... here.
(GPR2): Remove as nothing uses it now.
---

Bootstrapped & regtested on loongarch64-linux-gnu.  Ok for trunk?

 gcc/config/loongarch/loongarch.md | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.md 
b/gcc/config/loongarch/loongarch.md
index c17d2928fbf..10197b9d9d5 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -374,10 +374,6 @@ (define_asm_attributes
 ;; from the same template.
 (define_mode_iterator GPR [SI (DI "TARGET_64BIT")])
 
-;; A copy of GPR that can be used when a pattern has two independent
-;; modes.
-(define_mode_iterator GPR2 [SI (DI "TARGET_64BIT")])
-
 ;; This mode iterator allows 16-bit and 32-bit GPR patterns and 32-bit 64-bit
 ;; FPR patterns to be generated from the same template.
 (define_mode_iterator JOIN_MODE [HI
@@ -2507,11 +2503,11 @@ (define_expand "cstore4"
 
 ;; Conditional move instructions.
 
-(define_insn "*sel_using_"
+(define_insn "*sel_using_"
   [(set (match_operand:GPR 0 "register_operand" "=r,r")
(if_then_else:GPR
-(equality_op:GPR2 (match_operand:GPR2 1 "register_operand" "r,r")
-  (const_int 0))
+(equality_op:X (match_operand:X 1 "register_operand" "r,r")
+   (const_int 0))
 (match_operand:GPR 2 "reg_or_0_operand" "r,J")
 (match_operand:GPR 3 "reg_or_0_operand" "J,r")))]
   "register_operand (operands[2], mode)
-- 
2.48.1
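
As a rough model of why rk has to be treated as a full-width value, the sketch below writes out maskeqz and masknez as I understand them from the description above (rk compared with 0 in the full register width) and builds a select from them. The function names and the 64-bit register width are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

// maskeqz rd, rj, rk: rd = 0 if rk == 0, else rj (full-width test of rk).
std::uint64_t maskeqz(std::uint64_t rj, std::uint64_t rk) {
  return rk == 0 ? 0 : rj;
}

// masknez rd, rj, rk: rd = 0 if rk != 0, else rj (full-width test of rk).
std::uint64_t masknez(std::uint64_t rj, std::uint64_t rk) {
  return rk != 0 ? 0 : rj;
}

// cond != 0 ? a : b, built from the two masking operations.
std::uint64_t select_nonzero(std::uint64_t cond, std::uint64_t a,
                             std::uint64_t b) {
  return maskeqz(a, cond) | masknez(b, cond);
}
```

A condition such as 0x100000000 is nonzero in the full width even though its low 32 bits are zero, so the select picks a, which a 32-bit interpretation of the condition would get wrong.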