date:20240405

Re: [PATCH v3] tree-profile: Disable indirect call profiling for IFUNC resolvers

2024-04-05 Thread rep . dot . nop

On 5 April 2024 03:03:05 CEST, "H.J. Lu"  wrote:
>On Thu, Apr 4, 2024 at 5:34 PM  wrote:
>>
>> On 3 April 2024 15:49:13 CEST, "H.J. Lu"  wrote:
>

>> +  /* Skip if it has been visited.  */
>> +  unsigned int uid = e->caller->get_uid ();
>> +  if (bitmap_bit_p (ifunc_ref_map, uid))
>> +   continue;
>> +  bitmap_set_bit (ifunc_ref_map, uid);
>>
>> I think you could have written this as
>> if (!bitmap_set_bit (ifunc_ref_map, uid))
>>   continue;
>>
>
>Feel free to submit a patch.

OK, could be that  
https://inbox.sourceware.org/gcc-patches/20211101220212.3d308d1f@nbbrfq/
was not applied yet, the bitmap_clear_bit is the same.
I'll try to remember these for next stage 1.

cheers

Re: [PATCH]middle-end vect: adjust loop upper bounds when peeling for gaps and early break [PR114403]

2024-04-05 Thread Richard Biener

On Thu, 4 Apr 2024, Tamar Christina wrote:

> Hi All,
> 
> The report shows that we end up in a situation where the code has been peeled
> for gaps and we have an early break.
> 
> The code for peeling for gaps assume that a scalar loop needs to perform at
> least one iteration.  However this doesn't take into account early break where
> the scalar loop may not need to be executed.

But we always re-start the vector iteration where the early break happens?

> That the early break loop can be partial is not accounted for in this 
> scenario.
> loop partiality is normally handled by setting bias_for_lowest to 1, but when
> peeling for gaps we end up with 0, which when the loop upper bounds are
> calculated means that a partial loop iteration loses the final partial iter:
> 
> Analyzing # of iterations of loop 1
>   exit condition [8, + , 18446744073709551615] != 0
>   bounds on difference of bases: -8 ... -8
>   result:
> # of iterations 8, bounded by 8
> 
> and a VF=4 calculating:
> 
> Loop 1 iterates at most 1 times.
> Loop 1 likely iterates at most 1 times.
> Analyzing # of iterations of loop 1
>   exit condition [1, + , 1](no_overflow) < bnd.5505_39
>   bounds on difference of bases: 0 ... 4611686018427387902
> Matching expression match.pd:2011, generic-match-8.cc:27
> Applying pattern match.pd:2067, generic-match-1.cc:4813
>   result:
> # of iterations bnd.5505_39 + 18446744073709551615, bounded by 
> 4611686018427387902
> Estimating sizes for loop 1
> ...
>Induction variable computation will be folded away.
>   size:   2 if (ivtmp_312 < bnd.5505_39)
>Exit condition will be eliminated in last copy.
> size: 24-3, last_iteration: 24-5
>   Loop size: 24
>   Estimated size after unrolling: 26
> ;; Guessed iterations of loop 1 is 0.858446. New upper bound 1.
> 
> upper bound should be 2 not 1.

Why?  This means the vector loop will iterate once (thus the body
executed twice), isn't that correct?  Peeling for gaps means the
main IV will exit the loop in time.

> 
> This patch forced the bias_for_lowest to be 1 even when peeling for gaps.

(*)

> I have however not been able to write a standalone reproducer for this so I 
> have
> no tests but bootstrap and LLVM build fine now.
> 
> The testcase:
> 
> #define COUNT 9
> #define SIZE COUNT * 4
> #define TYPE unsigned long
> 
> TYPE x[SIZE], y[SIZE];
> 
> void __attribute__((noipa))
> loop (TYPE val)
> {
>   for (int i = 0; i < COUNT; ++i)
> {
>   if (x[i * 4] > val || x[i * 4 + 1] > val)
> return;
>   x[i * 4] = y[i * 2] + 1;
>   x[i * 4 + 1] = y[i * 2] + 2;
>   x[i * 4 + 2] = y[i * 2 + 1] + 3;
>   x[i * 4 + 3] = y[i * 2 + 1] + 4;
> }
> }
> 
> does perform the peeling for gaps and early beak, however it creates a hybrid
> loop which works fine. adjusting the indices to non linear also works. So I'd
> like to submit the fix and work on a testcase separately if needed.

You can have peeling for gaps without SLP by doing interleaving.

#define COUNT 9
#define TYPE unsigned long

TYPE x[COUNT], y[COUNT*2];

void __attribute__((noipa))
loop (TYPE val)
{
  for (int i = 0; i < COUNT; ++i)
{ 
  if (x[i] > val)
return;
  x[i] = y[i * 2];
   }
}

gets me partial vectors and peeling for gaps with -O3 -march=armv8.2-a+sve 
-fno-vect-cost-model (with cost modeling we choose ADVSIMD).  Does
this reproduce the issue?

Richard.


> Bootstrapped Regtested on x86_64-pc-linux-gnu no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/114403
>   * tree-vect-loop.cc (vect_transform_loop): Adjust upper bounds for when
>   peeling for gaps and early break.
> 
> ---
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 
> 4375ebdcb493a90fd0501cbb4b07466077b525c3..bf1bb9b005c68fbb13ee1b1279424865b237245a
>  100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -12139,7 +12139,8 @@ vect_transform_loop (loop_vec_info loop_vinfo, gimple 
> *loop_vectorized_call)
>/* The minimum number of iterations performed by the epilogue.  This
>   is 1 when peeling for gaps because we always need a final scalar
>   iteration.  */
> -  int min_epilogue_iters = LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) ? 1 : 0;
> +  int min_epilogue_iters = LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
> +&& !LOOP_VINFO_EARLY_BREAKS (loop_vinfo) ? 1 : 0;

(*) This adjusts min_epilogue_iters though and honestly the whole code
looks like a mess.  I'm quoting a bit more here:

>/* +1 to convert latch counts to loop iteration counts,
>   -min_epilogue_iters to remove iterations that cannot be performed
> by the vector code.  */
  int bias_for_lowest = 1 - min_epilogue_iters;
  int bias_for_assumed = bias_for_lowest;

The variable names and comments now have nothing to do with the
actual magic we compute into them.

I think it would be an improvement to disentangle this a bit like
doing

   /* +1 to convert latch counts t

[committed] c++: Fix ICE with weird copy assignment operator [PR114572]

2024-04-05 Thread Jakub Jelinek

Hi!

While ctors/dtors don't return anything (undeclared void or this pointer
on arm) and copy assignment operators normally return a reference to *this,
it isn't invalid to return uselessly some class object which might need
destructing, but the OpenMP clause handling code wasn't expecting that.

The following patch fixes that.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk so
far.

2024-04-05  Jakub Jelinek  

PR c++/114572
* cp-gimplify.cc (cxx_omp_clause_apply_fn): Call build_cplus_new
on build_call_a result if it has class type.

* testsuite/libgomp.c++/pr114572.C: New test.

--- gcc/cp/cp-gimplify.cc.jj2024-04-02 13:07:58.540385210 +0200
+++ gcc/cp/cp-gimplify.cc   2024-04-04 16:41:50.781008370 +0200
@@ -2480,6 +2480,8 @@ cxx_omp_clause_apply_fn (tree fn, tree a
   TREE_PURPOSE (parm), fn,
   i - is_method, tf_warning_or_error);
   t = build_call_a (fn, i, argarray);
+  if (MAYBE_CLASS_TYPE_P (TREE_TYPE (t)))
+   t = build_cplus_new (TREE_TYPE (t), t, tf_warning_or_error);
   t = fold_convert (void_type_node, t);
   t = fold_build_cleanup_point_expr (TREE_TYPE (t), t);
   append_to_statement_list (t, &ret);
@@ -2513,6 +2515,8 @@ cxx_omp_clause_apply_fn (tree fn, tree a
   TREE_PURPOSE (parm), fn,
   i - is_method, tf_warning_or_error);
   t = build_call_a (fn, i, argarray);
+  if (MAYBE_CLASS_TYPE_P (TREE_TYPE (t)))
+   t = build_cplus_new (TREE_TYPE (t), t, tf_warning_or_error);
   t = fold_convert (void_type_node, t);
   return fold_build_cleanup_point_expr (TREE_TYPE (t), t);
 }
--- libgomp/testsuite/libgomp.c++/pr114572.C.jj 2024-04-04 16:45:27.070034421 
+0200
+++ libgomp/testsuite/libgomp.c++/pr114572.C2024-04-04 16:45:03.529358222 
+0200
@@ -0,0 +1,24 @@
+// PR c++/114572
+// { dg-do run }
+// { dg-options "-fopenmp -O0" }
+
+#include 
+
+struct S
+{
+  S () : s (0) {}
+  ~S () {}
+  S operator= (const S &x) { s = x.s; return *this; }
+  int s;
+};
+
+int
+main ()
+{
+  S s;
+  #pragma omp parallel for lastprivate(s)
+  for (int i = 0; i < 10; ++i)
+s.s = i;
+  if (s.s != 9)
+abort ();
+}

Jakub

[PATCH] c++: Fix up maybe_warn_for_constant_evaluated calls [PR114580]

2024-04-05 Thread Jakub Jelinek

Hi!

When looking at maybe_warn_for_constant_evaluated for the trivial
infinite loops patch, I've noticed that it can emit weird diagnostics
for if constexpr in templates, first warn that std::is_constant_evaluted()
always evaluates to false (because the function template is not constexpr)
and then during instantiation warn that std::is_constant_evaluted()
always evaluates to true (because it is used in if constexpr condition).
Now, only the latter is actually true, even when the if constexpr
is in a non-constexpr function, it will still always evaluate to true.

So, the following patch fixes it to call maybe_warn_for_constant_evaluated
always with IF_STMT_CONSTEXPR_P (if_stmt) as the second argument rather than
true if it is if constexpr with non-dependent condition etc.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-04-05  Jakub Jelinek  

PR c++/114580
* semantics.cc (finish_if_stmt_cond): Call
maybe_warn_for_constant_evaluated with IF_STMT_CONSTEXPR_P (if_stmt)
as the second argument, rather than true/false depending on if
it is if constexpr with non-dependent constant expression with
bool type.

* g++.dg/cpp2a/is-constant-evaluated15.C: New test.

--- gcc/cp/semantics.cc.jj  2024-04-03 09:58:33.407772541 +0200
+++ gcc/cp/semantics.cc 2024-04-04 12:11:36.203886572 +0200
@@ -1126,6 +1126,9 @@ tree
 finish_if_stmt_cond (tree orig_cond, tree if_stmt)
 {
   tree cond = maybe_convert_cond (orig_cond);
+  maybe_warn_for_constant_evaluated (cond,
+/*constexpr_if=*/
+IF_STMT_CONSTEXPR_P (if_stmt));
   if (IF_STMT_CONSTEXPR_P (if_stmt)
   && !type_dependent_expression_p (cond)
   && require_constant_expression (cond)
@@ -1134,16 +1137,11 @@ finish_if_stmt_cond (tree orig_cond, tre
 converted to bool.  */
   && TYPE_MAIN_VARIANT (TREE_TYPE (cond)) == boolean_type_node)
 {
-  maybe_warn_for_constant_evaluated (cond, /*constexpr_if=*/true);
   cond = instantiate_non_dependent_expr (cond);
   cond = cxx_constant_value (cond);
 }
-  else
-{
-  maybe_warn_for_constant_evaluated (cond, /*constexpr_if=*/false);
-  if (processing_template_decl)
-   cond = orig_cond;
-}
+  else if (processing_template_decl)
+cond = orig_cond;
   finish_cond (&IF_COND (if_stmt), cond);
   add_stmt (if_stmt);
   THEN_CLAUSE (if_stmt) = push_stmt_list ();
--- gcc/testsuite/g++.dg/cpp2a/is-constant-evaluated15.C.jj 2024-04-04 
12:23:36.706962932 +0200
+++ gcc/testsuite/g++.dg/cpp2a/is-constant-evaluated15.C2024-04-04 
12:22:29.915882859 +0200
@@ -0,0 +1,28 @@
+// PR c++/114580
+// { dg-do compile { target c++17 } }
+// { dg-options "-Wtautological-compare" }
+
+namespace std {
+  constexpr inline bool
+  is_constant_evaluated () noexcept
+  {
+#if __cpp_if_consteval >= 202106L
+if consteval { return true; } else { return false; }
+#else
+return __builtin_is_constant_evaluated ();
+#endif
+  }
+}
+
+template 
+void foo ()
+{
+  if constexpr ((T) std::is_constant_evaluated ()) // { dg-warning 
"'std::is_constant_evaluated' always evaluates to true in 'if constexpr'" }
+;  // { dg-bogus 
"'std::is_constant_evaluated' always evaluates to false in a non-'constexpr' 
function" }
+}
+
+void
+bar ()
+{
+  foo  ();
+}

Jakub

[PATCH] c++, v2: Implement C++26 P2809R3 - Trivial infinite loops are not Undefined Behavior

2024-04-05 Thread Jakub Jelinek

Hi!

Here is a new version of the PR114462 P2809R3 patch.
As clarified on core, for trivially empty iteration statements
(where the condition isn't empty or INTEGER_CST, because those
loops can't contain std::is_constant_evaluated() in the condition
and the middle-end handles them right even with -ffinite-loops)
it attempts to fold_non_dependent_expr with manifestly_const_eval=true
the expression and if that returns integer_nonzerop, treats it
as trivial infinite loops, but keeps the condition as is.
Instead for -ffinite-loops it adds ANNOTATE_EXPR that the loop is
maybe infinite to override -ffinite-loops behavior for the particular loop.

And, it also emits maybe_warn_for_constant_evaluated warnings for loop
conditions.  For non-trivially empty loops or in immediate functions
or if !processing_template_decl and the condition of trivially empty loops
is not a constant expression or doesn't evaluate to integer non-zero
warns like for condition of non-constexpr if, and in non-constexpr functions
when the condition constant evaluates to true warns that the code has
weird behavior, that it evaluates to true when checking if the loop
is trivially infinite and to false at runtime.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-04-05  Jakub Jelinek  

PR c++/114462
gcc/
* tree-core.h (enum annot_expr_kind): Add
annot_expr_maybe_infinite_kind enumerator.
* gimplify.cc (gimple_boolify): Handle annot_expr_maybe_infinite_kind.
* tree-cfg.cc (replace_loop_annotate_in_block): Likewise.
(replace_loop_annotate): Likewise.  Move loop->finite_p initialization
before the replace_loop_annotate_in_block calls.
* tree-pretty-print.cc (dump_generic_node): Handle
annot_expr_maybe_infinite_kind.
gcc/cp/
* semantics.cc: Implement C++26 P2809R3 - Trivial infinite
loops are not Undefined Behavior.
(maybe_warn_for_constant_evaluated): Add trivial_infinite argument
and emit special diagnostics for that case.
(finish_if_stmt_cond): Adjust caller.
(finish_loop_cond): New function.
(finish_while_stmt): Use it.
(finish_do_stmt): Likewise.
(finish_for_stmt): Likewise.
gcc/testsuite/
* g++.dg/cpp26/trivial-infinite-loop1.C: New test.
* g++.dg/cpp26/trivial-infinite-loop2.C: New test.
* g++.dg/cpp26/trivial-infinite-loop3.C: New test.

--- gcc/tree-core.h.jj  2024-03-15 09:13:35.333541416 +0100
+++ gcc/tree-core.h 2024-04-04 12:39:42.707652592 +0200
@@ -983,6 +983,7 @@ enum annot_expr_kind {
   annot_expr_no_vector_kind,
   annot_expr_vector_kind,
   annot_expr_parallel_kind,
+  annot_expr_maybe_infinite_kind,
   annot_expr_kind_last
 };
 
--- gcc/gimplify.cc.jj  2024-03-11 12:36:18.494005912 +0100
+++ gcc/gimplify.cc 2024-04-04 12:40:44.619799465 +0200
@@ -4516,6 +4516,7 @@ gimple_boolify (tree expr)
case annot_expr_no_vector_kind:
case annot_expr_vector_kind:
case annot_expr_parallel_kind:
+   case annot_expr_maybe_infinite_kind:
  TREE_OPERAND (expr, 0) = gimple_boolify (TREE_OPERAND (expr, 0));
  if (TREE_CODE (type) != BOOLEAN_TYPE)
TREE_TYPE (expr) = boolean_type_node;
--- gcc/tree-cfg.cc.jj  2024-02-15 09:51:34.608063945 +0100
+++ gcc/tree-cfg.cc 2024-04-04 12:42:38.271233376 +0200
@@ -297,6 +297,9 @@ replace_loop_annotate_in_block (basic_bl
  loop->can_be_parallel = true;
  loop->safelen = INT_MAX;
  break;
+   case annot_expr_maybe_infinite_kind:
+ loop->finite_p = false;
+ break;
default:
  gcc_unreachable ();
}
@@ -320,12 +323,12 @@ replace_loop_annotate (void)
 
   for (auto loop : loops_list (cfun, 0))
 {
+  /* Push the global flag_finite_loops state down to individual loops.  */
+  loop->finite_p = flag_finite_loops;
+
   /* Check all exit source blocks for annotations.  */
   for (auto e : get_loop_exit_edges (loop))
replace_loop_annotate_in_block (e->src, loop);
-
-  /* Push the global flag_finite_loops state down to individual loops.  */
-  loop->finite_p = flag_finite_loops;
 }
 
   /* Remove IFN_ANNOTATE.  Safeguard for the case loop->latch == NULL.  */
@@ -347,6 +350,7 @@ replace_loop_annotate (void)
case annot_expr_no_vector_kind:
case annot_expr_vector_kind:
case annot_expr_parallel_kind:
+   case annot_expr_maybe_infinite_kind:
  break;
default:
  gcc_unreachable ();
--- gcc/tree-pretty-print.cc.jj 2024-03-14 14:07:34.303423928 +0100
+++ gcc/tree-pretty-print.cc2024-04-04 12:39:35.369753709 +0200
@@ -3479,6 +3479,9 @@ dump_generic_node (pretty_printer *pp, t
case annot_expr_parallel_kind:
  pp_string (pp, ", parallel");
  break;
+   case annot_expr_maybe_infinite_kind:
+ pp_string (pp, ", maybe-infinite");
+ break;
def

[PATCH] middle-end/114599 - fix bitmap allocation for check_ifunc_callee_symtab_nodes

2024-04-05 Thread Richard Biener

There's no default bitmap obstack during global CTORs, so allocate the
bitmap locally.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

PR middle-end/114599
* symtab.cc (ifunc_ref_map): Do not use auto_bitmap.
(is_caller_ifunc_resolver): Optimize bitmap_bit_p/bitmap_set_bit
pair.
(symtab_node::check_ifunc_callee_symtab_nodes): Properly
allocate ifunc_ref_map here.
---
 gcc/symtab.cc | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/gcc/symtab.cc b/gcc/symtab.cc
index 3256133891d..3b018ab3ea2 100644
--- a/gcc/symtab.cc
+++ b/gcc/symtab.cc
@@ -1383,7 +1383,7 @@ check_ifunc_resolver (cgraph_node *node, void *data)
   return false;
 }
 
-static auto_bitmap ifunc_ref_map;
+static bitmap ifunc_ref_map;
 
 /* Return true if any caller of NODE is an ifunc resolver.  */
 
@@ -1404,9 +1404,8 @@ is_caller_ifunc_resolver (cgraph_node *node)
 
   /* Skip if it has been visited.  */
   unsigned int uid = e->caller->get_uid ();
-  if (bitmap_bit_p (ifunc_ref_map, uid))
+  if (!bitmap_set_bit (ifunc_ref_map, uid))
continue;
-  bitmap_set_bit (ifunc_ref_map, uid);
 
   if (is_caller_ifunc_resolver (e->caller))
{
@@ -1437,6 +1436,9 @@ symtab_node::check_ifunc_callee_symtab_nodes (void)
 {
   symtab_node *node;
 
+  bitmap_obstack_initialize (NULL);
+  ifunc_ref_map = BITMAP_ALLOC (NULL);
+
   FOR_EACH_SYMBOL (node)
 {
   cgraph_node *cnode = dyn_cast  (node);
@@ -1455,7 +1457,8 @@ symtab_node::check_ifunc_callee_symtab_nodes (void)
cnode->called_by_ifunc_resolver = true;
 }
 
-  bitmap_clear (ifunc_ref_map);
+  BITMAP_FREE (ifunc_ref_map);
+  bitmap_obstack_release (NULL);
 }
 
 /* Verify symbol table for internal consistency.  */
-- 
2.35.3

[PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-04-05 Thread Ajit Agarwal

Hello Alex/Richard:

All review comments are incorporated.

Common infrastructure of load store pair fusion is divided into target
independent and target dependent changed code.

Target independent code is the Generic code with pure virtual function
to interface betwwen target independent and dependent code.

Target dependent code is the implementation of pure virtual function for
aarch64 target and the call to target independent code.

Thanks & Regards
Ajit


aarch64: Place target independent and dependent changed code in one file

Common infrastructure of load store pair fusion is divided into target
independent and target dependent changed code.

Target independent code is the Generic code with pure virtual function
to interface betwwen target independent and dependent code.

Target dependent code is the implementation of pure virtual function for
aarch64 target and the call to target independent code.

2024-04-06  Ajit Kumar Agarwal  

gcc/ChangeLog:

* config/aarch64/aarch64-ldp-fusion.cc: Place target
independent and dependent changed code.
---
 gcc/config/aarch64/aarch64-ldp-fusion.cc | 371 +++
 1 file changed, 249 insertions(+), 122 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc 
b/gcc/config/aarch64/aarch64-ldp-fusion.cc
index 22ed95eb743..cb21b514ef7 100644
--- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
+++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
@@ -138,8 +138,122 @@ struct alt_base
   poly_int64 offset;
 };
 
+// Virtual base class for load/store walkers used in alias analysis.
+struct alias_walker
+{
+  virtual bool conflict_p (int &budget) const = 0;
+  virtual insn_info *insn () const = 0;
+  virtual bool valid () const  = 0;
+  virtual void advance () = 0;
+};
+
+struct pair_fusion {
+
+  pair_fusion () {};
+  virtual bool fpsimd_op_p (rtx reg_op, machine_mode mem_mode,
+  bool load_p) = 0;
+
+  virtual bool pair_operand_mode_ok_p (machine_mode mode) = 0;
+  virtual bool pair_trailing_writeback_p () = 0;
+  virtual bool pair_reg_operand_ok_p (bool load_p, rtx reg_op,
+ machine_mode mem_mode) = 0;
+  virtual int pair_mem_alias_check_limit () = 0;
+  virtual bool handle_writeback_opportunities () = 0 ;
+  virtual bool pair_mem_ok_with_policy (rtx first_mem, bool load_p,
+   machine_mode mode) = 0;
+  virtual rtx gen_mem_pair (rtx *pats,  rtx writeback,
+   bool load_p) = 0;
+  virtual bool pair_mem_promote_writeback_p (rtx pat) = 0;
+  virtual bool track_load_p () = 0;
+  virtual bool track_store_p () = 0;
+  virtual bool cand_insns_empty_p (std::list &insns) = 0;
+  virtual bool pair_mem_in_range_p (HOST_WIDE_INT off) = 0;
+  void ldp_fusion_bb (bb_info *bb);
+
+  ~pair_fusion() { }
+};
+
+struct aarch64_pair_fusion : public pair_fusion
+{
+public:
+  aarch64_pair_fusion () : pair_fusion () {};
+  bool fpsimd_op_p (rtx reg_op, machine_mode mem_mode,
+   bool load_p) override final
+  {
+const bool fpsimd_op_p
+  = reload_completed
+  ? (REG_P (reg_op) && FP_REGNUM_P (REGNO (reg_op)))
+  : (GET_MODE_CLASS (mem_mode) != MODE_INT
+&& (load_p || !aarch64_const_zero_rtx_p (reg_op)));
+return fpsimd_op_p;
+  }
+
+  bool pair_mem_promote_writeback_p (rtx pat)
+  {
+ if (reload_completed
+&& aarch64_ldp_writeback > 1
+&& GET_CODE (pat) == PARALLEL
+&& XVECLEN (pat, 0) == 2)
+   return true;
+
+ return false;
+  }
+
+  bool pair_mem_ok_with_policy (rtx first_mem, bool load_p,
+   machine_mode mode)
+  {
+return aarch64_mem_ok_with_ldpstp_policy_model (first_mem,
+load_p,
+mode);
+  }
+  bool pair_operand_mode_ok_p (machine_mode mode);
+
+  rtx gen_mem_pair (rtx *pats, rtx writeback, bool load_p);
+
+  bool pair_trailing_writeback_p  ()
+  {
+return aarch64_ldp_writeback > 1;
+  }
+  bool pair_reg_operand_ok_p (bool load_p, rtx reg_op,
+ machine_mode mem_mode)
+  {
+return (load_p
+? aarch64_ldp_reg_operand (reg_op, mem_mode)
+: aarch64_stp_reg_operand (reg_op, mem_mode));
+  }
+  int pair_mem_alias_check_limit ()
+  {
+return aarch64_ldp_alias_check_limit;
+  }
+  bool handle_writeback_opportunities ()
+  {
+return aarch64_ldp_writeback;
+  }
+  bool track_load_p ()
+  {
+const bool track_loads
+  = aarch64_tune_params.ldp_policy_model != AARCH64_LDP_STP_POLICY_NEVER;
+return track_loads;
+  }
+  bool track_store_p ()
+  {
+const bool track_stores
+  = aarch64_tune_params.stp_policy_model != AARCH64_LDP_STP_POLICY_NEVER;
+return track_stores;
+  }
+  bool cand_insns_empty_p (std::list &insns)
+  {
+return insns.empty();
+  }
+  bool pair_mem_in_range_p (HOST_WIDE_INT off)
+  {
+return (off < LDP_MIN_IMM || off

[PATCH] testsuite: Fix up error on gcov1.d

2024-04-05 Thread Jakub Jelinek

On Fri, Feb 23, 2024 at 12:18:00PM +0100, Jørgen Kvalsvik wrote:
> This is a mostly straight port from the gcov-19.c tests from the C test
> suite. The only notable differences from C to D are that D flips the
> true/false outcomes for loop headers, and the D front end ties loop and
> ternary conditions to slightly different locus.
> 
> The test for >64 conditions warning is disabled as it either needs
> support from the testing framework or a something similar to #pragma GCC
> diagnostic push to not cause a test failure from detecting a warning.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gdc.dg/gcov.exp: New test.
>   * gdc.dg/gcov1.d: New test.

Unfortunately, this doesn't work.
I see
PASS: gdc.dg/gcov1.d   execution test
ERROR: (DejaGnu) proc "run-gcov conditions { --conditions gcov1.d }" does not 
exist.
The error code is TCL LOOKUP COMMAND run-gcov
The info on the error is:
invalid command name "run-gcov"
while executing
"::tcl_unknown run-gcov conditions { --conditions gcov1.d }"
("uplevel" body line 1)
invoked from within
"uplevel 1 ::tcl_unknown $args"
ERROR: gdc.dg/gcov1.d  : error executing dg-final: invalid command name 
"run-gcov"
both on x86_64-linux and i686-linux.
The problem is that the test hasn't been added to a new directory, but
to a directory already covered by a different *.exp file - dg.exp.
Now, usually either one has a test directory like gcc.misc-tests where
there are many *.exp files but each *.exp file globs for its own tests,
or there is one *.exp per directory and covers everything in there.
By having both dg.exp and gcov.exp in the same directory with dg.exp
covering all *.d files in there and gcov gcov*.d in there, the gcov*.d
tests are tested twice, once using the dg.exp driver and once using gcov.exp
driver.  With the latter, they do work properly, with the former they don't
because gcov.exp lib file isn't loaded and so run-gcov isn't available.

The following patch fixes that similarly how g++.dg/modules/modules.exp,
gcc.target/s390/s390.exp or gcc.target/i386/i386.exp deal with that,
by pruning some tests based on glob patterns from the list.

Tested on x86_64-linux with make -j32 check-d, ok for trunk?

2024-04-05  Jakub Jelinek  

* gdc.dg/dg.exp: Prune gcov*.d from the list of tests to run.
* gdc.dg/gcov.exp: Update copyright years.

--- gcc/testsuite/gdc.dg/dg.exp.jj  2024-01-03 22:33:38.249693029 +0100
+++ gcc/testsuite/gdc.dg/dg.exp 2024-04-05 10:20:13.518823037 +0200
@@ -30,7 +30,8 @@ dg-init

 # Main loop.
 gdc-dg-runtest [lsort \
-   [glob -nocomplain $srcdir/$subdir/*.d ] ] "" $DEFAULT_DFLAGS
+   [prune [glob -nocomplain $srcdir/$subdir/*.d ] \
+ $srcdir/$subdir/gcov*.d ] ] "" $DEFAULT_DFLAGS

 # All done.
 dg-finish
--- gcc/testsuite/gdc.dg/gcov.exp.jj2024-04-04 21:45:56.025155257 +0200
+++ gcc/testsuite/gdc.dg/gcov.exp   2024-04-05 10:20:23.678682559 +0200
@@ -1,4 +1,4 @@
-#   Copyright (C) 1997-2023 Free Software Foundation, Inc.
+#   Copyright (C) 1997-2024 Free Software Foundation, Inc.

 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by

Jakub

Re: [PATCH V3 0/2] aarch64: Place target independent and dependent changed code in one file.

2024-04-05 Thread Ajit Agarwal

Hello Alex:

On 03/04/24 8:51 pm, Alex Coplan wrote:
> On 23/02/2024 16:41, Ajit Agarwal wrote:
>> Hello Richard/Alex/Segher:
> 
> Hi Ajit,
> 
> Sorry for the delay and thanks for working on this.
> 
> Generally this looks like the right sort of approach (IMO) but I've left
> some comments below.
> 
> I'll start with a meta comment: in the subject line you have marked this
> as 0/2, but usually 0/n is reserved for the cover letter of a patch
> series and wouldn't contain an actual patch.  I think this might have
> confused the Linaro CI suitably such that it didn't run regression tests
> on the patch.
> 
Sorry for that. I have changed that in latest patch.
>>
>> This patch adds the changed code for target independent and
>> dependent code for load store fusion.
>>
>> Common infrastructure of load store pair fusion is
>> divided into target independent and target dependent
>> changed code.
>>
>> Target independent code is the Generic code with
>> pure virtual function to interface betwwen target
>> independent and dependent code.
>>
>> Target dependent code is the implementation of pure
>> virtual function for aarch64 target and the call
>> to target independent code.
>>
>> Bootstrapped for aarch64-linux-gnu.
>>
>> Thanks & Regards
>> Ajit
>>
>> aarch64: Place target independent and dependent changed code in one file.
>>
>> Common infrastructure of load store pair fusion is
>> divided into target independent and target dependent
>> changed code.
>>
>> Target independent code is the Generic code with
>> pure virtual function to interface betwwen target
>> independent and dependent code.
>>
>> Target dependent code is the implementation of pure
>> virtual function for aarch64 target and the call
>> to target independent code.
>>
>> 2024-02-23  Ajit Kumar Agarwal  
>>
>> gcc/ChangeLog:
>>
>>  * config/aarch64/aarch64-ldp-fusion.cc: Place target
>>  independent and dependent changed code.
>> ---
>>  gcc/config/aarch64/aarch64-ldp-fusion.cc | 437 ---
>>  1 file changed, 305 insertions(+), 132 deletions(-)
>>
>> diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc 
>> b/gcc/config/aarch64/aarch64-ldp-fusion.cc
>> index 22ed95eb743..2ef22ff1e96 100644
>> --- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
>> +++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
>> @@ -40,10 +40,10 @@
>>  
>>  using namespace rtl_ssa;
>>  
>> -static constexpr HOST_WIDE_INT LDP_IMM_BITS = 7;
>> -static constexpr HOST_WIDE_INT LDP_IMM_SIGN_BIT = (1 << (LDP_IMM_BITS - 1));
>> -static constexpr HOST_WIDE_INT LDP_MAX_IMM = LDP_IMM_SIGN_BIT - 1;
>> -static constexpr HOST_WIDE_INT LDP_MIN_IMM = -LDP_MAX_IMM - 1;
>> +static constexpr HOST_WIDE_INT PAIR_MEM_IMM_BITS = 7;
>> +static constexpr HOST_WIDE_INT PAIR_MEM_IMM_SIGN_BIT = (1 << 
>> (PAIR_MEM_IMM_BITS - 1));
>> +static constexpr HOST_WIDE_INT PAIR_MEM_MAX_IMM = PAIR_MEM_IMM_SIGN_BIT - 1;
>> +static constexpr HOST_WIDE_INT PAIR_MEM_MIN_IMM = -PAIR_MEM_MAX_IMM - 1;
> 
> These constants shouldn't be renamed: they are specific to aarch64 so the
> original names should be preserved in this file.
> 
> I expect we want to introduce virtual functions to validate an offset
> for a paired access.  The aarch64 code could then implement it by
> comparing the offset against LDP_{MIN,MAX}_IMM, and the generic code
> could use that hook to replace the code that queries those constants
> directly (i.e. in find_trailing_add and get_viable_bases).
> 
>>  
>>  // We pack these fields (load_p, fpsimd_p, and size) into an integer
>>  // (LFS) which we use as part of the key into the main hash tables.
>> @@ -138,8 +138,18 @@ struct alt_base
>>poly_int64 offset;
>>  };
>>  
>> +// Virtual base class for load/store walkers used in alias analysis.
>> +struct alias_walker
>> +{
>> +  virtual bool conflict_p (int &budget) const = 0;
>> +  virtual insn_info *insn () const = 0;
>> +  virtual bool valid () const  = 0;
>> +  virtual void advance () = 0;
>> +};
>> +
>> +
>>  // State used by the pass for a given basic block.
>> -struct ldp_bb_info
>> +struct pair_fusion
> 
> As a comment on the high-level design, I think we want a generic class
> for the overall pass, not just for the BB-specific structure.
> 
> That is because naturally we want the ldp_fusion_bb function itself to
> be a member of such a class, so that it can access virtual functions to
> query the target e.g. about the load/store pair policy, and whether to
> try and promote writeback pairs.
> 
> If we keep all of the virtual functions in such an outer class, then we
> can keep the ldp_fusion_bb class generic (not needing an override for
> each target) and that inner class can perhaps be given a pointer or
> reference to the outer class when it is instantiated in ldp_fusion_bb.
> 

Above comments is addressed in latest patch.

>>  {
>>using def_hash = nofree_ptr_hash;
>>using expr_key_t = pair_hash>;
>> @@ -161,13 +171,13 @@ struct ldp_bb_info
>>static const size_t obstack_alignment = sizeof (void *);
>>bb_inf

Re: [PATCH] testsuite: Fix up error on gcov1.d

2024-04-05 Thread Richard Biener

On Fri, 5 Apr 2024, Jakub Jelinek wrote:

> On Fri, Feb 23, 2024 at 12:18:00PM +0100, Jørgen Kvalsvik wrote:
> > This is a mostly straight port from the gcov-19.c tests from the C test
> > suite. The only notable differences from C to D are that D flips the
> > true/false outcomes for loop headers, and the D front end ties loop and
> > ternary conditions to slightly different locus.
> > 
> > The test for >64 conditions warning is disabled as it either needs
> > support from the testing framework or a something similar to #pragma GCC
> > diagnostic push to not cause a test failure from detecting a warning.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * gdc.dg/gcov.exp: New test.
> > * gdc.dg/gcov1.d: New test.
> 
> Unfortunately, this doesn't work.
> I see
> PASS: gdc.dg/gcov1.d   execution test
> ERROR: (DejaGnu) proc "run-gcov conditions { --conditions gcov1.d }" does not 
> exist.
> The error code is TCL LOOKUP COMMAND run-gcov
> The info on the error is:
> invalid command name "run-gcov"
> while executing
> "::tcl_unknown run-gcov conditions { --conditions gcov1.d }"
> ("uplevel" body line 1)
> invoked from within
> "uplevel 1 ::tcl_unknown $args"
> ERROR: gdc.dg/gcov1.d  : error executing dg-final: invalid command name 
> "run-gcov"
> both on x86_64-linux and i686-linux.
> The problem is that the test hasn't been added to a new directory, but
> to a directory already covered by a different *.exp file - dg.exp.
> Now, usually either one has a test directory like gcc.misc-tests where
> there are many *.exp files but each *.exp file globs for its own tests,
> or there is one *.exp per directory and covers everything in there.
> By having both dg.exp and gcov.exp in the same directory with dg.exp
> covering all *.d files in there and gcov gcov*.d in there, the gcov*.d
> tests are tested twice, once using the dg.exp driver and once using gcov.exp
> driver.  With the latter, they do work properly, with the former they don't
> because gcov.exp lib file isn't loaded and so run-gcov isn't available.
> 
> The following patch fixes that similarly how g++.dg/modules/modules.exp,
> gcc.target/s390/s390.exp or gcc.target/i386/i386.exp deal with that,
> by pruning some tests based on glob patterns from the list.
> 
> Tested on x86_64-linux with make -j32 check-d, ok for trunk?

OK.

Richard.

> 2024-04-05  Jakub Jelinek  
> 
>   * gdc.dg/dg.exp: Prune gcov*.d from the list of tests to run.
>   * gdc.dg/gcov.exp: Update copyright years.
> 
> --- gcc/testsuite/gdc.dg/dg.exp.jj2024-01-03 22:33:38.249693029 +0100
> +++ gcc/testsuite/gdc.dg/dg.exp   2024-04-05 10:20:13.518823037 +0200
> @@ -30,7 +30,8 @@ dg-init
>  
>  # Main loop.
>  gdc-dg-runtest [lsort \
> -   [glob -nocomplain $srcdir/$subdir/*.d ] ] "" $DEFAULT_DFLAGS
> +   [prune [glob -nocomplain $srcdir/$subdir/*.d ] \
> +   $srcdir/$subdir/gcov*.d ] ] "" $DEFAULT_DFLAGS
>  
>  # All done.
>  dg-finish
> --- gcc/testsuite/gdc.dg/gcov.exp.jj  2024-04-04 21:45:56.025155257 +0200
> +++ gcc/testsuite/gdc.dg/gcov.exp 2024-04-05 10:20:23.678682559 +0200
> @@ -1,4 +1,4 @@
> -#   Copyright (C) 1997-2023 Free Software Foundation, Inc.
> +#   Copyright (C) 1997-2024 Free Software Foundation, Inc.
>  
>  # This program is free software; you can redistribute it and/or modify
>  # it under the terms of the GNU General Public License as published by
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

[PATCH] vect: Don't clear base_misaligned in update_epilogue_loop_vinfo [PR114566]

2024-04-05 Thread Jakub Jelinek

Hi!

The following testcase is miscompiled, because in the vectorized
epilogue the vectorizer assumes it can use aligned loads/stores
(if the base decl gets alignment increased), but it actually doesn't
increase that.
This is because r10-4203-g97c1460367 added the hunk following
patch removes.  The explanation feels reasonable, but actually it
is not true as the testcase proves.
The thing is, we vectorize the main loop with 64-byte vectors
and the corresponding data refs have base_alignment 16 (the
a array has DECL_ALIGN 128) and offset_alignment 32.  Now, because
of the offset_alignment 32 rather than 64, we need to use unaligned
loads/stores in the main loop (and ditto in the first load/store
in vectorized epilogue).  But the second load/store in the vectorized
epilogue uses only 32-byte vectors and because it is a multiple
of offset_alignment, it checks if we could increase alignment of the
a VAR_DECL, the function returns true, sets base_misaligned = true
and says the access is then aligned.
But when update_epilogue_loop_vinfo clears base_misaligned with the
assumption that the var had to have the alignment increased already,
the update of DECL_ALIGN doesn't happen anymore.

Now, I'd think this base_alignment = false was needed before
r10-4030-gd2db7f7901 change was committed where it incorrectly
overwrote DECL_ALIGN even if it was already larger, rather than
just always increasing it.  But with that change in, it doesn't
make sense to me anymore.

Note, the testcase is latent on the trunk, but reproduces on the 13
branch.

Bootstrapped/regtested on x86_64-linux and i686-linux on the trunk,
plus tested with the testcase on 13 branch with -m32/-m64 without/with
the tree-vect-loop.cc patch (where it FAILed before and now PASSes).
Ok for trunk?

2024-04-05  Jakub Jelinek  

PR tree-optimization/114566
* tree-vect-loop.cc (update_epilogue_loop_vinfo): Don't clear
base_misaligned.

* gcc.target/i386/avx512f-pr114566.c: New test.

--- gcc/tree-vect-loop.cc.jj2024-04-04 00:48:05.932072711 +0200
+++ gcc/tree-vect-loop.cc   2024-04-05 00:59:33.743101468 +0200
@@ -11590,9 +11590,7 @@ find_in_mapping (tree t, void *context)
corresponding dr_vec_info need to be reconnected to the EPILOGUE's
stmt_vec_infos, their statements need to point to their corresponding copy,
if they are gather loads or scatter stores then their reference needs to be
-   updated to point to its corresponding copy and finally we set
-   'base_misaligned' to false as we have already peeled for alignment in the
-   prologue of the main loop.  */
+   updated to point to its corresponding copy.  */
 
 static void
 update_epilogue_loop_vinfo (class loop *epilogue, tree advance)
@@ -11736,10 +11734,6 @@ update_epilogue_loop_vinfo (class loop *
}
   DR_STMT (dr) = STMT_VINFO_STMT (stmt_vinfo);
   stmt_vinfo->dr_aux.stmt = stmt_vinfo;
-  /* The vector size of the epilogue is smaller than that of the main loop
-so the alignment is either the same or lower. This means the dr will
-thus by definition be aligned.  */
-  STMT_VINFO_DR_INFO (stmt_vinfo)->base_misaligned = false;
 }
 
   epilogue_vinfo->shared->datarefs_copy.release ();
--- gcc/testsuite/gcc.target/i386/avx512f-pr114566.c.jj 2024-04-05 
11:21:04.282639386 +0200
+++ gcc/testsuite/gcc.target/i386/avx512f-pr114566.c2024-04-05 
11:21:04.282639386 +0200
@@ -0,0 +1,34 @@
+/* PR tree-optimization/114566 */
+/* { dg-do run } */
+/* { dg-options "-O3 -mavx512f" } */
+/* { dg-additional-options "-fstack-protector-strong" { target 
fstack_protector } } */
+/* { dg-require-effective-target avx512f } */
+
+#define AVX512F
+#include "avx512f-helper.h"
+
+__attribute__((noipa)) int
+foo (float x, float y)
+{
+  float a[8][56];
+  __builtin_memset (a, 0, sizeof (a));
+
+  for (int j = 0; j < 8; j++)
+for (int k = 0; k < 56; k++)
+  {
+   float b = k * y;
+   if (b < 0.)
+ b = 0.;
+   if (b > 0.)
+ b = 0.;
+   a[j][k] += b;
+  }
+
+  return __builtin_log (x);
+}
+
+void
+TEST (void)
+{
+  foo (86.25f, 0.625f);
+}

Jakub

[committed] libatomic: Regenerate configure properly

2024-04-05 Thread Jakub Jelinek

Hi!

On Tue, Mar 26, 2024 at 11:55:41AM +, Wilco Dijkstra wrote:
> * acinclude.m4: Remove ARCH_AARCH64_HAVE_LSE128.
> * configure: Regenerated.

Seems configure hasn't been regenerated properly after the last
acinclude.m4 change as e.g. noticed by the autoregen CI.

I've committed this to trunk as obvious.

2024-04-05  Jakub Jelinek  

* configure: Regenerate.

--- libatomic/configure.jj  2024-04-05 09:19:48.593040783 +0200
+++ libatomic/configure 2024-04-05 12:13:08.315682702 +0200
@@ -11456,7 +11456,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11461 "configure"
+#line 11459 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11562,7 +11562,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11567 "configure"
+#line 11565 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -14739,6 +14739,7 @@ _ACEOF
 
 
 
+
  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether byte ordering is 
bigendian" >&5
 $as_echo_n "checking whether byte ordering is bigendian... " >&6; }
 if ${ac_cv_c_bigendian+:} false; then :
@@ -16031,6 +16032,7 @@ if test -z "${ENABLE_DARWIN_AT_RPATH_TRU
   as_fn_error $? "conditional \"ENABLE_DARWIN_AT_RPATH\" was never defined.
 Usually this means the macro was only invoked conditionally." "$LINENO" 5
 fi
+
 if test -z "${LIBAT_BUILD_VERSIONED_SHLIB_TRUE}" && test -z 
"${LIBAT_BUILD_VERSIONED_SHLIB_FALSE}"; then
   as_fn_error $? "conditional \"LIBAT_BUILD_VERSIONED_SHLIB\" was never 
defined.
 Usually this means the macro was only invoked conditionally." "$LINENO" 5


Jakub

Re: [PATCH] modula2: Add m2.install-dvi in gcc/m2/Make-lang.in

2024-04-05 Thread Gaius Mulley

Christophe Lyon  writes:

> m2 has a m2.dvi build rule, but lacks the m2.install-dvi one.
>
> 2024-04-04  Christophe Lyon  
>
>   gcc/m2/
>   * Make-lang.in (m2.install-dvi): New rule.
> ---
>  gcc/m2/Make-lang.in | 12 
>  1 file changed, 12 insertions(+)
>
> diff --git a/gcc/m2/Make-lang.in b/gcc/m2/Make-lang.in
> index e56240b4c44..0abd8ce1455 100644
> --- a/gcc/m2/Make-lang.in
> +++ b/gcc/m2/Make-lang.in
> @@ -162,6 +162,18 @@ m2.dvi: doc/m2.dvi
>  doc/m2.dvi: $(TEXISRC) $(objdir)/m2/images/gnu.eps
>   $(TEXI2DVI) -c -I $(objdir)/m2 -I $(srcdir)/doc/include -o $@ 
> $(srcdir)/doc/gm2.texi
>  
> +M2_DVIFILES = doc/m2.dvi
> +
> +m2.install-dvi: $(M2_DVIFILES)
> + @$(NORMAL_INSTALL)
> + test -z "$(dvidir)/gcc" || $(mkinstalldirs) "$(DESTDIR)$(dvidir)/gcc"
> + @list='$(M2_DVIFILES)'; for p in $$list; do \
> +   if test -f "$$p"; then d=; else d="$(srcdir)/"; fi; \
> +   f=$(dvi__strip_dir) \
> +   echo " $(INSTALL_DATA) '$$d$$p' '$(DESTDIR)$(dvidir)/gcc/$$f'"; \
> +   $(INSTALL_DATA) "$$d$$p" "$(DESTDIR)$(dvidir)/gcc/$$f"; \
> + done
> +
>  doc/m2.ps: doc/m2.dvi
>   dvips -o $@ $<

many thanks - lgtm,

regards,
Gaius

Re: [PATCH] vect: Don't clear base_misaligned in update_epilogue_loop_vinfo [PR114566]

2024-04-05 Thread Richard Biener

On Fri, 5 Apr 2024, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase is miscompiled, because in the vectorized
> epilogue the vectorizer assumes it can use aligned loads/stores
> (if the base decl gets alignment increased), but it actually doesn't
> increase that.
> This is because r10-4203-g97c1460367 added the hunk following
> patch removes.  The explanation feels reasonable, but actually it
> is not true as the testcase proves.
> The thing is, we vectorize the main loop with 64-byte vectors
> and the corresponding data refs have base_alignment 16 (the
> a array has DECL_ALIGN 128) and offset_alignment 32.  Now, because
> of the offset_alignment 32 rather than 64, we need to use unaligned
> loads/stores in the main loop (and ditto in the first load/store
> in vectorized epilogue).  But the second load/store in the vectorized
> epilogue uses only 32-byte vectors and because it is a multiple
> of offset_alignment, it checks if we could increase alignment of the
> a VAR_DECL, the function returns true, sets base_misaligned = true
> and says the access is then aligned.
> But when update_epilogue_loop_vinfo clears base_misaligned with the
> assumption that the var had to have the alignment increased already,
> the update of DECL_ALIGN doesn't happen anymore.
> 
> Now, I'd think this base_alignment = false was needed before
> r10-4030-gd2db7f7901 change was committed where it incorrectly
> overwrote DECL_ALIGN even if it was already larger, rather than
> just always increasing it.  But with that change in, it doesn't
> make sense to me anymore.

I think it was incorrect before - basically it assumed that the
main loop would never use misaligned accesses and the epilogue
analysis would align.  But for this we'd need to make sure to
never request such during epilogue analysis.

> Note, the testcase is latent on the trunk, but reproduces on the 13
> branch.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux on the trunk,
> plus tested with the testcase on 13 branch with -m32/-m64 without/with
> the tree-vect-loop.cc patch (where it FAILed before and now PASSes).
> Ok for trunk?

OK.

Thanks,
Richard.

> 2024-04-05  Jakub Jelinek  
> 
>   PR tree-optimization/114566
>   * tree-vect-loop.cc (update_epilogue_loop_vinfo): Don't clear
>   base_misaligned.
> 
>   * gcc.target/i386/avx512f-pr114566.c: New test.
> 
> --- gcc/tree-vect-loop.cc.jj  2024-04-04 00:48:05.932072711 +0200
> +++ gcc/tree-vect-loop.cc 2024-04-05 00:59:33.743101468 +0200
> @@ -11590,9 +11590,7 @@ find_in_mapping (tree t, void *context)
> corresponding dr_vec_info need to be reconnected to the EPILOGUE's
> stmt_vec_infos, their statements need to point to their corresponding 
> copy,
> if they are gather loads or scatter stores then their reference needs to 
> be
> -   updated to point to its corresponding copy and finally we set
> -   'base_misaligned' to false as we have already peeled for alignment in the
> -   prologue of the main loop.  */
> +   updated to point to its corresponding copy.  */
>  
>  static void
>  update_epilogue_loop_vinfo (class loop *epilogue, tree advance)
> @@ -11736,10 +11734,6 @@ update_epilogue_loop_vinfo (class loop *
>   }
>DR_STMT (dr) = STMT_VINFO_STMT (stmt_vinfo);
>stmt_vinfo->dr_aux.stmt = stmt_vinfo;
> -  /* The vector size of the epilogue is smaller than that of the main 
> loop
> -  so the alignment is either the same or lower. This means the dr will
> -  thus by definition be aligned.  */
> -  STMT_VINFO_DR_INFO (stmt_vinfo)->base_misaligned = false;
>  }
>  
>epilogue_vinfo->shared->datarefs_copy.release ();
> --- gcc/testsuite/gcc.target/i386/avx512f-pr114566.c.jj   2024-04-05 
> 11:21:04.282639386 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512f-pr114566.c  2024-04-05 
> 11:21:04.282639386 +0200
> @@ -0,0 +1,34 @@
> +/* PR tree-optimization/114566 */
> +/* { dg-do run } */
> +/* { dg-options "-O3 -mavx512f" } */
> +/* { dg-additional-options "-fstack-protector-strong" { target 
> fstack_protector } } */
> +/* { dg-require-effective-target avx512f } */
> +
> +#define AVX512F
> +#include "avx512f-helper.h"
> +
> +__attribute__((noipa)) int
> +foo (float x, float y)
> +{
> +  float a[8][56];
> +  __builtin_memset (a, 0, sizeof (a));
> +
> +  for (int j = 0; j < 8; j++)
> +for (int k = 0; k < 56; k++)
> +  {
> + float b = k * y;
> + if (b < 0.)
> +   b = 0.;
> + if (b > 0.)
> +   b = 0.;
> + a[j][k] += b;
> +  }
> +
> +  return __builtin_log (x);
> +}
> +
> +void
> +TEST (void)
> +{
> +  foo (86.25f, 0.625f);
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH v2 2/3] aarch64: Add support for aarch64-gnu (GNU/Hurd on AArch64)

2024-04-05 Thread Sergey Bugaev

Hello,

On Tue, Apr 2, 2024 at 8:26 PM Richard Sandiford
 wrote:
> I don't know if you're waiting on me, but just in case: this and patch 3
> still LGTM if Thomas is OK with them.

Thanks. Thomas asked me to resubmit with Changelog entries added (but
hasn't pointed out anything else), so this is waiting for him to
confirm that this looks OK now.

Sergey

nvptx: In mkoffload.cc, call diagnostic_color_init + gcc_init_libintl: Restore 'libgomp.c/reverse-offload-sm30.c' testing (was: [Patch] nvptx: In mkoffload.cc, call diagnostic_color_init + gcc_init_li

2024-04-05 Thread Thomas Schwinge

Hi!

On 2024-04-03T14:06:45+0200, Tobias Burnus  wrote:
> Nvptx's mkoffload.cc contains 14 'fatal_error' calls and one 'warning_at' 
> call,
> which stands out more clearly (color, bold) when enabling
>diagnostic_color_init
> which this patch does. — Additionally, the call gcc_init_libintl permits that
> the already translated error messages also show up as translation.
>
> OK for mainline?

But you've not regression-tested this?  Pushed to trunk branch
commit 679f81a32f706645f45900fdb1659fb5fe607f77
"nvptx: In mkoffload.cc, call diagnostic_color_init + gcc_init_libintl: Restore 
'libgomp.c/reverse-offload-sm30.c' testing",
see attached.


Instead of adding support for all the '-fdiagnostics-color' variants, I
suppose we should rather switch the 'mkoffload's to use GCC's standard
option handling machinery (like in 'gcc/lto-wrapper.cc', for example)?


Grüße
 Thomas


> PS: Example: 'nvptx mkoffload:' is bold and 'fatal error:' is in red
> in English and some language variants.
>
> nvptx mkoffload: fatal error: COLLECT_GCC must be set.
> nvptx mkoffload: 致命的エラー: COLLECT_GCC must be set.
> nvptx mkoffload: erreur fatale: COLLECT_GCC doit être défini.
> nvptx mkoffload: schwerwiegender Fehler: COLLECT_GCC muss gesetzt sein.
>
> (BTW: It looks as if many languages did not translate the error string
> itself, e.g. jp or zh or pl or zh_TW/zh_CN or fi or ...)
> nvptx: In mkoffload.cc, call diagnostic_color_init + gcc_init_libintl
>
> gcc/ChangeLog:
>
>   * config/nvptx/mkoffload.cc (main): Call
>   gcc_init_libintl and diagnostic_color_init.
>
>  gcc/config/nvptx/mkoffload.cc | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/gcc/config/nvptx/mkoffload.cc b/gcc/config/nvptx/mkoffload.cc
> index a7fc28cbd3f..503b1abcefd 100644
> --- a/gcc/config/nvptx/mkoffload.cc
> +++ b/gcc/config/nvptx/mkoffload.cc
> @@ -638,7 +638,9 @@ main (int argc, char **argv)
>const char *outname = 0;
>  
>progname = tool_name;
> +  gcc_init_libintl ();
>diagnostic_initialize (global_dc, 0);
> +  diagnostic_color_init (global_dc);
>  
>if (atexit (mkoffload_cleanup) != 0)
>  fatal_error (input_location, "atexit failed");


>From 679f81a32f706645f45900fdb1659fb5fe607f77 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 5 Apr 2024 14:04:53 +0200
Subject: [PATCH] nvptx: In mkoffload.cc, call diagnostic_color_init +
 gcc_init_libintl: Restore 'libgomp.c/reverse-offload-sm30.c' testing

With commit 7520a4992c94254016085a461c58c972497c4483
"nvptx: In mkoffload.cc, call diagnostic_color_init + gcc_init_libintl",
we regressed:

[-PASS:-]{+FAIL:+} libgomp.c/reverse-offload-sm30.c  at line 15 (test for warnings, line )
[-PASS:-]{+FAIL:+} libgomp.c/reverse-offload-sm30.c (test for excess errors)

	libgomp/
	* testsuite/libgomp.c/reverse-offload-sm30.c: Set 'GCC_COLORS' to the empty string.
---
 libgomp/testsuite/libgomp.c/reverse-offload-sm30.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c b/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c
index 7f10fd4ded9..cae75f03462 100644
--- a/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c
+++ b/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c
@@ -12,4 +12,7 @@ main ()
   return 0;
 }
 
+/* The 'mkoffload's currently don't obey '-fno-diagnostics-color' etc., so use a different way to effect the same thing:
+   { dg-set-compiler-env-var GCC_COLORS "" }
+   ..., so that the following regexp doesn't have to deal with color code escape sequences.  */
 /* { dg-warning "'omp requires reverse_offload' requires at least 'sm_35' for '-foffload-options=nvptx-none=-march=' - disabling offload-code generation for this device type" "" { target *-*-* } 0 } */
-- 
2.34.1

[PATCH] Add extra copy of the ifcombine pass after pre [PR102793]

2024-04-05 Thread Manolis Tsamis

If we consider code like:

if (bar1 == x)
  return foo();
if (bar2 != y)
  return foo();
return 0;

We would like the ifcombine pass to convert this to:

if (bar1 == x || bar2 != y)
  return foo();
return 0;

The ifcombine pass can handle this transformation but it is ran very early and
it misses the opportunity because there are two seperate blocks for foo().
The pre pass is good at removing duplicate code and blocks and due to that
running ifcombine again after it can increase the number of successful
conversions.

PR 102793

gcc/ChangeLog:

* common.opt: -ftree-ifcombine option, enabled by default.
* doc/invoke.texi: Document.
* passes.def: Re-run ssa-ifcombine after pre.
* tree-ssa-ifcombine.cc: Make ifcombine cloneable. Add gate function.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/20030922-2.c: Change flag to -fno-tree-ifcombine.
* gcc.dg/uninit-pred-6_c.c: Remove inconsistent check.
* gcc.target/aarch64/pr102793.c: New test.

Signed-off-by: Manolis Tsamis 
---

 gcc/common.opt  |  4 +++
 gcc/doc/invoke.texi |  5 
 gcc/passes.def  |  1 +
 gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c  |  2 +-
 gcc/testsuite/gcc.dg/uninit-pred-6_c.c  |  4 ---
 gcc/testsuite/gcc.target/aarch64/pr102793.c | 30 +
 gcc/tree-ssa-ifcombine.cc   |  5 
 7 files changed, 46 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr102793.c

diff --git a/gcc/common.opt b/gcc/common.opt
index ad348844775..e943202bcf1 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3163,6 +3163,10 @@ ftree-phiprop
 Common Var(flag_tree_phiprop) Init(1) Optimization
 Enable hoisting loads from conditional pointers.
 
+ftree-ifcombine
+Common Var(flag_tree_ifcombine) Init(1) Optimization
+Merge some conditional branches to simplify control flow.
+
 ftree-pre
 Common Var(flag_tree_pre) Optimization
 Enable SSA-PRE optimization on trees.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e2edf7a6c13..8d2ff6b4512 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13454,6 +13454,11 @@ This flag is enabled by default at @option{-O1} and 
higher.
 Perform hoisting of loads from conditional pointers on trees.  This
 pass is enabled by default at @option{-O1} and higher.
 
+@opindex ftree-ifcombine
+@item -ftree-ifcombine
+Merge some conditional branches to simplify control flow.  This pass
+is enabled by default at @option{-O1} and higher.
+
 @opindex fhoist-adjacent-loads
 @item -fhoist-adjacent-loads
 Speculatively hoist loads from both branches of an if-then-else if the
diff --git a/gcc/passes.def b/gcc/passes.def
index 1cbbd413097..1765b476131 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -270,6 +270,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_lim);
   NEXT_PASS (pass_walloca, false);
   NEXT_PASS (pass_pre);
+  NEXT_PASS (pass_tree_ifcombine);
   NEXT_PASS (pass_sink_code, false /* unsplit edges */);
   NEXT_PASS (pass_sancov);
   NEXT_PASS (pass_asan);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
index 16c79da9521..66c9f481a2f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O1 -fdump-tree-dom2 -fdisable-tree-ifcombine" } */
+/* { dg-options "-O1 -fdump-tree-dom2 -fno-tree-ifcombine" } */
 
 struct rtx_def;
 typedef struct rtx_def *rtx;
diff --git a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c 
b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
index f60868dad23..2d8e6501a45 100644
--- a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
+++ b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
@@ -20,10 +20,6 @@ int foo (int n, int l, int m, int r)
   if ( (n > 10) && l)
   blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
 
-  if (l)
-if (n > 12)
-  blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
-
   return 0;
 }
 
diff --git a/gcc/testsuite/gcc.target/aarch64/pr102793.c 
b/gcc/testsuite/gcc.target/aarch64/pr102793.c
new file mode 100644
index 000..78d48e01637
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr102793.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+typedef unsigned long uint64_t;
+
+int ccmp1(uint64_t* s1, uint64_t* s2, int(*foo)(void))
+{
+  uint64_t d1, d2, bar;
+  d1 = *s1++;
+  d2 = *s2++;
+  bar = (d1 + d2) & 0xabcd;
+  if (bar == 0 || d1 != d2)
+return foo();
+  return 0;
+}
+
+int ccmp2(uint64_t* s1, uint64_t* s2, int(*foo)(void))
+{
+  uint64_t d1, d2, bar;
+  d1 = *s1++;
+  d2 = *s2++;
+  bar = (d1 + d2) & 0xabcd;
+  if (bar == 0)
+return foo();
+  if (d1 != d2)
+return foo();
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "ccmp\t" 2 } } */
\ No newline at end of fi

Re: [PATCH] Add extra copy of the ifcombine pass after pre [PR102793]

2024-04-05 Thread Richard Biener

On Fri, Apr 5, 2024 at 2:28 PM Manolis Tsamis  wrote:
>
> If we consider code like:
>
> if (bar1 == x)
>   return foo();
> if (bar2 != y)
>   return foo();
> return 0;
>
> We would like the ifcombine pass to convert this to:
>
> if (bar1 == x || bar2 != y)
>   return foo();
> return 0;
>
> The ifcombine pass can handle this transformation but it is ran very early and
> it misses the opportunity because there are two seperate blocks for foo().
> The pre pass is good at removing duplicate code and blocks and due to that
> running ifcombine again after it can increase the number of successful
> conversions.
>
> PR 102793
>
> gcc/ChangeLog:
>
> * common.opt: -ftree-ifcombine option, enabled by default.
> * doc/invoke.texi: Document.
> * passes.def: Re-run ssa-ifcombine after pre.
> * tree-ssa-ifcombine.cc: Make ifcombine cloneable. Add gate function.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/20030922-2.c: Change flag to -fno-tree-ifcombine.
> * gcc.dg/uninit-pred-6_c.c: Remove inconsistent check.
> * gcc.target/aarch64/pr102793.c: New test.
>
> Signed-off-by: Manolis Tsamis 
> ---
>
>  gcc/common.opt  |  4 +++
>  gcc/doc/invoke.texi |  5 
>  gcc/passes.def  |  1 +
>  gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c  |  2 +-
>  gcc/testsuite/gcc.dg/uninit-pred-6_c.c  |  4 ---
>  gcc/testsuite/gcc.target/aarch64/pr102793.c | 30 +
>  gcc/tree-ssa-ifcombine.cc   |  5 
>  7 files changed, 46 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr102793.c
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index ad348844775..e943202bcf1 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3163,6 +3163,10 @@ ftree-phiprop
>  Common Var(flag_tree_phiprop) Init(1) Optimization
>  Enable hoisting loads from conditional pointers.
>
> +ftree-ifcombine

Please don't add further -ftree-X flags, 'tree' means nothing
to users.  -fif-combine would be better.

> +Common Var(flag_tree_ifcombine) Init(1) Optimization
> +Merge some conditional branches to simplify control flow.
> +
>  ftree-pre
>  Common Var(flag_tree_pre) Optimization
>  Enable SSA-PRE optimization on trees.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e2edf7a6c13..8d2ff6b4512 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -13454,6 +13454,11 @@ This flag is enabled by default at @option{-O1} and 
> higher.
>  Perform hoisting of loads from conditional pointers on trees.  This
>  pass is enabled by default at @option{-O1} and higher.
>
> +@opindex ftree-ifcombine
> +@item -ftree-ifcombine
> +Merge some conditional branches to simplify control flow.  This pass
> +is enabled by default at @option{-O1} and higher.
> +
>  @opindex fhoist-adjacent-loads
>  @item -fhoist-adjacent-loads
>  Speculatively hoist loads from both branches of an if-then-else if the
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 1cbbd413097..1765b476131 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -270,6 +270,7 @@ along with GCC; see the file COPYING3.  If not see
>NEXT_PASS (pass_lim);
>NEXT_PASS (pass_walloca, false);
>NEXT_PASS (pass_pre);
> +  NEXT_PASS (pass_tree_ifcombine);
>NEXT_PASS (pass_sink_code, false /* unsplit edges */);

Please move it here, after sinking.

>NEXT_PASS (pass_sancov);
>NEXT_PASS (pass_asan);
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> index 16c79da9521..66c9f481a2f 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O1 -fdump-tree-dom2 -fdisable-tree-ifcombine" } */
> +/* { dg-options "-O1 -fdump-tree-dom2 -fno-tree-ifcombine" } */
>
>  struct rtx_def;
>  typedef struct rtx_def *rtx;
> diff --git a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c 
> b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> index f60868dad23..2d8e6501a45 100644
> --- a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> +++ b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> @@ -20,10 +20,6 @@ int foo (int n, int l, int m, int r)
>if ( (n > 10) && l)
>blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
>
> -  if (l)
> -if (n > 12)
> -  blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
> -

What's "inconsistent" about this check?  I suppose we now diagnose this?
The appropriate way would be to XFAIL this but I'd like you to explain
why we now diagnose this (I don't see obvious if-combining opportunities).

On a general note you rely on the tail-merging pass which is part of PRE
and which hasn't seen any love and which isn't very powerful either.  I'm not
sure it's worth doing if-combining on the whole IL again because of it.
It might be

[PATCH] ipa: Force args obtined through pass-through maps to the expected type (PR 113964)

2024-04-05 Thread Martin Jambor

Hi,

interactions of IPA-CP and IPA-SRA on the same data is a rather big
source of issues, I'm afraid.  PR 113964 is a situation where IPA-CP
propagates an unsigned short in a union parameter into a function
which itself calls a different function which has a same union
parameter and both these union parameters are split with IPA-SRA.  The
leaf function however uses a signed short member of the union.

In the calling function, we get the unsigned constant as the
replacement for the union and it is then passed in the call without
any type compatibility checks.  Apparently on riscv64 it matters
whether the parameter is signed or unsigned short and so the leaf
function can see different values.

Fixed by using useless_type_conversion_p at the appropriate place and
if it fails, use force_value_to type as elsewhere in similar
situations.

Bootstrapped and tested on x86_64-linux, the reporter has also run the
testsuite with this patch on riscv64 and reported in Bugzilla there were
no issues.

OK for master and GCC 13?

Thanks,

Martin


gcc/ChangeLog:

2024-04-04  Martin Jambor  

PR ipa/113964
* ipa-param-manipulation.cc (ipa_param_adjustments::modify_call):
Force values obtined through pass-through maps to the expected
split type.

gcc/testsuite/ChangeLog:

2024-04-04  Patrick O'Neill  
Martin Jambor  

PR ipa/113964
* gcc.dg/ipa/pr114247.c: New test.
---
 gcc/ipa-param-manipulation.cc   |  6 ++
 gcc/testsuite/gcc.dg/ipa/pr114247.c | 31 +
 2 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr114247.c

diff --git a/gcc/ipa-param-manipulation.cc b/gcc/ipa-param-manipulation.cc
index 3e0df6a6f77..b4ca78b652e 100644
--- a/gcc/ipa-param-manipulation.cc
+++ b/gcc/ipa-param-manipulation.cc
@@ -740,6 +740,12 @@ ipa_param_adjustments::modify_call (cgraph_edge *cs,
  }
   if (repl)
{
+ if (!useless_type_conversion_p(apm->type, repl->typed.type))
+   {
+ repl = force_value_to_type (apm->type, repl);
+ repl = force_gimple_operand_gsi (&gsi, repl,
+  true, NULL, true, GSI_SAME_STMT);
+   }
  vargs.quick_push (repl);
  continue;
}
diff --git a/gcc/testsuite/gcc.dg/ipa/pr114247.c 
b/gcc/testsuite/gcc.dg/ipa/pr114247.c
new file mode 100644
index 000..60aa2bc0122
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr114247.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fsigned-char -fno-strict-aliasing -fwrapv" } */
+
+union a {
+  unsigned short b;
+  int c;
+  signed short d;
+};
+int e, f = 1, g;
+long h;
+const int **i;
+void j(union a k, int l, unsigned m) {
+  const int *a[100];
+  i = &a[0];
+  h = k.d;
+}
+static int o(union a k) {
+  k.d = -1;
+  while (1)
+if (f)
+  break;
+  j(k, g, e);
+  return 0;
+}
+int main() {
+  union a n = {1};
+  o(n);
+  if (h != -1)
+__builtin_abort();
+  return 0;
+}
-- 
2.44.0

Re: [PATCH] ipa: Force args obtined through pass-through maps to the expected type (PR 113964)

2024-04-05 Thread Jan Hubicka

> Hi,
> 
> interactions of IPA-CP and IPA-SRA on the same data is a rather big
> source of issues, I'm afraid.  PR 113964 is a situation where IPA-CP
> propagates an unsigned short in a union parameter into a function
> which itself calls a different function which has a same union
> parameter and both these union parameters are split with IPA-SRA.  The
> leaf function however uses a signed short member of the union.
> 
> In the calling function, we get the unsigned constant as the
> replacement for the union and it is then passed in the call without
> any type compatibility checks.  Apparently on riscv64 it matters
> whether the parameter is signed or unsigned short and so the leaf
> function can see different values.
> 
> Fixed by using useless_type_conversion_p at the appropriate place and
> if it fails, use force_value_to type as elsewhere in similar
> situations.
> 
> Bootstrapped and tested on x86_64-linux, the reporter has also run the
> testsuite with this patch on riscv64 and reported in Bugzilla there were
> no issues.
> 
> OK for master and GCC 13?
> 
> Thanks,
> 
> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2024-04-04  Martin Jambor  
> 
>   PR ipa/113964
>   * ipa-param-manipulation.cc (ipa_param_adjustments::modify_call):
>   Force values obtined through pass-through maps to the expected
>   split type.
> 
> gcc/testsuite/ChangeLog:
> 
> 2024-04-04  Patrick O'Neill  
>   Martin Jambor  
> 
>   PR ipa/113964
>   * gcc.dg/ipa/pr114247.c: New test.
OK,
thanks!
Honza

Re: [PATCH v10 1/2] Add condition coverage (MC/DC)

2024-04-05 Thread Rainer Orth

Hi Jørgen,

> On 04/04/2024 14:10, Jan Hubicka wrote:
>>> gcc/ChangeLog:
>>>
>>> * builtins.cc (expand_builtin_fork_or_exec): Check
>>>   condition_coverage_flag.
>>> * collect2.cc (main): Add -fno-condition-coverage to OBSTACK.
>>> * common.opt: Add new options -fcondition-coverage and
>>>   -Wcoverage-too-many-conditions.
>>> * doc/gcov.texi: Add --conditions documentation.
>>> * doc/invoke.texi: Add -fcondition-coverage documentation.
>>> * function.cc (free_after_compilation): Free cond_uids.
>>> * function.h (struct function): Add cond_uids.
>>> * gcc.cc: Link gcov on -fcondition-coverage.
>>> * gcov-counter.def (GCOV_COUNTER_CONDS): New.
>>> * gcov-dump.cc (tag_conditions): New.
>>> * gcov-io.h (GCOV_TAG_CONDS): New.
>>> (GCOV_TAG_CONDS_LENGTH): New.
>>> (GCOV_TAG_CONDS_NUM): New.
>>> * gcov.cc (class condition_info): New.
>>> (condition_info::condition_info): New.
>>> (condition_info::popcount): New.
>>> (struct coverage_info): New.
>>> (add_condition_counts): New.
>>> (output_conditions): New.
>>> (print_usage): Add -g, --conditions.
>>> (process_args): Likewise.
>>> (output_intermediate_json_line): Output conditions.
>>> (read_graph_file): Read condition counters.
>>> (read_count_file): Likewise.
>>> (file_summary): Print conditions.
>>> (accumulate_line_info): Accumulate conditions.
>>> (output_line_details): Print conditions.
>>> * gimplify.cc (next_cond_uid): New.
>>> (reset_cond_uid): New.
>>> (shortcut_cond_r): Set condition discriminator.
>>> (tag_shortcut_cond): New.
>>> (gimple_associate_condition_with_expr): New.
>>> (shortcut_cond_expr): Set condition discriminator.
>>> (gimplify_cond_expr): Likewise.
>>> (gimplify_function_tree): Call reset_cond_uid.
>>> * ipa-inline.cc (can_early_inline_edge_p): Check
>>>   condition_coverage_flag.
>>> * ipa-split.cc (pass_split_functions::gate): Likewise.
>>> * passes.cc (finish_optimization_passes): Likewise.
>>> * profile.cc (struct condcov): New declaration.
>>> (cov_length): Likewise.
>>> (cov_blocks): Likewise.
>>> (cov_masks): Likewise.
>>> (cov_maps): Likewise.
>>> (cov_free): Likewise.
>>> (instrument_decisions): New.
>>> (read_thunk_profile): Control output to file.
>>> (branch_prob): Call find_conditions, instrument_decisions.
>>> (init_branch_prob): Add total_num_conds.
>>> (end_branch_prob): Likewise.
>>> * tree-core.h (struct tree_exp): Add condition_uid.
>>> * tree-profile.cc (struct conds_ctx): New.
>>> (CONDITIONS_MAX_TERMS): New.
>>> (EDGE_CONDITION): New.
>>> (topological_cmp): New.
>>> (index_of): New.
>>> (single_p): New.
>>> (single_edge): New.
>>> (contract_edge_up): New.
>>> (struct outcomes): New.
>>> (conditional_succs): New.
>>> (condition_index): New.
>>> (condition_uid): New.
>>> (masking_vectors): New.
>>> (emit_assign): New.
>>> (emit_bitwise_op): New.
>>> (make_top_index_visit): New.
>>> (make_top_index): New.
>>> (paths_between): New.
>>> (struct condcov): New.
>>> (cov_length): New.
>>> (cov_blocks): New.
>>> (cov_masks): New.
>>> (cov_maps): New.
>>> (cov_free): New.
>>> (find_conditions): New.
>>> (struct counters): New.
>>> (find_counters): New.
>>> (resolve_counter): New.
>>> (resolve_counters): New.
>>> (instrument_decisions): New.
>>> (tree_profiling): Check condition_coverage_flag.
>>> (pass_ipa_tree_profile::gate): Likewise.
>>> * tree.h (SET_EXPR_UID): New.
>>> (EXPR_COND_UID): New.
>>>
>>> libgcc/ChangeLog:
>>>
>>> * libgcov-merge.c (__gcov_merge_ior): New.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * lib/gcov.exp: Add condition coverage test function.
>>> * g++.dg/gcov/gcov-18.C: New test.
>>> * gcc.misc-tests/gcov-19.c: New test.
>>> * gcc.misc-tests/gcov-20.c: New test.
>>> * gcc.misc-tests/gcov-21.c: New test.
>>> * gcc.misc-tests/gcov-22.c: New test.
>>> * gcc.misc-tests/gcov-23.c: New test.
>>> ---
>>>   gcc/builtins.cc|2 +-
>>>   gcc/collect2.cc|7 +-
>>>   gcc/common.opt |9 +
>>>   gcc/doc/gcov.texi  |   38 +
>>>   gcc/doc/invoke.texi|   21 +
>>>   gcc/function.cc|1 +
>>>   gcc/function.h |4 +
>>>   gcc/gcc.cc |4 +-
>>>   gcc/gcov-counter.def   |3 +
>>>   gcc/gcov-dump.cc   |   24 +
>>>   gcc/gcov-io.h  |3 +
>>>   gcc/gcov.cc|  209 ++-
>>>   gcc/gimplify.cc|  123 +-
>>>   gcc/ipa-inline.cc  |2 +-
>>>   gcc/ipa-split.cc   |2 +-
>>>   gcc/passes.cc

[pushed] c-family: remove dead #undef

2024-04-05 Thread Marek Polacek

Tested x86_64-pc-linux-gnu, applying to trunk.

-- >8 --
The #undef was added in r0-90320-g100d537d7a7b5c but it never did
anything.

gcc/c-family/ChangeLog:

* c-warn.cc (warn_about_parentheses): Remove an #undef.
---
 gcc/c-family/c-warn.cc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/c-family/c-warn.cc b/gcc/c-family/c-warn.cc
index 8168696fa45..bff87be05ae 100644
--- a/gcc/c-family/c-warn.cc
+++ b/gcc/c-family/c-warn.cc
@@ -2176,7 +2176,6 @@ warn_about_parentheses (location_t loc, enum tree_code 
code,
}
   return;
 }
-#undef NOT_A_BOOLEAN_EXPR_P
 }
 
 /* If LABEL (a LABEL_DECL) has not been used, issue a warning.  */

base-commit: 679f81a32f706645f45900fdb1659fb5fe607f77
-- 
2.44.0

[PATCH 1/3] Pass reliable down to infer_loop_bounds_from_array

2024-04-05 Thread Richard Biener

The following passes down whether a stmt is always executed from
infer_loop_bounds_from_undefined to infer_loop_bounds_from_array.
The parameters were already documented.  The patch doesn't remove
possibly redundant checks from idx_infer_loop_bounds yet.

Boostrapped on x86_64-unknown-linux-gnu, testing in progress.

* tree-ssa-loop-niter.cc (ilb_data::reliable): New.
(idx_infer_loop_bounds): Initialize upper from reliable.
(infer_loop_bounds_from_ref): Get and pass through reliable flag.
(infer_loop_bounds_from_array): Likewise.
(infer_loop_bounds_from_undefined): Pass reliable flag to
infer_loop_bounds_from_array.
---
 gcc/tree-ssa-loop-niter.cc | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
index c6d010f6d89..0a77c1bb544 100644
--- a/gcc/tree-ssa-loop-niter.cc
+++ b/gcc/tree-ssa-loop-niter.cc
@@ -4123,6 +4123,7 @@ struct ilb_data
 {
   class loop *loop;
   gimple *stmt;
+  bool reliable;
 };
 
 static bool
@@ -4131,7 +4132,7 @@ idx_infer_loop_bounds (tree base, tree *idx, void *dta)
   struct ilb_data *data = (struct ilb_data *) dta;
   tree ev, init, step;
   tree low, high, type, next;
-  bool sign, upper = true, has_flexible_size = false;
+  bool sign, upper = data->reliable, has_flexible_size = false;
   class loop *loop = data->loop;
 
   if (TREE_CODE (base) != ARRAY_REF)
@@ -4224,12 +4225,14 @@ idx_infer_loop_bounds (tree base, tree *idx, void *dta)
STMT is guaranteed to be executed in every iteration of LOOP.*/
 
 static void
-infer_loop_bounds_from_ref (class loop *loop, gimple *stmt, tree ref)
+infer_loop_bounds_from_ref (class loop *loop, gimple *stmt, tree ref,
+   bool reliable)
 {
   struct ilb_data data;
 
   data.loop = loop;
   data.stmt = stmt;
+  data.reliable = reliable;
   for_each_index (&ref, idx_infer_loop_bounds, &data);
 }
 
@@ -4238,7 +4241,7 @@ infer_loop_bounds_from_ref (class loop *loop, gimple 
*stmt, tree ref)
executed in every iteration of LOOP.  */
 
 static void
-infer_loop_bounds_from_array (class loop *loop, gimple *stmt)
+infer_loop_bounds_from_array (class loop *loop, gimple *stmt, bool reliable)
 {
   if (is_gimple_assign (stmt))
 {
@@ -4248,10 +4251,10 @@ infer_loop_bounds_from_array (class loop *loop, gimple 
*stmt)
   /* For each memory access, analyze its access function
 and record a bound on the loop iteration domain.  */
   if (REFERENCE_CLASS_P (op0))
-   infer_loop_bounds_from_ref (loop, stmt, op0);
+   infer_loop_bounds_from_ref (loop, stmt, op0, reliable);
 
   if (REFERENCE_CLASS_P (op1))
-   infer_loop_bounds_from_ref (loop, stmt, op1);
+   infer_loop_bounds_from_ref (loop, stmt, op1, reliable);
 }
   else if (is_gimple_call (stmt))
 {
@@ -4260,13 +4263,13 @@ infer_loop_bounds_from_array (class loop *loop, gimple 
*stmt)
 
   lhs = gimple_call_lhs (stmt);
   if (lhs && REFERENCE_CLASS_P (lhs))
-   infer_loop_bounds_from_ref (loop, stmt, lhs);
+   infer_loop_bounds_from_ref (loop, stmt, lhs, reliable);
 
   for (i = 0; i < n; i++)
{
  arg = gimple_call_arg (stmt, i);
  if (REFERENCE_CLASS_P (arg))
-   infer_loop_bounds_from_ref (loop, stmt, arg);
+   infer_loop_bounds_from_ref (loop, stmt, arg, reliable);
}
 }
 }
@@ -4410,7 +4413,7 @@ infer_loop_bounds_from_undefined (class loop *loop, 
basic_block *bbs)
{
  gimple *stmt = gsi_stmt (bsi);
 
- infer_loop_bounds_from_array (loop, stmt);
+ infer_loop_bounds_from_array (loop, stmt, reliable);
 
  if (reliable)
 {
-- 
2.35.3

[PATCH 2/3] Add get_loop_body_in_rpo

2024-04-05 Thread Richard Biener

The following adds another get_loop_body variant, one to get blocks
in RPO.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

* cfgloop.h (get_loop_body_in_rpo): Declare.
* cfgloop.cc (get_loop_body_in_rpo): Compute loop body in RPO.
---
 gcc/cfgloop.cc | 68 ++
 gcc/cfgloop.h  |  1 +
 2 files changed, 69 insertions(+)

diff --git a/gcc/cfgloop.cc b/gcc/cfgloop.cc
index 5202c3865d1..d79a006554f 100644
--- a/gcc/cfgloop.cc
+++ b/gcc/cfgloop.cc
@@ -1021,6 +1021,74 @@ get_loop_body_in_bfs_order (const class loop *loop)
   return blocks;
 }
 
+/* Get the body of LOOP in FN in reverse post order.  */
+
+basic_block *
+get_loop_body_in_rpo (function *fn, const class loop *loop)
+{
+  auto_vec stack (loop->num_nodes + 1);
+  auto_bb_flag visited (fn);
+
+  basic_block *blocks = XNEWVEC (basic_block, loop->num_nodes);
+  int rev_post_order_num = loop->num_nodes - 1;
+
+  /* Find a block leading to the loop header.  */
+  edge_iterator ei;
+  edge e;
+  FOR_EACH_EDGE (e, ei, loop->header->preds)
+if (!flow_bb_inside_loop_p (loop, e->src))
+  break;
+  basic_block preheader = e->src;
+
+  stack.quick_push (ei_start (preheader->succs));
+
+  while (!stack.is_empty ())
+{
+  basic_block src;
+  basic_block dest;
+
+  /* Look at the edge on the top of the stack.  */
+  edge_iterator ei = stack.last ();
+  src = ei_edge (ei)->src;
+  dest = ei_edge (ei)->dest;
+
+  /* Check if the edge destination has been visited yet.  */
+  if (flow_bb_inside_loop_p (loop, dest)
+ && ! (dest->flags & visited))
+   {
+ /* Mark that we have visited the destination.  */
+ dest->flags |= visited;
+
+ if (EDGE_COUNT (dest->succs) > 0)
+   /* Since the DEST node has been visited for the first
+  time, check its successors.  */
+   stack.quick_push (ei_start (dest->succs));
+ else
+   /* There are no successors for the DEST node so record
+  the block.  */
+   blocks[rev_post_order_num--] = dest;
+   }
+  else
+   {
+ if (ei_one_before_end_p (ei)
+ && src != preheader)
+   /* There are no more successors for the SRC node
+  so record the block.  */
+   blocks[rev_post_order_num--] = src;
+
+ if (!ei_one_before_end_p (ei))
+   ei_next (&stack.last ());
+ else
+   stack.pop ();
+   }
+}
+
+  for (int i = rev_post_order_num + 1; i < (int) loop->num_nodes; ++i)
+blocks[i]->flags &= ~visited;
+
+  return blocks;
+}
+
 /* Hash function for struct loop_exit.  */
 
 hashval_t
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 30b5e40d0d9..42f3079102d 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -385,6 +385,7 @@ extern basic_block *get_loop_body_in_custom_order (const 
class loop *,
   int (*) (const void *, const void *));
 extern basic_block *get_loop_body_in_custom_order (const class loop *, void *,
   int (*) (const void *, const void *, void *));
+extern basic_block *get_loop_body_in_rpo (function *, const class loop *);
 
 extern auto_vec get_loop_exit_edges (const class loop *, basic_block * = 
NULL);
 extern edge single_exit (const class loop *);
-- 
2.35.3

[PATCH 3/3] tree-optimization/114052 - niter analysis from undefined behavior

2024-04-05 Thread Richard Biener

The following makes sure to only compute upper bounds for the number
of iterations of loops from undefined behavior invoked by stmts when
those are executed in each loop iteration, in particular also in the
last one.  The latter cannot be guaranteed if there's possible
infinite loops or calls with side-effects possibly executed before
the stmt.  Rather than adjusting the bound by one or using the bound as
estimate the following for now gives up.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

PR tree-optimization/114052
* tree-ssa-loop-niter.cc (infer_loop_bounds_from_undefined):
When we enter a possibly infinite loop or when we come across
a call with side-effects record the last iteration might not
execute all stmts.  Consider bounds as unreliable in that case.

* gcc.dg/tree-ssa/pr114052-1.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr114052-1.c | 16 ++
 gcc/tree-ssa-loop-niter.cc | 35 ++
 2 files changed, 45 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr114052-1.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr114052-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr114052-1.c
new file mode 100644
index 000..54a2181e67e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr114052-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int foo(void)
+{
+  int counter = 0;
+  while (1)
+{
+  if (counter >= 2)
+   continue;
+  __builtin_printf("%i\n", counter++);
+}
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "unreachable" "optimized" } } */
diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
index 0a77c1bb544..52a39eb3500 100644
--- a/gcc/tree-ssa-loop-niter.cc
+++ b/gcc/tree-ssa-loop-niter.cc
@@ -4397,7 +4397,7 @@ infer_loop_bounds_from_undefined (class loop *loop, 
basic_block *bbs)
   unsigned i;
   gimple_stmt_iterator bsi;
   basic_block bb;
-  bool reliable;
+  bool may_have_exited = false;
 
   for (i = 0; i < loop->num_nodes; i++)
 {
@@ -4407,21 +4407,44 @@ infer_loop_bounds_from_undefined (class loop *loop, 
basic_block *bbs)
 use the operations in it to infer reliable upper bound on the
 # of iterations of the loop.  However, we can use it as a guess. 
 Reliable guesses come only from array bounds.  */
-  reliable = dominated_by_p (CDI_DOMINATORS, loop->latch, bb);
+  bool reliable = dominated_by_p (CDI_DOMINATORS, loop->latch, bb);
+
+  /* A possibly infinite inner loop makes further blocks not always
+executed.  Key on the entry of such a loop as that avoids RPO
+issues with where the exits of that loop are.  Any block
+inside an irreducible sub-region is problematic as well.
+???  Note this technically only makes the last iteration
+possibly partially executed.  */
+  if (!may_have_exited
+ && bb != loop->header
+ && (!loops_state_satisfies_p (LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS)
+ || bb->flags & BB_IRREDUCIBLE_LOOP
+ || (bb->loop_father->header == bb
+ && !finite_loop_p (bb->loop_father
+   may_have_exited = true;
 
   for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
{
  gimple *stmt = gsi_stmt (bsi);
 
- infer_loop_bounds_from_array (loop, stmt, reliable);
+ /* When there's a call that might not return the last iteration
+is possibly partial.  This matches what we check in invariant
+motion.
+???  For the call argument evaluation it would be still OK.  */
+ if (!may_have_exited
+ && is_gimple_call (stmt)
+ && gimple_has_side_effects (stmt))
+   may_have_exited = true;
+
+ infer_loop_bounds_from_array (loop, stmt,
+   reliable && !may_have_exited);
 
- if (reliable)
+ if (reliable && !may_have_exited)
 {
   infer_loop_bounds_from_signedness (loop, stmt);
   infer_loop_bounds_from_pointer_arith (loop, stmt);
 }
}
-
 }
 }
 
@@ -4832,7 +4855,7 @@ estimate_numbers_of_iterations (class loop *loop)
  diagnose those loops with -Waggressive-loop-optimizations.  */
   number_of_latch_executions (loop);
 
-  basic_block *body = get_loop_body (loop);
+  basic_block *body = get_loop_body_in_rpo (cfun, loop);
   auto_vec exits = get_loop_exit_edges (loop, body);
   likely_exit = single_likely_exit (loop, exits);
   FOR_EACH_VEC_ELT (exits, i, ex)
-- 
2.35.3

Re: [PATCH] Add extra copy of the ifcombine pass after pre [PR102793]

2024-04-05 Thread Manolis Tsamis

On Fri, Apr 5, 2024 at 3:43 PM Richard Biener
 wrote:
>
> On Fri, Apr 5, 2024 at 2:28 PM Manolis Tsamis  wrote:
> >
> > If we consider code like:
> >
> > if (bar1 == x)
> >   return foo();
> > if (bar2 != y)
> >   return foo();
> > return 0;
> >
> > We would like the ifcombine pass to convert this to:
> >
> > if (bar1 == x || bar2 != y)
> >   return foo();
> > return 0;
> >
> > The ifcombine pass can handle this transformation but it is ran very early 
> > and
> > it misses the opportunity because there are two seperate blocks for foo().
> > The pre pass is good at removing duplicate code and blocks and due to that
> > running ifcombine again after it can increase the number of successful
> > conversions.
> >
> > PR 102793
> >
> > gcc/ChangeLog:
> >
> > * common.opt: -ftree-ifcombine option, enabled by default.
> > * doc/invoke.texi: Document.
> > * passes.def: Re-run ssa-ifcombine after pre.
> > * tree-ssa-ifcombine.cc: Make ifcombine cloneable. Add gate 
> > function.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/20030922-2.c: Change flag to -fno-tree-ifcombine.
> > * gcc.dg/uninit-pred-6_c.c: Remove inconsistent check.
> > * gcc.target/aarch64/pr102793.c: New test.
> >
> > Signed-off-by: Manolis Tsamis 
> > ---
> >
> >  gcc/common.opt  |  4 +++
> >  gcc/doc/invoke.texi |  5 
> >  gcc/passes.def  |  1 +
> >  gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c  |  2 +-
> >  gcc/testsuite/gcc.dg/uninit-pred-6_c.c  |  4 ---
> >  gcc/testsuite/gcc.target/aarch64/pr102793.c | 30 +
> >  gcc/tree-ssa-ifcombine.cc   |  5 
> >  7 files changed, 46 insertions(+), 5 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr102793.c
> >
> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index ad348844775..e943202bcf1 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -3163,6 +3163,10 @@ ftree-phiprop
> >  Common Var(flag_tree_phiprop) Init(1) Optimization
> >  Enable hoisting loads from conditional pointers.
> >
> > +ftree-ifcombine
>
> Please don't add further -ftree-X flags, 'tree' means nothing
> to users.  -fif-combine would be better.
>
Ok, thanks for pointing out, will do.

> > +Common Var(flag_tree_ifcombine) Init(1) Optimization
> > +Merge some conditional branches to simplify control flow.
> > +
> >  ftree-pre
> >  Common Var(flag_tree_pre) Optimization
> >  Enable SSA-PRE optimization on trees.
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index e2edf7a6c13..8d2ff6b4512 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -13454,6 +13454,11 @@ This flag is enabled by default at @option{-O1} 
> > and higher.
> >  Perform hoisting of loads from conditional pointers on trees.  This
> >  pass is enabled by default at @option{-O1} and higher.
> >
> > +@opindex ftree-ifcombine
> > +@item -ftree-ifcombine
> > +Merge some conditional branches to simplify control flow.  This pass
> > +is enabled by default at @option{-O1} and higher.
> > +
> >  @opindex fhoist-adjacent-loads
> >  @item -fhoist-adjacent-loads
> >  Speculatively hoist loads from both branches of an if-then-else if the
> > diff --git a/gcc/passes.def b/gcc/passes.def
> > index 1cbbd413097..1765b476131 100644
> > --- a/gcc/passes.def
> > +++ b/gcc/passes.def
> > @@ -270,6 +270,7 @@ along with GCC; see the file COPYING3.  If not see
> >NEXT_PASS (pass_lim);
> >NEXT_PASS (pass_walloca, false);
> >NEXT_PASS (pass_pre);
> > +  NEXT_PASS (pass_tree_ifcombine);
> >NEXT_PASS (pass_sink_code, false /* unsplit edges */);
>
> Please move it here, after sinking.
>
My initial placement was there, but I saw that in some cases
(including the testcase), sinking would introduce some blocks that
prohibited the ifcombining.

> >NEXT_PASS (pass_sancov);
> >NEXT_PASS (pass_asan);
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c 
> > b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> > index 16c79da9521..66c9f481a2f 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> > @@ -1,5 +1,5 @@
> >  /* { dg-do compile } */
> > -/* { dg-options "-O1 -fdump-tree-dom2 -fdisable-tree-ifcombine" } */
> > +/* { dg-options "-O1 -fdump-tree-dom2 -fno-tree-ifcombine" } */
> >
> >  struct rtx_def;
> >  typedef struct rtx_def *rtx;
> > diff --git a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c 
> > b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> > index f60868dad23..2d8e6501a45 100644
> > --- a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> > +++ b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> > @@ -20,10 +20,6 @@ int foo (int n, int l, int m, int r)
> >if ( (n > 10) && l)
> >blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
> >
> > -  if (l)
> > -if (n > 12)
> > -  blah(v); /*

Re: [PATCH v10 1/2] Add condition coverage (MC/DC)

2024-04-05 Thread Jørgen Kvalsvik


On 05/04/2024 15:00, Rainer Orth wrote:

Hi Jørgen,


On 04/04/2024 14:10, Jan Hubicka wrote:

gcc/ChangeLog:

* builtins.cc (expand_builtin_fork_or_exec): Check
  condition_coverage_flag.
* collect2.cc (main): Add -fno-condition-coverage to OBSTACK.
* common.opt: Add new options -fcondition-coverage and
  -Wcoverage-too-many-conditions.
* doc/gcov.texi: Add --conditions documentation.
* doc/invoke.texi: Add -fcondition-coverage documentation.
* function.cc (free_after_compilation): Free cond_uids.
* function.h (struct function): Add cond_uids.
* gcc.cc: Link gcov on -fcondition-coverage.
* gcov-counter.def (GCOV_COUNTER_CONDS): New.
* gcov-dump.cc (tag_conditions): New.
* gcov-io.h (GCOV_TAG_CONDS): New.
(GCOV_TAG_CONDS_LENGTH): New.
(GCOV_TAG_CONDS_NUM): New.
* gcov.cc (class condition_info): New.
(condition_info::condition_info): New.
(condition_info::popcount): New.
(struct coverage_info): New.
(add_condition_counts): New.
(output_conditions): New.
(print_usage): Add -g, --conditions.
(process_args): Likewise.
(output_intermediate_json_line): Output conditions.
(read_graph_file): Read condition counters.
(read_count_file): Likewise.
(file_summary): Print conditions.
(accumulate_line_info): Accumulate conditions.
(output_line_details): Print conditions.
* gimplify.cc (next_cond_uid): New.
(reset_cond_uid): New.
(shortcut_cond_r): Set condition discriminator.
(tag_shortcut_cond): New.
(gimple_associate_condition_with_expr): New.
(shortcut_cond_expr): Set condition discriminator.
(gimplify_cond_expr): Likewise.
(gimplify_function_tree): Call reset_cond_uid.
* ipa-inline.cc (can_early_inline_edge_p): Check
  condition_coverage_flag.
* ipa-split.cc (pass_split_functions::gate): Likewise.
* passes.cc (finish_optimization_passes): Likewise.
* profile.cc (struct condcov): New declaration.
(cov_length): Likewise.
(cov_blocks): Likewise.
(cov_masks): Likewise.
(cov_maps): Likewise.
(cov_free): Likewise.
(instrument_decisions): New.
(read_thunk_profile): Control output to file.
(branch_prob): Call find_conditions, instrument_decisions.
(init_branch_prob): Add total_num_conds.
(end_branch_prob): Likewise.
* tree-core.h (struct tree_exp): Add condition_uid.
* tree-profile.cc (struct conds_ctx): New.
(CONDITIONS_MAX_TERMS): New.
(EDGE_CONDITION): New.
(topological_cmp): New.
(index_of): New.
(single_p): New.
(single_edge): New.
(contract_edge_up): New.
(struct outcomes): New.
(conditional_succs): New.
(condition_index): New.
(condition_uid): New.
(masking_vectors): New.
(emit_assign): New.
(emit_bitwise_op): New.
(make_top_index_visit): New.
(make_top_index): New.
(paths_between): New.
(struct condcov): New.
(cov_length): New.
(cov_blocks): New.
(cov_masks): New.
(cov_maps): New.
(cov_free): New.
(find_conditions): New.
(struct counters): New.
(find_counters): New.
(resolve_counter): New.
(resolve_counters): New.
(instrument_decisions): New.
(tree_profiling): Check condition_coverage_flag.
(pass_ipa_tree_profile::gate): Likewise.
* tree.h (SET_EXPR_UID): New.
(EXPR_COND_UID): New.

libgcc/ChangeLog:

* libgcov-merge.c (__gcov_merge_ior): New.

gcc/testsuite/ChangeLog:

* lib/gcov.exp: Add condition coverage test function.
* g++.dg/gcov/gcov-18.C: New test.
* gcc.misc-tests/gcov-19.c: New test.
* gcc.misc-tests/gcov-20.c: New test.
* gcc.misc-tests/gcov-21.c: New test.
* gcc.misc-tests/gcov-22.c: New test.
* gcc.misc-tests/gcov-23.c: New test.
---
   gcc/builtins.cc|2 +-
   gcc/collect2.cc|7 +-
   gcc/common.opt |9 +
   gcc/doc/gcov.texi  |   38 +
   gcc/doc/invoke.texi|   21 +
   gcc/function.cc|1 +
   gcc/function.h |4 +
   gcc/gcc.cc |4 +-
   gcc/gcov-counter.def   |3 +
   gcc/gcov-dump.cc   |   24 +
   gcc/gcov-io.h  |3 +
   gcc/gcov.cc|  209 ++-
   gcc/gimplify.cc|  123 +-
   gcc/ipa-inline.cc  |2 +-
   gcc/ipa-split.cc   |2 +-
   gcc/passes.cc  |3 +-
   gcc/profile

Re: [PATCH 3/3] tree-optimization/114052 - niter analysis from undefined behavior

2024-04-05 Thread Richard Biener

On Fri, 5 Apr 2024, Richard Biener wrote:

> The following makes sure to only compute upper bounds for the number
> of iterations of loops from undefined behavior invoked by stmts when
> those are executed in each loop iteration, in particular also in the
> last one.  The latter cannot be guaranteed if there's possible
> infinite loops or calls with side-effects possibly executed before
> the stmt.  Rather than adjusting the bound by one or using the bound as
> estimate the following for now gives up.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

It FAILs

FAIL: gcc.dg/pr53265.c  at line 91 (test for warnings, line 90)
FAIL: gcc.dg/pr53265.c  at line 92 (test for warnings, line 90)

for diagnostic purposes we'd need to treat the call as not terminating

FAIL: gcc.dg/tree-ssa/cunroll-10.c scan-tree-dump-times cunroll 
"Forced statement unreachable" 2
FAIL: gcc.dg/tree-ssa/cunroll-11.c scan-tree-dump cunroll "Loop 1 
iterates at most 3 times"
FAIL: gcc.dg/tree-ssa/cunroll-9.c scan-tree-dump-times cunrolli 
"Removed pointless exit:" 1
FAIL: gcc.dg/tree-ssa/loop-38.c scan-tree-dump cunrolli "Loop 1 
iterates at most 11 times"
FAIL: gcc.dg/tree-ssa/pr68234.c scan-tree-dump vrp2 ">> 6"
FAIL: c-c++-common/ubsan/unreachable-3.c   -O0   scan-tree-dump 
optimized "__builtin___ubsan_handle_builtin_unreachable"
...
FAIL: c-c++-common/ubsan/unreachable-3.c   -Os   scan-tree-dump 
optimized "__builtin___ubsan_handle_builtin_unreachable"


>   PR tree-optimization/114052
>   * tree-ssa-loop-niter.cc (infer_loop_bounds_from_undefined):
>   When we enter a possibly infinite loop or when we come across
>   a call with side-effects record the last iteration might not
>   execute all stmts.  Consider bounds as unreliable in that case.
> 
>   * gcc.dg/tree-ssa/pr114052-1.c: New testcase.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/pr114052-1.c | 16 ++
>  gcc/tree-ssa-loop-niter.cc | 35 ++
>  2 files changed, 45 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr114052-1.c
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr114052-1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr114052-1.c
> new file mode 100644
> index 000..54a2181e67e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr114052-1.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +int foo(void)
> +{
> +  int counter = 0;
> +  while (1)
> +{
> +  if (counter >= 2)
> + continue;
> +  __builtin_printf("%i\n", counter++);
> +}
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "unreachable" "optimized" } } */
> diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
> index 0a77c1bb544..52a39eb3500 100644
> --- a/gcc/tree-ssa-loop-niter.cc
> +++ b/gcc/tree-ssa-loop-niter.cc
> @@ -4397,7 +4397,7 @@ infer_loop_bounds_from_undefined (class loop *loop, 
> basic_block *bbs)
>unsigned i;
>gimple_stmt_iterator bsi;
>basic_block bb;
> -  bool reliable;
> +  bool may_have_exited = false;
>  
>for (i = 0; i < loop->num_nodes; i++)
>  {
> @@ -4407,21 +4407,44 @@ infer_loop_bounds_from_undefined (class loop *loop, 
> basic_block *bbs)
>use the operations in it to infer reliable upper bound on the
># of iterations of the loop.  However, we can use it as a guess. 
>Reliable guesses come only from array bounds.  */
> -  reliable = dominated_by_p (CDI_DOMINATORS, loop->latch, bb);
> +  bool reliable = dominated_by_p (CDI_DOMINATORS, loop->latch, bb);
> +
> +  /* A possibly infinite inner loop makes further blocks not always
> +  executed.  Key on the entry of such a loop as that avoids RPO
> +  issues with where the exits of that loop are.  Any block
> +  inside an irreducible sub-region is problematic as well.
> +  ???  Note this technically only makes the last iteration
> +  possibly partially executed.  */
> +  if (!may_have_exited
> +   && bb != loop->header
> +   && (!loops_state_satisfies_p (LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS)
> +   || bb->flags & BB_IRREDUCIBLE_LOOP
> +   || (bb->loop_father->header == bb
> +   && !finite_loop_p (bb->loop_father
> + may_have_exited = true;
>  
>for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
>   {
> gimple *stmt = gsi_stmt (bsi);
>  
> -   infer_loop_bounds_from_array (loop, stmt, reliable);
> +   /* When there's a call that might not return the last iteration
> +  is possibly partial.  This matches what we check in invariant
> +  motion.
> +  ???  For the call argument evaluation it would be still OK.  */
> +   if (!may_have_exited
> +   && is_gimple_call (stmt)
> +   && gimple_has_side_effects (stmt))
> + may_have_exited = true;
> +
> +   infer_loo

Re: [PATCH] middle-end/114599 - fix bitmap allocation for check_ifunc_callee_symtab_nodes

2024-04-05 Thread H.J. Lu

On Fri, Apr 5, 2024 at 1:21 AM Richard Biener  wrote:
>
> There's no default bitmap obstack during global CTORs, so allocate the
> bitmap locally.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>
> Richard.
>
> PR middle-end/114599
> * symtab.cc (ifunc_ref_map): Do not use auto_bitmap.
> (is_caller_ifunc_resolver): Optimize bitmap_bit_p/bitmap_set_bit
> pair.
> (symtab_node::check_ifunc_callee_symtab_nodes): Properly
> allocate ifunc_ref_map here.
> ---
>  gcc/symtab.cc | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/symtab.cc b/gcc/symtab.cc
> index 3256133891d..3b018ab3ea2 100644
> --- a/gcc/symtab.cc
> +++ b/gcc/symtab.cc
> @@ -1383,7 +1383,7 @@ check_ifunc_resolver (cgraph_node *node, void *data)
>return false;
>  }
>
> -static auto_bitmap ifunc_ref_map;
> +static bitmap ifunc_ref_map;
>
>  /* Return true if any caller of NODE is an ifunc resolver.  */
>
> @@ -1404,9 +1404,8 @@ is_caller_ifunc_resolver (cgraph_node *node)
>
>/* Skip if it has been visited.  */
>unsigned int uid = e->caller->get_uid ();
> -  if (bitmap_bit_p (ifunc_ref_map, uid))
> +  if (!bitmap_set_bit (ifunc_ref_map, uid))
> continue;
> -  bitmap_set_bit (ifunc_ref_map, uid);
>
>if (is_caller_ifunc_resolver (e->caller))
> {
> @@ -1437,6 +1436,9 @@ symtab_node::check_ifunc_callee_symtab_nodes (void)
>  {
>symtab_node *node;
>
> +  bitmap_obstack_initialize (NULL);
> +  ifunc_ref_map = BITMAP_ALLOC (NULL);
> +
>FOR_EACH_SYMBOL (node)
>  {
>cgraph_node *cnode = dyn_cast  (node);
> @@ -1455,7 +1457,8 @@ symtab_node::check_ifunc_callee_symtab_nodes (void)
> cnode->called_by_ifunc_resolver = true;
>  }
>
> -  bitmap_clear (ifunc_ref_map);
> +  BITMAP_FREE (ifunc_ref_map);
> +  bitmap_obstack_release (NULL);
>  }
>
>  /* Verify symbol table for internal consistency.  */
> --
> 2.35.3

The bug isn't fixed:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114599#c5

-- 
H.J.

[pushed] aarch64: Fix bogus cnot optimisation [PR114603]

2024-04-05 Thread Richard Sandiford

aarch64-sve.md had a pattern that combined:

cmpeq   pb.T, pa/z, zc.T, #0
mov zd.T, pb/z, #1

into:

cnotzd.T, pa/m, zc.T

But this is only valid if pa.T is a ptrue.  In other cases, the
original would set inactive elements of zd.T to 0, whereas the
combined form would copy elements from zc.T.

This isn't a regression on a known testcase.  However, it's a nasty
wrong code bug that could conceivably trigger for autovec code (although
I've not been able to construct a reproducer so far).  That fix is also
quite localised to the buggy operation.  I'd therefore prefer to push
the fix now rather than wait for GCC 15.

Tested on aarch64-linux-gnu & pushed.  I'll backport to branches if
there is no fallout.

Richard

gcc/
PR target/114603
* config/aarch64/aarch64-sve.md (@aarch64_pred_cnot): Replace
with...
(@aarch64_ptrue_cnot): ...this, requiring operand 1 to be
a ptrue.
(*cnot): Require operand 1 to be a ptrue.
* config/aarch64/aarch64-sve-builtins-base.cc (svcnot_impl::expand):
Use aarch64_ptrue_cnot for _x operations that are predicated
with a ptrue.  Represent other _x operations as fully-defined _m
operations.

gcc/testsuite/
PR target/114603
* gcc.target/aarch64/sve/acle/general/cnot_1.c: New test.
---
 .../aarch64/aarch64-sve-builtins-base.cc  | 25 ---
 gcc/config/aarch64/aarch64-sve.md | 22 
 .../aarch64/sve/acle/general/cnot_1.c | 23 +
 3 files changed, 50 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general/cnot_1.c

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 257ca5bf6ad..5be2315a3c6 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -517,15 +517,22 @@ public:
   expand (function_expander &e) const override
   {
 machine_mode mode = e.vector_mode (0);
-if (e.pred == PRED_x)
-  {
-   /* The pattern for CNOT includes an UNSPEC_PRED_Z, so needs
-  a ptrue hint.  */
-   e.add_ptrue_hint (0, e.gp_mode (0));
-   return e.use_pred_x_insn (code_for_aarch64_pred_cnot (mode));
-  }
-
-return e.use_cond_insn (code_for_cond_cnot (mode), 0);
+machine_mode pred_mode = e.gp_mode (0);
+/* The underlying _x pattern is effectively:
+
+dst = src == 0 ? 1 : 0
+
+   rather than an UNSPEC_PRED_X.  Using this form allows autovec
+   constructs to be matched by combine, but it means that the
+   predicate on the src == 0 comparison must be all-true.
+
+   For simplicity, represent other _x operations as fully-defined _m
+   operations rather than using a separate bespoke pattern.  */
+if (e.pred == PRED_x
+   && gen_lowpart (pred_mode, e.args[0]) == CONSTM1_RTX (pred_mode))
+  return e.use_pred_x_insn (code_for_aarch64_ptrue_cnot (mode));
+return e.use_cond_insn (code_for_cond_cnot (mode),
+   e.pred == PRED_x ? 1 : 0);
   }
 };
 
diff --git a/gcc/config/aarch64/aarch64-sve.md 
b/gcc/config/aarch64/aarch64-sve.md
index eca8623e587..0434358122d 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -3363,24 +3363,24 @@ (define_insn_and_split 
"trunc2"
 ;; - CNOT
 ;; -
 
-;; Predicated logical inverse.
-(define_expand "@aarch64_pred_cnot"
+;; Logical inverse, predicated with a ptrue.
+(define_expand "@aarch64_ptrue_cnot"
   [(set (match_operand:SVE_FULL_I 0 "register_operand")
(unspec:SVE_FULL_I
  [(unspec:
 [(match_operand: 1 "register_operand")
- (match_operand:SI 2 "aarch64_sve_ptrue_flag")
+ (const_int SVE_KNOWN_PTRUE)
  (eq:
-   (match_operand:SVE_FULL_I 3 "register_operand")
-   (match_dup 4))]
+   (match_operand:SVE_FULL_I 2 "register_operand")
+   (match_dup 3))]
 UNSPEC_PRED_Z)
-  (match_dup 5)
-  (match_dup 4)]
+  (match_dup 4)
+  (match_dup 3)]
  UNSPEC_SEL))]
   "TARGET_SVE"
   {
-operands[4] = CONST0_RTX (mode);
-operands[5] = CONST1_RTX (mode);
+operands[3] = CONST0_RTX (mode);
+operands[4] = CONST1_RTX (mode);
   }
 )
 
@@ -3389,7 +3389,7 @@ (define_insn "*cnot"
(unspec:SVE_I
  [(unspec:
 [(match_operand: 1 "register_operand")
- (match_operand:SI 5 "aarch64_sve_ptrue_flag")
+ (const_int SVE_KNOWN_PTRUE)
  (eq:
(match_operand:SVE_I 2 "register_operand")
(match_operand:SVE_I 3 "aarch64_simd_imm_zero"))]
@@ -11001,4 +11001,4 @@ (define_insn "@aarch64_sve_set_neonq_"
   GET_MODE (operands[2]));
 return "sel\t

Re: [PATCH] middle-end/114599 - fix bitmap allocation for check_ifunc_callee_symtab_nodes

2024-04-05 Thread Richard Biener




> Am 05.04.2024 um 15:46 schrieb H.J. Lu :
> 
> On Fri, Apr 5, 2024 at 1:21 AM Richard Biener  wrote:
>> 
>> There's no default bitmap obstack during global CTORs, so allocate the
>> bitmap locally.
>> 
>> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>> 
>> Richard.
>> 
>>PR middle-end/114599
>>* symtab.cc (ifunc_ref_map): Do not use auto_bitmap.
>>(is_caller_ifunc_resolver): Optimize bitmap_bit_p/bitmap_set_bit
>>pair.
>>(symtab_node::check_ifunc_callee_symtab_nodes): Properly
>>allocate ifunc_ref_map here.
>> ---
>> gcc/symtab.cc | 11 +++
>> 1 file changed, 7 insertions(+), 4 deletions(-)
>> 
>> diff --git a/gcc/symtab.cc b/gcc/symtab.cc
>> index 3256133891d..3b018ab3ea2 100644
>> --- a/gcc/symtab.cc
>> +++ b/gcc/symtab.cc
>> @@ -1383,7 +1383,7 @@ check_ifunc_resolver (cgraph_node *node, void *data)
>>   return false;
>> }
>> 
>> -static auto_bitmap ifunc_ref_map;
>> +static bitmap ifunc_ref_map;
>> 
>> /* Return true if any caller of NODE is an ifunc resolver.  */
>> 
>> @@ -1404,9 +1404,8 @@ is_caller_ifunc_resolver (cgraph_node *node)
>> 
>>   /* Skip if it has been visited.  */
>>   unsigned int uid = e->caller->get_uid ();
>> -  if (bitmap_bit_p (ifunc_ref_map, uid))
>> +  if (!bitmap_set_bit (ifunc_ref_map, uid))
>>continue;
>> -  bitmap_set_bit (ifunc_ref_map, uid);
>> 
>>   if (is_caller_ifunc_resolver (e->caller))
>>{
>> @@ -1437,6 +1436,9 @@ symtab_node::check_ifunc_callee_symtab_nodes (void)
>> {
>>   symtab_node *node;
>> 
>> +  bitmap_obstack_initialize (NULL);
>> +  ifunc_ref_map = BITMAP_ALLOC (NULL);
>> +
>>   FOR_EACH_SYMBOL (node)
>> {
>>   cgraph_node *cnode = dyn_cast  (node);
>> @@ -1455,7 +1457,8 @@ symtab_node::check_ifunc_callee_symtab_nodes (void)
>>cnode->called_by_ifunc_resolver = true;
>> }
>> 
>> -  bitmap_clear (ifunc_ref_map);
>> +  BITMAP_FREE (ifunc_ref_map);
>> +  bitmap_obstack_release (NULL);
>> }
>> 
>> /* Verify symbol table for internal consistency.  */
>> --
>> 2.35.3
> 
> The bug isn't fixed:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114599#c5

Ah, I reproduced with -coverage and that is fixed now.  The still existing bug 
must be something unrelated.

Richard 

> 
> --
> H.J.

Re: [PATCH v10 1/2] Add condition coverage (MC/DC)

2024-04-05 Thread H.J. Lu

On Thu, Apr 4, 2024 at 5:54 AM Jørgen Kvalsvik  wrote:
>
> On 04/04/2024 14:10, Jan Hubicka wrote:
> >> gcc/ChangeLog:
> >>
> >>  * builtins.cc (expand_builtin_fork_or_exec): Check
> >>condition_coverage_flag.
> >>  * collect2.cc (main): Add -fno-condition-coverage to OBSTACK.
> >>  * common.opt: Add new options -fcondition-coverage and
> >>-Wcoverage-too-many-conditions.
> >>  * doc/gcov.texi: Add --conditions documentation.
> >>  * doc/invoke.texi: Add -fcondition-coverage documentation.
> >>  * function.cc (free_after_compilation): Free cond_uids.
> >>  * function.h (struct function): Add cond_uids.
> >>  * gcc.cc: Link gcov on -fcondition-coverage.
> >>  * gcov-counter.def (GCOV_COUNTER_CONDS): New.
> >>  * gcov-dump.cc (tag_conditions): New.
> >>  * gcov-io.h (GCOV_TAG_CONDS): New.
> >>  (GCOV_TAG_CONDS_LENGTH): New.
> >>  (GCOV_TAG_CONDS_NUM): New.
> >>  * gcov.cc (class condition_info): New.
> >>  (condition_info::condition_info): New.
> >>  (condition_info::popcount): New.
> >>  (struct coverage_info): New.
> >>  (add_condition_counts): New.
> >>  (output_conditions): New.
> >>  (print_usage): Add -g, --conditions.
> >>  (process_args): Likewise.
> >>  (output_intermediate_json_line): Output conditions.
> >>  (read_graph_file): Read condition counters.
> >>  (read_count_file): Likewise.
> >>  (file_summary): Print conditions.
> >>  (accumulate_line_info): Accumulate conditions.
> >>  (output_line_details): Print conditions.
> >>  * gimplify.cc (next_cond_uid): New.
> >>  (reset_cond_uid): New.
> >>  (shortcut_cond_r): Set condition discriminator.
> >>  (tag_shortcut_cond): New.
> >>  (gimple_associate_condition_with_expr): New.
> >>  (shortcut_cond_expr): Set condition discriminator.
> >>  (gimplify_cond_expr): Likewise.
> >>  (gimplify_function_tree): Call reset_cond_uid.
> >>  * ipa-inline.cc (can_early_inline_edge_p): Check
> >>condition_coverage_flag.
> >>  * ipa-split.cc (pass_split_functions::gate): Likewise.
> >>  * passes.cc (finish_optimization_passes): Likewise.
> >>  * profile.cc (struct condcov): New declaration.
> >>  (cov_length): Likewise.
> >>  (cov_blocks): Likewise.
> >>  (cov_masks): Likewise.
> >>  (cov_maps): Likewise.
> >>  (cov_free): Likewise.
> >>  (instrument_decisions): New.
> >>  (read_thunk_profile): Control output to file.
> >>  (branch_prob): Call find_conditions, instrument_decisions.
> >>  (init_branch_prob): Add total_num_conds.
> >>  (end_branch_prob): Likewise.
> >>  * tree-core.h (struct tree_exp): Add condition_uid.
> >>  * tree-profile.cc (struct conds_ctx): New.
> >>  (CONDITIONS_MAX_TERMS): New.
> >>  (EDGE_CONDITION): New.
> >>  (topological_cmp): New.
> >>  (index_of): New.
> >>  (single_p): New.
> >>  (single_edge): New.
> >>  (contract_edge_up): New.
> >>  (struct outcomes): New.
> >>  (conditional_succs): New.
> >>  (condition_index): New.
> >>  (condition_uid): New.
> >>  (masking_vectors): New.
> >>  (emit_assign): New.
> >>  (emit_bitwise_op): New.
> >>  (make_top_index_visit): New.
> >>  (make_top_index): New.
> >>  (paths_between): New.
> >>  (struct condcov): New.
> >>  (cov_length): New.
> >>  (cov_blocks): New.
> >>  (cov_masks): New.
> >>  (cov_maps): New.
> >>  (cov_free): New.
> >>  (find_conditions): New.
> >>  (struct counters): New.
> >>  (find_counters): New.
> >>  (resolve_counter): New.
> >>  (resolve_counters): New.
> >>  (instrument_decisions): New.
> >>  (tree_profiling): Check condition_coverage_flag.
> >>  (pass_ipa_tree_profile::gate): Likewise.
> >>  * tree.h (SET_EXPR_UID): New.
> >>  (EXPR_COND_UID): New.
> >>
> >> libgcc/ChangeLog:
> >>
> >>  * libgcov-merge.c (__gcov_merge_ior): New.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>  * lib/gcov.exp: Add condition coverage test function.
> >>  * g++.dg/gcov/gcov-18.C: New test.
> >>  * gcc.misc-tests/gcov-19.c: New test.
> >>  * gcc.misc-tests/gcov-20.c: New test.
> >>  * gcc.misc-tests/gcov-21.c: New test.
> >>  * gcc.misc-tests/gcov-22.c: New test.
> >>  * gcc.misc-tests/gcov-23.c: New test.
> >> ---
> >>   gcc/builtins.cc|2 +-
> >>   gcc/collect2.cc|7 +-
> >>   gcc/common.opt |9 +
> >>   gcc/doc/gcov.texi  |   38 +
> >>   gcc/doc/invoke.texi|   21 +
> >>   gcc/function.cc|1 +
> >>   gcc/function.h |4 +
> >>   gcc/gcc.cc |4 +-
> >>   gcc/gcov-counter.def   |3 +
> >>   gcc/gcov-dump.cc   |   24 +
> >>   gcc/gcov-io.h  |3

Re: [PATCH] middle-end/114599 - fix bitmap allocation for check_ifunc_callee_symtab_nodes

2024-04-05 Thread H.J. Lu

On Fri, Apr 5, 2024 at 6:52 AM Richard Biener  wrote:
>
>
>
> > Am 05.04.2024 um 15:46 schrieb H.J. Lu :
> >
> > On Fri, Apr 5, 2024 at 1:21 AM Richard Biener  wrote:
> >>
> >> There's no default bitmap obstack during global CTORs, so allocate the
> >> bitmap locally.
> >>
> >> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> >>
> >> Richard.
> >>
> >>PR middle-end/114599
> >>* symtab.cc (ifunc_ref_map): Do not use auto_bitmap.
> >>(is_caller_ifunc_resolver): Optimize bitmap_bit_p/bitmap_set_bit
> >>pair.
> >>(symtab_node::check_ifunc_callee_symtab_nodes): Properly
> >>allocate ifunc_ref_map here.
> >> ---
> >> gcc/symtab.cc | 11 +++
> >> 1 file changed, 7 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/gcc/symtab.cc b/gcc/symtab.cc
> >> index 3256133891d..3b018ab3ea2 100644
> >> --- a/gcc/symtab.cc
> >> +++ b/gcc/symtab.cc
> >> @@ -1383,7 +1383,7 @@ check_ifunc_resolver (cgraph_node *node, void *data)
> >>   return false;
> >> }
> >>
> >> -static auto_bitmap ifunc_ref_map;
> >> +static bitmap ifunc_ref_map;
> >>
> >> /* Return true if any caller of NODE is an ifunc resolver.  */
> >>
> >> @@ -1404,9 +1404,8 @@ is_caller_ifunc_resolver (cgraph_node *node)
> >>
> >>   /* Skip if it has been visited.  */
> >>   unsigned int uid = e->caller->get_uid ();
> >> -  if (bitmap_bit_p (ifunc_ref_map, uid))
> >> +  if (!bitmap_set_bit (ifunc_ref_map, uid))
> >>continue;
> >> -  bitmap_set_bit (ifunc_ref_map, uid);
> >>
> >>   if (is_caller_ifunc_resolver (e->caller))
> >>{
> >> @@ -1437,6 +1436,9 @@ symtab_node::check_ifunc_callee_symtab_nodes (void)
> >> {
> >>   symtab_node *node;
> >>
> >> +  bitmap_obstack_initialize (NULL);
> >> +  ifunc_ref_map = BITMAP_ALLOC (NULL);
> >> +
> >>   FOR_EACH_SYMBOL (node)
> >> {
> >>   cgraph_node *cnode = dyn_cast  (node);
> >> @@ -1455,7 +1457,8 @@ symtab_node::check_ifunc_callee_symtab_nodes (void)
> >>cnode->called_by_ifunc_resolver = true;
> >> }
> >>
> >> -  bitmap_clear (ifunc_ref_map);
> >> +  BITMAP_FREE (ifunc_ref_map);
> >> +  bitmap_obstack_release (NULL);
> >> }
> >>
> >> /* Verify symbol table for internal consistency.  */
> >> --
> >> 2.35.3
> >
> > The bug isn't fixed:
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114599#c5
>
> Ah, I reproduced with -coverage and that is fixed now.  The still existing 
> bug must be something unrelated.
>

It is a bug in the -fcondition-coverage patch.

-- 
H.J.

Re: [PATCH] middle-end/114599 - fix bitmap allocation for check_ifunc_callee_symtab_nodes

2024-04-05 Thread Jørgen Kvalsvik


On 05/04/2024 16:05, H.J. Lu wrote:

On Fri, Apr 5, 2024 at 6:52 AM Richard Biener  wrote:





Am 05.04.2024 um 15:46 schrieb H.J. Lu :

On Fri, Apr 5, 2024 at 1:21 AM Richard Biener  wrote:


There's no default bitmap obstack during global CTORs, so allocate the
bitmap locally.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

PR middle-end/114599
* symtab.cc (ifunc_ref_map): Do not use auto_bitmap.
(is_caller_ifunc_resolver): Optimize bitmap_bit_p/bitmap_set_bit
pair.
(symtab_node::check_ifunc_callee_symtab_nodes): Properly
allocate ifunc_ref_map here.
---
gcc/symtab.cc | 11 +++
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/gcc/symtab.cc b/gcc/symtab.cc
index 3256133891d..3b018ab3ea2 100644
--- a/gcc/symtab.cc
+++ b/gcc/symtab.cc
@@ -1383,7 +1383,7 @@ check_ifunc_resolver (cgraph_node *node, void *data)
   return false;
}

-static auto_bitmap ifunc_ref_map;
+static bitmap ifunc_ref_map;

/* Return true if any caller of NODE is an ifunc resolver.  */

@@ -1404,9 +1404,8 @@ is_caller_ifunc_resolver (cgraph_node *node)

   /* Skip if it has been visited.  */
   unsigned int uid = e->caller->get_uid ();
-  if (bitmap_bit_p (ifunc_ref_map, uid))
+  if (!bitmap_set_bit (ifunc_ref_map, uid))
continue;
-  bitmap_set_bit (ifunc_ref_map, uid);

   if (is_caller_ifunc_resolver (e->caller))
{
@@ -1437,6 +1436,9 @@ symtab_node::check_ifunc_callee_symtab_nodes (void)
{
   symtab_node *node;

+  bitmap_obstack_initialize (NULL);
+  ifunc_ref_map = BITMAP_ALLOC (NULL);
+
   FOR_EACH_SYMBOL (node)
 {
   cgraph_node *cnode = dyn_cast  (node);
@@ -1455,7 +1457,8 @@ symtab_node::check_ifunc_callee_symtab_nodes (void)
cnode->called_by_ifunc_resolver = true;
 }

-  bitmap_clear (ifunc_ref_map);
+  BITMAP_FREE (ifunc_ref_map);
+  bitmap_obstack_release (NULL);
}

/* Verify symbol table for internal consistency.  */
--
2.35.3


The bug isn't fixed:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114599#c5


Ah, I reproduced with -coverage and that is fixed now.  The still existing bug 
must be something unrelated.



It is a bug in the -fcondition-coverage patch.



Thanks for the report, I'll have a look right away.

Re: [PATCH] middle-end/114579 - speed up add_scope_conflicts

2024-04-05 Thread Jeff Law





On 4/4/24 6:41 AM, Richard Biener wrote:

The following speeds up stack variable conflict detection by recognizing
that the all-to-all conflict recording is only necessary for CFG merges
as it's the unioning of the live variable sets that doesn't come with
explicit mentions we record conflicts for.

If we employ this optimization we have to make sure to perform the
all-to-all conflict recording for all CFG merges even those into
empty blocks where we might previously have skipped this.

I have reworded the comment before the all-to-all conflict recording
since it seemed to be confusing and missing the point - but maybe I
am also missing something here.

Nevertheless for the testcase in the PR the compile-time spend in
add_scope_conflicts at -O1 drops from previously 67s (39%) to 10s (9%).

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?

Thanks,
Richard.

PR middle-end/114579
* cfgexpand.cc (add_scope_conflicts_1): Record all-to-all
conflicts only when there's a CFG merge but for all CFG merges.

OK.  Your call on whether or not to include in gcc-14 or wait for gcc-15.

jeff

[COMMITTED] Regenerate common.opt.urls

2024-04-05 Thread Mark Wielaard

The new support for gcov modified condition/decision coverage
introduced two new flags for gcc, -Wcoverage-too-many-conditions and
-fcondition-coverage. But didn't regenerate the gcc/common.opt.urls.

Fixes: 08a52331803f ("Add condition coverage (MC/DC)")

gcc/ChangeLog:

* common.opt.urls: Regenerate.
---

Pushed as obvious.

 gcc/common.opt.urls | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/common.opt.urls b/gcc/common.opt.urls
index db4354989fcc..f71ed80a34b4 100644
--- a/gcc/common.opt.urls
+++ b/gcc/common.opt.urls
@@ -280,6 +280,9 @@ UrlSuffix(gcc/Warning-Options.html#index-Wcoverage-mismatch)
 Wcoverage-invalid-line-number
 UrlSuffix(gcc/Warning-Options.html#index-Wcoverage-invalid-line-number)
 
+Wcoverage-too-many-conditions
+UrlSuffix(gcc/Warning-Options.html#index-Wcoverage-too-many-conditions)
+
 Wmissing-profile
 UrlSuffix(gcc/Warning-Options.html#index-Wmissing-profile)
 
@@ -1057,6 +1060,9 @@ 
UrlSuffix(gcc/Instrumentation-Options.html#index-fprofile-abs-path)
 ;   duplicate: 'gcc/Instrumentation-Options.html#index-fprofile-arcs'
 ;   duplicate: 'gcc/Other-Builtins.html#index-fprofile-arcs-1'
 
+fcondition-coverage
+UrlSuffix(gcc/Instrumentation-Options.html#index-fcondition-coverage)
+
 fprofile-dir=
 UrlSuffix(gcc/Instrumentation-Options.html#index-fprofile-dir)
 
-- 
2.44.0

[PATCH] x86: Use explicit shift count in double-precision shifts

2024-04-05 Thread H.J. Lu

Don't use implicit shift count in double-precision shifts in AT&T syntax
since they aren't in Intel SDM.  Keep the 's' modifier for backward
compatibility with inline asm statements.

PR target/114590
* config/i386/i386.md (x86_64_shld): Use explicit shift count in
AT&T syntax.
(x86_64_shld_ndd): Likewise.
(x86_shld): Likewise.
(x86_shld_ndd): Likewise.
(x86_64_shrd): Likewise.
(x86_64_shrd_ndd): Likewise.
(x86_shrd): Likewise.
(x86_shrd_ndd): Likewise.
---
 gcc/config/i386/i386.md | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 6ac401154e4..bb2c72f3473 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -14503,7 +14503,7 @@ (define_insn "x86_64_shld"
  (and:QI (match_dup 2) (const_int 63 0)))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT"
-  "shld{q}\t{%s2%1, %0|%0, %1, %2}"
+  "shld{q}\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ishift")
(set_attr "prefix_0f" "1")
(set_attr "mode" "DI")
@@ -14524,7 +14524,7 @@ (define_insn "x86_64_shld_ndd"
  (and:QI (match_dup 3) (const_int 63 0)))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_APX_NDD"
-  "shld{q}\t{%s3%2, %1, %0|%0, %1, %2, %3}"
+  "shld{q}\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "ishift")
(set_attr "mode" "DI")])
 
@@ -14681,7 +14681,7 @@ (define_insn "x86_shld"
  (and:QI (match_dup 2) (const_int 31 0)))
(clobber (reg:CC FLAGS_REG))]
   ""
-  "shld{l}\t{%s2%1, %0|%0, %1, %2}"
+  "shld{l}\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ishift")
(set_attr "prefix_0f" "1")
(set_attr "mode" "SI")
@@ -14703,7 +14703,7 @@ (define_insn "x86_shld_ndd"
  (and:QI (match_dup 3) (const_int 31 0)))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_APX_NDD"
-  "shld{l}\t{%s3%2, %1, %0|%0, %1, %2, %3}"
+  "shld{l}\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "ishift")
(set_attr "mode" "SI")])
 
@@ -15792,7 +15792,7 @@ (define_insn "x86_64_shrd"
  (and:QI (match_dup 2) (const_int 63 0)))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT"
-  "shrd{q}\t{%s2%1, %0|%0, %1, %2}"
+  "shrd{q}\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ishift")
(set_attr "prefix_0f" "1")
(set_attr "mode" "DI")
@@ -15813,7 +15813,7 @@ (define_insn "x86_64_shrd_ndd"
  (and:QI (match_dup 3) (const_int 63 0)))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_APX_NDD"
-  "shrd{q}\t{%s3%2, %1, %0|%0, %1, %2, %3}"
+  "shrd{q}\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "ishift")
(set_attr "mode" "DI")])
 
@@ -15971,7 +15971,7 @@ (define_insn "x86_shrd"
  (and:QI (match_dup 2) (const_int 31 0)))
(clobber (reg:CC FLAGS_REG))]
   ""
-  "shrd{l}\t{%s2%1, %0|%0, %1, %2}"
+  "shrd{l}\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ishift")
(set_attr "prefix_0f" "1")
(set_attr "mode" "SI")
@@ -15993,7 +15993,7 @@ (define_insn "x86_shrd_ndd"
  (and:QI (match_dup 3) (const_int 31 0)))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_APX_NDD"
-  "shrd{l}\t{%s3%2, %1, %0|%0, %1, %2, %3}"
+  "shrd{l}\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "ishift")
(set_attr "mode" "SI")])
 
-- 
2.44.0

Re: [PATCH] middle-end/114599 - fix bitmap allocation for check_ifunc_callee_symtab_nodes

2024-04-05 Thread Jørgen Kvalsvik


On 05/04/2024 16:05, H.J. Lu wrote:

On Fri, Apr 5, 2024 at 6:52 AM Richard Biener  wrote:





Am 05.04.2024 um 15:46 schrieb H.J. Lu :

On Fri, Apr 5, 2024 at 1:21 AM Richard Biener  wrote:


There's no default bitmap obstack during global CTORs, so allocate the
bitmap locally.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

PR middle-end/114599
* symtab.cc (ifunc_ref_map): Do not use auto_bitmap.
(is_caller_ifunc_resolver): Optimize bitmap_bit_p/bitmap_set_bit
pair.
(symtab_node::check_ifunc_callee_symtab_nodes): Properly
allocate ifunc_ref_map here.
---
gcc/symtab.cc | 11 +++
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/gcc/symtab.cc b/gcc/symtab.cc
index 3256133891d..3b018ab3ea2 100644
--- a/gcc/symtab.cc
+++ b/gcc/symtab.cc
@@ -1383,7 +1383,7 @@ check_ifunc_resolver (cgraph_node *node, void *data)
   return false;
}

-static auto_bitmap ifunc_ref_map;
+static bitmap ifunc_ref_map;

/* Return true if any caller of NODE is an ifunc resolver.  */

@@ -1404,9 +1404,8 @@ is_caller_ifunc_resolver (cgraph_node *node)

   /* Skip if it has been visited.  */
   unsigned int uid = e->caller->get_uid ();
-  if (bitmap_bit_p (ifunc_ref_map, uid))
+  if (!bitmap_set_bit (ifunc_ref_map, uid))
continue;
-  bitmap_set_bit (ifunc_ref_map, uid);

   if (is_caller_ifunc_resolver (e->caller))
{
@@ -1437,6 +1436,9 @@ symtab_node::check_ifunc_callee_symtab_nodes (void)
{
   symtab_node *node;

+  bitmap_obstack_initialize (NULL);
+  ifunc_ref_map = BITMAP_ALLOC (NULL);
+
   FOR_EACH_SYMBOL (node)
 {
   cgraph_node *cnode = dyn_cast  (node);
@@ -1455,7 +1457,8 @@ symtab_node::check_ifunc_callee_symtab_nodes (void)
cnode->called_by_ifunc_resolver = true;
 }

-  bitmap_clear (ifunc_ref_map);
+  BITMAP_FREE (ifunc_ref_map);
+  bitmap_obstack_release (NULL);
}

/* Verify symbol table for internal consistency.  */
--
2.35.3


The bug isn't fixed:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114599#c5


Ah, I reproduced with -coverage and that is fixed now.  The still existing bug 
must be something unrelated.



It is a bug in the -fcondition-coverage patch.



It is. What seems to happen is that the -O2 causes a different path 
through gimplification so the gcond is not associated with the 
expression, and the condition coverage just assumes that all if it is 
even invoked, all gconds it finds will be associated. Since the cond_uid 
map is created on the first association (which does not happen) there is 
a nullptr deref when the cond id is looked up.


I think the fix is to simply guard the cond_uid access and fall back to 
the same path as on key misses, which is to skip the expression 
altogether. I would like to understand _why_ the gcond isn't associated 
with the expr in the first place, if anything to make a better decision 
here.


Thanks,
Jørgen

Re: [PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-04-05 Thread Alex Coplan

On 05/04/2024 13:53, Ajit Agarwal wrote:
> Hello Alex/Richard:
> 
> All review comments are incorporated.

Thanks, I was kind-of expecting you to also send the renaming patch as a
preparatory patch as we discussed.

Sorry for another meta comment, but: I think the reason that the Linaro
CI isn't running tests on your patches is actually because you're
sending 1/3 of a series but not sending the rest of the series.

So please can you either send this as an individual preparatory patch
(not marked as a series) or if you're going to send a series (e.g. with
a preparatory rename patch as 1/2 and this as 2/2) then send the entire
series when you make updates.  That way the CI should test your patches,
which would be helpful.

> 
> Common infrastructure of load store pair fusion is divided into target
> independent and target dependent changed code.
> 
> Target independent code is the Generic code with pure virtual function
> to interface betwwen target independent and dependent code.
> 
> Target dependent code is the implementation of pure virtual function for
> aarch64 target and the call to target independent code.
> 
> Thanks & Regards
> Ajit
> 
> 
> aarch64: Place target independent and dependent changed code in one file
> 
> Common infrastructure of load store pair fusion is divided into target
> independent and target dependent changed code.
> 
> Target independent code is the Generic code with pure virtual function
> to interface betwwen target independent and dependent code.
> 
> Target dependent code is the implementation of pure virtual function for
> aarch64 target and the call to target independent code.
> 
> 2024-04-06  Ajit Kumar Agarwal  
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64-ldp-fusion.cc: Place target
>   independent and dependent changed code.

You're going to need a proper ChangeLog eventually, but I guess there's
no need for that right now.

> ---
>  gcc/config/aarch64/aarch64-ldp-fusion.cc | 371 +++
>  1 file changed, 249 insertions(+), 122 deletions(-)
> 
> diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc 
> b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> index 22ed95eb743..cb21b514ef7 100644
> --- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
> +++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> @@ -138,8 +138,122 @@ struct alt_base
>poly_int64 offset;
>  };
>  
> +// Virtual base class for load/store walkers used in alias analysis.
> +struct alias_walker
> +{
> +  virtual bool conflict_p (int &budget) const = 0;
> +  virtual insn_info *insn () const = 0;
> +  virtual bool valid () const  = 0;

Heh, looking at this made me realise there is a whitespace bug here in
the existing code (double space after const).  Sorry about that!  I'll
push an obvious fix for that.

> +  virtual void advance () = 0;
> +};
> +
> +struct pair_fusion {
> +
> +  pair_fusion () {};

This ctor looks pointless at the moment.  Perhaps instead we could put
the contents of ldp_fusion_init in here and then delete that function?

> +  virtual bool fpsimd_op_p (rtx reg_op, machine_mode mem_mode,
> +bool load_p) = 0;

Please can we have comments above each of these virtual functions
describing any parameters, what the purpose of the hook is, and the
interpretation of the return value?  This will serve as the
documentation for other targets that want to make use of the pass.

It might make sense to have a default-false implementation for
fpsimd_op_p, especially if you don't want to make use of this bit for
rs6000.

> +
> +  virtual bool pair_operand_mode_ok_p (machine_mode mode) = 0;
> +  virtual bool pair_trailing_writeback_p () = 0;

Sorry for the run-around, but: I think this and
handle_writeback_opportunities () should be the same function, either
returning an enum or taking an enum and returning a boolean.

At a minimum they should have more similar sounding names.

> +  virtual bool pair_reg_operand_ok_p (bool load_p, rtx reg_op,
> +   machine_mode mem_mode) = 0;
> +  virtual int pair_mem_alias_check_limit () = 0;
> +  virtual bool handle_writeback_opportunities () = 0 ;
> +  virtual bool pair_mem_ok_with_policy (rtx first_mem, bool load_p,
> + machine_mode mode) = 0;
> +  virtual rtx gen_mem_pair (rtx *pats,  rtx writeback,

Nit: excess whitespace after pats,

> + bool load_p) = 0;
> +  virtual bool pair_mem_promote_writeback_p (rtx pat) = 0;
> +  virtual bool track_load_p () = 0;
> +  virtual bool track_store_p () = 0;

I think it would probably make more sense for these two to have
default-true implementations rather than being pure virtual functions.

Also, sorry for the bikeshedding, but please can we keep the plural
names (so track_loads_p and track_stores_p).

> +  virtual bool cand_insns_empty_p (std::list &insns) = 0;

Why does this need to be virtualised?  I would it expect it to
just be insns.empty () on all targets.

> +  virtual bool pair_mem_i

[PATCH][committed] aarch64: Fix whitespace in aarch64-ldp-fusion.cc:alias_walker

2024-04-05 Thread Alex Coplan

I spotted this whitespace error during the review of
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648846.html.

Pushing as obvious after testing on aarch64-linux-gnu.

Thanks,
Alex

gcc/ChangeLog:

* config/aarch64/aarch64-ldp-fusion.cc (struct alias_walker):
Fix double space after const qualifier on valid ().
diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc 
b/gcc/config/aarch64/aarch64-ldp-fusion.cc
index 22ed95eb743..365dcf48b22 100644
--- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
+++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
@@ -2138,7 +2138,7 @@ struct alias_walker
 {
   virtual bool conflict_p (int &budget) const = 0;
   virtual insn_info *insn () const = 0;
-  virtual bool valid () const  = 0;
+  virtual bool valid () const = 0;
   virtual void advance () = 0;
 };

[pushed] c++: add fixed test [PR91079]

2024-04-05 Thread Marek Polacek

Tested x86_64-pc-linux-gnu, applying to trunk.

-- >8 --
Fixed by r12-2975.

PR c++/91079
DR 1881

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_std_layout5.C: New test.
---
 gcc/testsuite/g++.dg/ext/is_std_layout5.C | 13 +
 1 file changed, 13 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_std_layout5.C

diff --git a/gcc/testsuite/g++.dg/ext/is_std_layout5.C 
b/gcc/testsuite/g++.dg/ext/is_std_layout5.C
new file mode 100644
index 000..875f3c0948d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_std_layout5.C
@@ -0,0 +1,13 @@
+// PR c++/91079
+// DR 1881 - Standard-layout classes and unnamed bit-fields
+// { dg-do compile { target c++11 } }
+
+struct A { int a : 4; };
+struct B : A { int b : 3; };
+static_assert(__is_standard_layout(A), "");
+static_assert(!__is_standard_layout(B), "");
+
+struct C { int : 0; };
+struct D : C { int : 0; };
+static_assert(__is_standard_layout(C), "");
+static_assert(!__is_standard_layout(D), "");

base-commit: e7d015b2506a1d9e84d9f7182e42e097147527e1
-- 
2.44.0

Re: [patch, libgfortran] PR114304 - [13/14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"

2024-04-05 Thread Jerry D


On 4/4/24 2:41 PM, Tobias Burnus wrote:

Hi Jerry,

I think for the current testcases, I like the patch – the question is 
only what's about:


   ',3' as input for 'comma'   (or '.3' as input for 'point')

For 'point' – 0.3 is read and ios = 0 (as expected)
But for 'comma':
* GCC 12 reads nothing and has ios = 0.
* GCC 13/mainline has an error (ios != 0 – and reads nothing)
* GCC with your patch: Same result: ios != 0 and nothing read.

Expected: Same as with ','/'comma' – namely: read-in value is 0.3.
→ https://godbolt.org/z/4rc8fz4sT for the full example, which works with 
ifort, ifx and flang


* * *

Can you check and fix this? It looks perfectly valid to me to have 
remove the '0' in the floating point numbers '0.3' or '0,3' seems to be 
permitted – and it works for '.' (with 'point') but not for ',' (with 
'comma').



Yes, I found the spot to fix this.

F2023's "13.10.3.1 List-directed input forms" refers to "13.7.2.3.2 F 
editing", which states:


"The standard form of the input field [...] The form of the mantissa is 
an optional sign, followed by a string of one or more digits optionally 
containing a decimal symbol."


The latter does not require that the digit has to be before the decimal 
sign and as for output, it is optional, it is surely intended that ",3" 
is a valid floating-point number for decimal='comma'.




Agree


* * *

I extended the testcase to check for this – see attached diff. All 
'point' work, all 'comma' fail.


Thanks for working on this!

Tobias


Thanks much. After I fix it, I will use your extended test case in the 
test suite.


Jerry -

[pushed] testsuite, Darwin: Account for block labels in function body scans.

2024-04-05 Thread Iain Sandoe

tested on aarch64-apple-darwin21, pushed to trunk, thanks.
Iain

--- 8< ---

When we have '-O3 -g' we emit a bunch of LB{B,E} local labels which
were not currently being discarded, leading to some test fails.

Fixed by adding this case to the ignored labels.

gcc/testsuite/ChangeLog:

* lib/scanasm.exp: Add 'LB*' to the local labels that are
ignored for Darwin.

Signed-off-by: Iain Sandoe 
---
 gcc/testsuite/lib/scanasm.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 741a5a048b8..6cf9997240d 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -895,7 +895,7 @@ proc configure_check-function-bodies { config } {
# example).
set up_config(fluff) {^\s*(?://)}
 } elseif { [istarget *-*-darwin*] } {
-   set up_config(fluff) {^\s*(?:\.|//|@)|^L[0-9ACESV]}
+   set up_config(fluff) {^\s*(?:\.|//|@)|^L[0-9ABCESV]}
 } else {
# Skip lines beginning with labels ('.L[...]:') or other directives
# ('.align', '.cfi_startproc', '.quad [...]', '.text', etc.), '//' or
-- 
2.39.2 (Apple Git-143)

Re: [PATCH] Add extra copy of the ifcombine pass after pre [PR102793]

2024-04-05 Thread Andrew Pinski

On Fri, Apr 5, 2024 at 5:28 AM Manolis Tsamis  wrote:
>
> If we consider code like:
>
> if (bar1 == x)
>   return foo();
> if (bar2 != y)
>   return foo();
> return 0;
>
> We would like the ifcombine pass to convert this to:
>
> if (bar1 == x || bar2 != y)
>   return foo();
> return 0;
>
> The ifcombine pass can handle this transformation but it is ran very early and
> it misses the opportunity because there are two seperate blocks for foo().
> The pre pass is good at removing duplicate code and blocks and due to that
> running ifcombine again after it can increase the number of successful
> conversions.

I do think we should have something similar to re-running
ssa-ifcombine but I think it should be much later, like after the loop
optimizations are done.
Maybe just a simplified version of it (that does the combining and not
the optimizations part) included in isel or pass_optimize_widening_mul
(which itself should most likely become part of isel or renamed since
it handles more than just widening multiply these days).


Thanks,
Andrew Pinski


>
> PR 102793
>
> gcc/ChangeLog:
>
> * common.opt: -ftree-ifcombine option, enabled by default.
> * doc/invoke.texi: Document.
> * passes.def: Re-run ssa-ifcombine after pre.
> * tree-ssa-ifcombine.cc: Make ifcombine cloneable. Add gate function.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/20030922-2.c: Change flag to -fno-tree-ifcombine.
> * gcc.dg/uninit-pred-6_c.c: Remove inconsistent check.
> * gcc.target/aarch64/pr102793.c: New test.
>
> Signed-off-by: Manolis Tsamis 
> ---
>
>  gcc/common.opt  |  4 +++
>  gcc/doc/invoke.texi |  5 
>  gcc/passes.def  |  1 +
>  gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c  |  2 +-
>  gcc/testsuite/gcc.dg/uninit-pred-6_c.c  |  4 ---
>  gcc/testsuite/gcc.target/aarch64/pr102793.c | 30 +
>  gcc/tree-ssa-ifcombine.cc   |  5 
>  7 files changed, 46 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr102793.c
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index ad348844775..e943202bcf1 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3163,6 +3163,10 @@ ftree-phiprop
>  Common Var(flag_tree_phiprop) Init(1) Optimization
>  Enable hoisting loads from conditional pointers.
>
> +ftree-ifcombine
> +Common Var(flag_tree_ifcombine) Init(1) Optimization
> +Merge some conditional branches to simplify control flow.
> +
>  ftree-pre
>  Common Var(flag_tree_pre) Optimization
>  Enable SSA-PRE optimization on trees.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e2edf7a6c13..8d2ff6b4512 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -13454,6 +13454,11 @@ This flag is enabled by default at @option{-O1} and 
> higher.
>  Perform hoisting of loads from conditional pointers on trees.  This
>  pass is enabled by default at @option{-O1} and higher.
>
> +@opindex ftree-ifcombine
> +@item -ftree-ifcombine
> +Merge some conditional branches to simplify control flow.  This pass
> +is enabled by default at @option{-O1} and higher.
> +
>  @opindex fhoist-adjacent-loads
>  @item -fhoist-adjacent-loads
>  Speculatively hoist loads from both branches of an if-then-else if the
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 1cbbd413097..1765b476131 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -270,6 +270,7 @@ along with GCC; see the file COPYING3.  If not see
>NEXT_PASS (pass_lim);
>NEXT_PASS (pass_walloca, false);
>NEXT_PASS (pass_pre);
> +  NEXT_PASS (pass_tree_ifcombine);
>NEXT_PASS (pass_sink_code, false /* unsplit edges */);
>NEXT_PASS (pass_sancov);
>NEXT_PASS (pass_asan);
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> index 16c79da9521..66c9f481a2f 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O1 -fdump-tree-dom2 -fdisable-tree-ifcombine" } */
> +/* { dg-options "-O1 -fdump-tree-dom2 -fno-tree-ifcombine" } */
>
>  struct rtx_def;
>  typedef struct rtx_def *rtx;
> diff --git a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c 
> b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> index f60868dad23..2d8e6501a45 100644
> --- a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> +++ b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> @@ -20,10 +20,6 @@ int foo (int n, int l, int m, int r)
>if ( (n > 10) && l)
>blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
>
> -  if (l)
> -if (n > 12)
> -  blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
> -
>return 0;
>  }
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr102793.c 
> b/gcc/testsuite/gcc.target/aarch64/pr102793.c
> new file mode 100

[PATCH] target: missing -Whardened with -fcf-protection=none [PR114606]

2024-04-05 Thread Marek Polacek

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
-Whardened warns when -fhardened couldn't enable a hardening option
because that option was disabled on the command line, e.g.:

$ ./cc1plus -quiet g.C -fhardened -O2 -fstack-protector
cc1plus: warning: '-fstack-protector-strong' is not enabled by '-fhardened' 
because it was specified on the command line [-Whardened]

but it doesn't work as expected with -fcf-protection=none:

$ ./cc1plus -quiet g.C -fhardened -O2 -fcf-protection=none

because we're checking == CF_NONE which doesn't distinguish between nothing
and -fcf-protection=none.  I should have used OPTION_SET_P, like below.

PR target/114606

gcc/ChangeLog:

* config/i386/i386-options.cc (ix86_option_override_internal): Use
OPTION_SET_P rather than checking == CF_NONE.

gcc/testsuite/ChangeLog:

* gcc.target/i386/fhardened-1.c: New test.
* gcc.target/i386/fhardened-2.c: New test.
---
 gcc/config/i386/i386-options.cc | 2 +-
 gcc/testsuite/gcc.target/i386/fhardened-1.c | 8 
 gcc/testsuite/gcc.target/i386/fhardened-2.c | 8 
 3 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/fhardened-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/fhardened-2.c

diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
index 7896d576977..20c6dc48090 100644
--- a/gcc/config/i386/i386-options.cc
+++ b/gcc/config/i386/i386-options.cc
@@ -3242,7 +3242,7 @@ ix86_option_override_internal (bool main_args_p,
  on the command line.  */
   if (opts->x_flag_hardened && cf_okay_p)
 {
-  if (opts->x_flag_cf_protection == CF_NONE)
+  if (!OPTION_SET_P (flag_cf_protection))
opts->x_flag_cf_protection = CF_FULL;
   else if (opts->x_flag_cf_protection != CF_FULL)
warning_at (UNKNOWN_LOCATION, OPT_Whardened,
diff --git a/gcc/testsuite/gcc.target/i386/fhardened-1.c 
b/gcc/testsuite/gcc.target/i386/fhardened-1.c
new file mode 100644
index 000..55d1718ff55
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fhardened-1.c
@@ -0,0 +1,8 @@
+/* PR target/114606 */
+/* { dg-options "-fhardened -O2 -fcf-protection=none" } */
+
+#ifdef __CET__
+# error "-fcf-protection enabled when it should not be"
+#endif
+
+/* { dg-warning ".-fcf-protection=full. is not enabled by .-fhardened. because 
it was specified" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/i386/fhardened-2.c 
b/gcc/testsuite/gcc.target/i386/fhardened-2.c
new file mode 100644
index 000..9b8c1381c19
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fhardened-2.c
@@ -0,0 +1,8 @@
+/* PR target/114606 */
+/* { dg-options "-fhardened -O2" } */
+
+#if __CET__ != 3
+# error "-fcf-protection not enabled"
+#endif
+
+/* { dg-bogus ".-fcf-protection=full. is not enabled by .-fhardened. because 
it was specified" "" { target *-*-* } 0 } */

base-commit: e7d015b2506a1d9e84d9f7182e42e097147527e1
-- 
2.44.0

Re: [PATCH] target: missing -Whardened with -fcf-protection=none [PR114606]

2024-04-05 Thread Jakub Jelinek

On Fri, Apr 05, 2024 at 02:22:18PM -0400, Marek Polacek wrote:
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
> -- >8 --
> -Whardened warns when -fhardened couldn't enable a hardening option
> because that option was disabled on the command line, e.g.:
> 
> $ ./cc1plus -quiet g.C -fhardened -O2 -fstack-protector
> cc1plus: warning: '-fstack-protector-strong' is not enabled by '-fhardened' 
> because it was specified on the command line [-Whardened]
> 
> but it doesn't work as expected with -fcf-protection=none:
> 
> $ ./cc1plus -quiet g.C -fhardened -O2 -fcf-protection=none
> 
> because we're checking == CF_NONE which doesn't distinguish between nothing
> and -fcf-protection=none.  I should have used OPTION_SET_P, like below.
> 
>   PR target/114606
> 
> gcc/ChangeLog:
> 
>   * config/i386/i386-options.cc (ix86_option_override_internal): Use
>   OPTION_SET_P rather than checking == CF_NONE.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/i386/fhardened-1.c: New test.
>   * gcc.target/i386/fhardened-2.c: New test.
> ---
>  gcc/config/i386/i386-options.cc | 2 +-
>  gcc/testsuite/gcc.target/i386/fhardened-1.c | 8 
>  gcc/testsuite/gcc.target/i386/fhardened-2.c | 8 
>  3 files changed, 17 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/fhardened-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/fhardened-2.c
> 
> diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
> index 7896d576977..20c6dc48090 100644
> --- a/gcc/config/i386/i386-options.cc
> +++ b/gcc/config/i386/i386-options.cc
> @@ -3242,7 +3242,7 @@ ix86_option_override_internal (bool main_args_p,
>   on the command line.  */
>if (opts->x_flag_hardened && cf_okay_p)
>  {
> -  if (opts->x_flag_cf_protection == CF_NONE)
> +  if (!OPTION_SET_P (flag_cf_protection))

This function is passed explicit opts and opts_set arguments, so it
shouldn't be using flag_something macros nor OPTION_SET_P, as the former
use global_options.x_flag_something rather than opts->x_flag_something
and the latter uses global_options_set.x_flag_something.

So, I think you want to use if (!opts_set->x_flag_cf_protection)
instead.

>   opts->x_flag_cf_protection = CF_FULL;
>else if (opts->x_flag_cf_protection != CF_FULL)
>   warning_at (UNKNOWN_LOCATION, OPT_Whardened,

Otherwise LGTM.

Jakub

[PATCH v2] target: missing -Whardened with -fcf-protection=none [PR114606]

2024-04-05 Thread Marek Polacek

On Fri, Apr 05, 2024 at 08:28:08PM +0200, Jakub Jelinek wrote:
> On Fri, Apr 05, 2024 at 02:22:18PM -0400, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > -- >8 --
> > -Whardened warns when -fhardened couldn't enable a hardening option
> > because that option was disabled on the command line, e.g.:
> > 
> > $ ./cc1plus -quiet g.C -fhardened -O2 -fstack-protector
> > cc1plus: warning: '-fstack-protector-strong' is not enabled by '-fhardened' 
> > because it was specified on the command line [-Whardened]
> > 
> > but it doesn't work as expected with -fcf-protection=none:
> > 
> > $ ./cc1plus -quiet g.C -fhardened -O2 -fcf-protection=none
> > 
> > because we're checking == CF_NONE which doesn't distinguish between nothing
> > and -fcf-protection=none.  I should have used OPTION_SET_P, like below.
> > 
> > PR target/114606
> > 
> > gcc/ChangeLog:
> > 
> > * config/i386/i386-options.cc (ix86_option_override_internal): Use
> > OPTION_SET_P rather than checking == CF_NONE.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * gcc.target/i386/fhardened-1.c: New test.
> > * gcc.target/i386/fhardened-2.c: New test.
> > ---
> >  gcc/config/i386/i386-options.cc | 2 +-
> >  gcc/testsuite/gcc.target/i386/fhardened-1.c | 8 
> >  gcc/testsuite/gcc.target/i386/fhardened-2.c | 8 
> >  3 files changed, 17 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/fhardened-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/fhardened-2.c
> > 
> > diff --git a/gcc/config/i386/i386-options.cc 
> > b/gcc/config/i386/i386-options.cc
> > index 7896d576977..20c6dc48090 100644
> > --- a/gcc/config/i386/i386-options.cc
> > +++ b/gcc/config/i386/i386-options.cc
> > @@ -3242,7 +3242,7 @@ ix86_option_override_internal (bool main_args_p,
> >   on the command line.  */
> >if (opts->x_flag_hardened && cf_okay_p)
> >  {
> > -  if (opts->x_flag_cf_protection == CF_NONE)
> > +  if (!OPTION_SET_P (flag_cf_protection))
> 
> This function is passed explicit opts and opts_set arguments, so it
> shouldn't be using flag_something macros nor OPTION_SET_P, as the former
> use global_options.x_flag_something rather than opts->x_flag_something
> and the latter uses global_options_set.x_flag_something.

Ah right, so the other uses of OPTION_SET_P in ix86_option_override_internal
are also wrong?

> So, I think you want to use if (!opts_set->x_flag_cf_protection)
> instead.

Fixed below, thanks.

New tests passed on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
-Whardened warns when -fhardened couldn't enable a hardening option
because that option was disabled on the command line, e.g.:

$ ./cc1plus -quiet g.C -fhardened -O2 -fstack-protector
cc1plus: warning: '-fstack-protector-strong' is not enabled by '-fhardened' 
because it was specified on the command line [-Whardened]

but it doesn't work as expected with -fcf-protection=none:

$ ./cc1plus -quiet g.C -fhardened -O2 -fcf-protection=none

because we're checking == CF_NONE which doesn't distinguish between nothing
and -fcf-protection=none.  I should have used opts_set, like below.

PR target/114606

gcc/ChangeLog:

* config/i386/i386-options.cc (ix86_option_override_internal): Use
opts_set rather than checking == CF_NONE.

gcc/testsuite/ChangeLog:

* gcc.target/i386/fhardened-1.c: New test.
* gcc.target/i386/fhardened-2.c: New test.
---
 gcc/config/i386/i386-options.cc | 2 +-
 gcc/testsuite/gcc.target/i386/fhardened-1.c | 8 
 gcc/testsuite/gcc.target/i386/fhardened-2.c | 8 
 3 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/fhardened-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/fhardened-2.c

diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
index 7896d576977..68a2e1c6910 100644
--- a/gcc/config/i386/i386-options.cc
+++ b/gcc/config/i386/i386-options.cc
@@ -3242,7 +3242,7 @@ ix86_option_override_internal (bool main_args_p,
  on the command line.  */
   if (opts->x_flag_hardened && cf_okay_p)
 {
-  if (opts->x_flag_cf_protection == CF_NONE)
+  if (!opts_set->x_flag_cf_protection)
opts->x_flag_cf_protection = CF_FULL;
   else if (opts->x_flag_cf_protection != CF_FULL)
warning_at (UNKNOWN_LOCATION, OPT_Whardened,
diff --git a/gcc/testsuite/gcc.target/i386/fhardened-1.c 
b/gcc/testsuite/gcc.target/i386/fhardened-1.c
new file mode 100644
index 000..55d1718ff55
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fhardened-1.c
@@ -0,0 +1,8 @@
+/* PR target/114606 */
+/* { dg-options "-fhardened -O2 -fcf-protection=none" } */
+
+#ifdef __CET__
+# error "-fcf-protection enabled when it should not be"
+#endif
+
+/* { dg-warning ".-fcf-protection=full. is not enabled by .-fhardened. because 
it was specified" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/i386/fharden

Re: [PATCH] c++: Fix up maybe_warn_for_constant_evaluated calls [PR114580]

2024-04-05 Thread Marek Polacek

On Fri, Apr 05, 2024 at 09:40:48AM +0200, Jakub Jelinek wrote:
> Hi!
> 
> When looking at maybe_warn_for_constant_evaluated for the trivial
> infinite loops patch, I've noticed that it can emit weird diagnostics
> for if constexpr in templates, first warn that std::is_constant_evaluted()
> always evaluates to false (because the function template is not constexpr)
> and then during instantiation warn that std::is_constant_evaluted()
> always evaluates to true (because it is used in if constexpr condition).
> Now, only the latter is actually true, even when the if constexpr
> is in a non-constexpr function, it will still always evaluate to true.
> 
> So, the following patch fixes it to call maybe_warn_for_constant_evaluated
> always with IF_STMT_CONSTEXPR_P (if_stmt) as the second argument rather than
> true if it is if constexpr with non-dependent condition etc.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2024-04-05  Jakub Jelinek  
> 
>   PR c++/114580
>   * semantics.cc (finish_if_stmt_cond): Call
>   maybe_warn_for_constant_evaluated with IF_STMT_CONSTEXPR_P (if_stmt)
>   as the second argument, rather than true/false depending on if
>   it is if constexpr with non-dependent constant expression with
>   bool type.
> 
>   * g++.dg/cpp2a/is-constant-evaluated15.C: New test.
> 
> --- gcc/cp/semantics.cc.jj2024-04-03 09:58:33.407772541 +0200
> +++ gcc/cp/semantics.cc   2024-04-04 12:11:36.203886572 +0200
> @@ -1126,6 +1126,9 @@ tree
>  finish_if_stmt_cond (tree orig_cond, tree if_stmt)
>  {
>tree cond = maybe_convert_cond (orig_cond);
> +  maybe_warn_for_constant_evaluated (cond,
> +  /*constexpr_if=*/
> +  IF_STMT_CONSTEXPR_P (if_stmt));

I don't think we need the comment anymore since it's clear what the
argument does, and then the whole call can fit on a single line.

But either way, the patch looks good, thanks.

>if (IF_STMT_CONSTEXPR_P (if_stmt)
>&& !type_dependent_expression_p (cond)
>&& require_constant_expression (cond)
> @@ -1134,16 +1137,11 @@ finish_if_stmt_cond (tree orig_cond, tre
>converted to bool.  */
>&& TYPE_MAIN_VARIANT (TREE_TYPE (cond)) == boolean_type_node)
>  {
> -  maybe_warn_for_constant_evaluated (cond, /*constexpr_if=*/true);
>cond = instantiate_non_dependent_expr (cond);
>cond = cxx_constant_value (cond);
>  }
> -  else
> -{
> -  maybe_warn_for_constant_evaluated (cond, /*constexpr_if=*/false);
> -  if (processing_template_decl)
> - cond = orig_cond;
> -}
> +  else if (processing_template_decl)
> +cond = orig_cond;
>finish_cond (&IF_COND (if_stmt), cond);
>add_stmt (if_stmt);
>THEN_CLAUSE (if_stmt) = push_stmt_list ();
> --- gcc/testsuite/g++.dg/cpp2a/is-constant-evaluated15.C.jj   2024-04-04 
> 12:23:36.706962932 +0200
> +++ gcc/testsuite/g++.dg/cpp2a/is-constant-evaluated15.C  2024-04-04 
> 12:22:29.915882859 +0200
> @@ -0,0 +1,28 @@
> +// PR c++/114580
> +// { dg-do compile { target c++17 } }
> +// { dg-options "-Wtautological-compare" }
> +
> +namespace std {
> +  constexpr inline bool
> +  is_constant_evaluated () noexcept
> +  {
> +#if __cpp_if_consteval >= 202106L
> +if consteval { return true; } else { return false; }
> +#else
> +return __builtin_is_constant_evaluated ();
> +#endif
> +  }
> +}
> +
> +template 
> +void foo ()
> +{
> +  if constexpr ((T) std::is_constant_evaluated ())   // { dg-warning 
> "'std::is_constant_evaluated' always evaluates to true in 'if constexpr'" }
> +;// { dg-bogus 
> "'std::is_constant_evaluated' always evaluates to false in a non-'constexpr' 
> function" }
> +}
> +
> +void
> +bar ()
> +{
> +  foo  ();
> +}
> 
>   Jakub
> 

Marek

[pushed] analyzer: respect GCC_COLORS in out-of-bounds diagrams [PR114588]

2024-04-05 Thread David Malcolm

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-9817-g4b02dd48f531ea.

gcc/analyzer/ChangeLog:
PR analyzer/114588
* access-diagram.cc (access_diagram_impl::access_diagram_impl):
Replace hardcoded colors for valid_style and invalid_style with
calls to text_art::get_style_from_color_cap_name.

gcc/ChangeLog:
PR analyzer/114588
* diagnostic-color.cc (color_dict): Add "valid" and "invalid" as
color capability names.
* doc/invoke.texi: Document them in description of GCC_COLORS.
* text-art/style.cc: Include "diagnostic-color.h".
(text_art::get_style_from_color_cap_name): New.
* text-art/types.h (get_style_from_color_cap_name): New decl.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/access-diagram.cc |  8 ++--
 gcc/diagnostic-color.cc|  2 ++
 gcc/doc/invoke.texi| 10 +-
 gcc/text-art/style.cc  | 18 ++
 gcc/text-art/types.h   |  2 ++
 5 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/gcc/analyzer/access-diagram.cc b/gcc/analyzer/access-diagram.cc
index 4cb6570e90b9..85e1049bb898 100644
--- a/gcc/analyzer/access-diagram.cc
+++ b/gcc/analyzer/access-diagram.cc
@@ -2059,14 +2059,10 @@ public:
 
 /* Register painting styles.  */
 {
-  style valid_style;
-  valid_style.m_fg_color = style::named_color::GREEN;
-  valid_style.m_bold = true;
+  style valid_style (get_style_from_color_cap_name ("valid"));
   m_valid_style_id = m_sm.get_or_create_id (valid_style);
 
-  style invalid_style;
-  invalid_style.m_fg_color = style::named_color::RED;
-  invalid_style.m_bold = true;
+  style invalid_style (get_style_from_color_cap_name ("invalid"));
   m_invalid_style_id = m_sm.get_or_create_id (invalid_style);
 }
 
diff --git a/gcc/diagnostic-color.cc b/gcc/diagnostic-color.cc
index 4859f36da6a9..f01a0fc2e377 100644
--- a/gcc/diagnostic-color.cc
+++ b/gcc/diagnostic-color.cc
@@ -101,6 +101,8 @@ static struct color_cap color_dict[] =
   { "diff-delete", SGR_SEQ (COLOR_FG_RED), 11, false },
   { "diff-insert", SGR_SEQ (COLOR_FG_GREEN), 11, false },
   { "type-diff", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_GREEN), 9, false 
},
+  { "valid", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_GREEN), 5, false },
+  { "invalid", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_RED), 7, false },
   { NULL, NULL, 0, false }
 };
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c584664e1688..73bb2aec7c31 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -5241,7 +5241,7 @@ The default @env{GCC_COLORS} is
 error=01;31:warning=01;35:note=01;36:range1=32:range2=34:locus=01:\
 quote=01:path=01;36:fixit-insert=32:fixit-delete=31:\
 diff-filename=01:diff-hunk=32:diff-delete=31:diff-insert=32:\
-type-diff=01;32:fnname=01;32:targs=35
+type-diff=01;32:fnname=01;32:targs=35:valid=01;31:invalid=01;32
 @end smallexample
 @noindent
 where @samp{01;31} is bold red, @samp{01;35} is bold magenta,
@@ -5324,6 +5324,14 @@ SGR substring for inserted lines within generated 
patches.
 @item type-diff=
 SGR substring for highlighting mismatching types within template
 arguments in the C++ frontend.
+
+@vindex valid GCC_COLORS @r{capability}
+@item valid=
+SGR substring for highlighting valid elements within text art diagrams.
+
+@vindex invalid GCC_COLORS @r{capability}
+@item invalid=
+SGR substring for highlighting invalid elements within text art diagrams.
 @end table
 
 @opindex fdiagnostics-urls
diff --git a/gcc/text-art/style.cc b/gcc/text-art/style.cc
index e74efe23014c..5c58d432cf48 100644
--- a/gcc/text-art/style.cc
+++ b/gcc/text-art/style.cc
@@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "text-art/selftests.h"
 #include "text-art/types.h"
 #include "color-macros.h"
+#include "diagnostic-color.h"
 
 using namespace text_art;
 
@@ -256,6 +257,23 @@ style::print_changes (pretty_printer *pp,
 }
 }
 
+/* Look up the current SGR codes for a color capability NAME
+   (from GCC_COLORS or the defaults), and convert them to
+   a text_art::style.  */
+
+style
+text_art::get_style_from_color_cap_name (const char *name)
+{
+  const char *sgr_codes = colorize_start (true, name);
+  gcc_assert (sgr_codes);
+
+  /* Parse the sgr codes.  We expect the resulting styled_string to be
+ empty; we're interested in the final style created during parsing.  */
+  style_manager sm;
+  styled_string styled_str (sm, sgr_codes);
+  return sm.get_style (sm.get_num_styles () - 1);
+}
+
 /* class text_art::style_manager.  */
 
 style_manager::style_manager ()
diff --git a/gcc/text-art/types.h b/gcc/text-art/types.h
index 02de2e86122e..2b9f8b387c71 100644
--- a/gcc/text-art/types.h
+++ b/gcc/text-art/types.h
@@ -332,6 +332,8 @@ struct style
   std::vector m_url; // empty = no URL
 };
 
+extern style get_style_from_color_cap_name (const char *name);
+
 /* A class to keep t

Re: [PATCH 3/3] tree-optimization/114052 - niter analysis from undefined behavior

2024-04-05 Thread Jan Hubicka

> +   /* When there's a call that might not return the last iteration
> +  is possibly partial.  This matches what we check in invariant
> +  motion.
> +  ???  For the call argument evaluation it would be still OK.  */
> +   if (!may_have_exited
> +   && is_gimple_call (stmt)
> +   && gimple_has_side_effects (stmt))
> + may_have_exited = true;

I think you are missing here non-call EH, volatile asms and traps.
 We have stmt_may_terminate_function_p which tests there.

Honza
> +
> +   infer_loop_bounds_from_array (loop, stmt,
> + reliable && !may_have_exited);
>  
> -   if (reliable)
> +   if (reliable && !may_have_exited)
>  {
>infer_loop_bounds_from_signedness (loop, stmt);
>infer_loop_bounds_from_pointer_arith (loop, stmt);
>  }
>   }
> -
>  }
>  }
>  
> @@ -4832,7 +4855,7 @@ estimate_numbers_of_iterations (class loop *loop)
>   diagnose those loops with -Waggressive-loop-optimizations.  */
>number_of_latch_executions (loop);
>  
> -  basic_block *body = get_loop_body (loop);
> +  basic_block *body = get_loop_body_in_rpo (cfun, loop);
>auto_vec exits = get_loop_exit_edges (loop, body);
>likely_exit = single_likely_exit (loop, exits);
>FOR_EACH_VEC_ELT (exits, i, ex)
> -- 
> 2.35.3

[PATCH 0/2] Condition coverage fixes

2024-04-05 Thread Jørgen Kvalsvik

Hi,

I propose these fixes for the current issues with the condition
coverage.

Rainer, I propose to simply delete the test with __sigsetjmp. I don't
think it actually detects anything reasonable any more, I kept it around
to prevent a regression. Since then I have built a lot of programs (with
optimization enabled) and not really seen this problem.

H.J., the problem you found with -O2 was really a problem of
tree-inlining, which was actually caught earlier by Jan [1]. It probably
warrants some more testing, but I could reproduce by tuning your test
case to use always_inline and not -O2 and trigger the error.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648785.html

Thanks,
Jørgen

Jørgen Kvalsvik (2):
  Remove unecessary and broken MC/DC compile test
  Copy condition->expr map when inlining [PR114599]

 gcc/testsuite/gcc.misc-tests/gcov-19.c   | 11 -
 gcc/testsuite/gcc.misc-tests/gcov-pr114599.c | 25 
 gcc/tree-inline.cc   | 20 +++-
 3 files changed, 44 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-pr114599.c

-- 
2.30.2

[PATCH 2/2] Copy condition->expr map when inlining [PR114599]

2024-04-05 Thread Jørgen Kvalsvik

When a function is tree-inlined, copy the condition -> expression mapping
from the inlined function into the caller, shifted so uids are not mixed
Tree inlining was always problematic under condition coverage - either
through a nullptr dereference (like in the test case), or through
quietly mixing conditions when the assigned IDs overlapped.

PR middle-end/114599

gcc/ChangeLog:

* tree-inline.cc (add_local_variables): Copy cond_uids mappings.

gcc/testsuite/ChangeLog:

* gcc.misc-tests/gcov-pr114599.c: New test.
---
 gcc/testsuite/gcc.misc-tests/gcov-pr114599.c | 25 
 gcc/tree-inline.cc   | 20 +++-
 2 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-pr114599.c

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-pr114599.c 
b/gcc/testsuite/gcc.misc-tests/gcov-pr114599.c
new file mode 100644
index 000..e4c78c9c121
--- /dev/null
+++ b/gcc/testsuite/gcc.misc-tests/gcov-pr114599.c
@@ -0,0 +1,25 @@
+/* PR middle-end/114599 */
+/* { dg-do compile } */
+/* { dg-options "-fcondition-coverage" } */
+
+/* Check that a function with a condition inlined into a function without a
+   conditional works.  When inlining happens the condition -> expression
+   mapping must be carried over.  */
+
+extern int type;
+
+void fn (void);
+
+__attribute__((always_inline))
+inline void
+do_all_fn_doall_arg (void)
+{
+  if (type)
+fn ();
+}
+
+void
+do_all_fn_LHASH_DOALL_ARG_arg2 (void)
+{
+  do_all_fn_doall_arg ();
+}
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index eebcea8a029..b18917707cc 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4659,7 +4659,8 @@ prepend_lexical_block (tree current_block, tree new_block)
   BLOCK_SUPERCONTEXT (new_block) = current_block;
 }
 
-/* Add local variables from CALLEE to CALLER.  */
+/* Add local variables from CALLEE to CALLER.  If set for condition coverage,
+   copy basic condition -> expression mapping to CALLER.  */
 
 static inline void
 add_local_variables (struct function *callee, struct function *caller,
@@ -4689,6 +4690,23 @@ add_local_variables (struct function *callee, struct 
function *caller,
  }
add_local_decl (caller, new_var);
   }
+
+  /* If -fcondition-coverage is used and the caller has conditions, copy the
+ mapping into the caller but and the end so the caller and callee
+ expressions aren't mixed.  */
+  if (callee->cond_uids)
+{
+  if (!caller->cond_uids)
+   caller->cond_uids = new hash_map  ();
+
+  unsigned dst_max_uid = 0;
+  for (auto itr : *callee->cond_uids)
+   if (itr.second >= dst_max_uid)
+ dst_max_uid = itr.second + 1;
+
+  for (auto itr : *callee->cond_uids)
+   caller->cond_uids->put (itr.first, itr.second + dst_max_uid);
+}
 }
 
 /* Add to BINDINGS a debug stmt resetting SRCVAR if inlining might
-- 
2.30.2

[PATCH 1/2] Remove unecessary and broken MC/DC compile test

2024-04-05 Thread Jørgen Kvalsvik

The __sigsetjmp test was added as a regression test, which an early
iteration of the MC/DC support caused an internal compiler error,
triggered by a code path which did not make it through to the final
revision.  Since this test really only worked on linux and does not
serve a purpose any more it can be removed.

gcc/testsuite/ChangeLog:

* gcc.misc-tests/gcov-19.c: Remove test.
---
 gcc/testsuite/gcc.misc-tests/gcov-19.c | 11 ---
 1 file changed, 11 deletions(-)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-19.c 
b/gcc/testsuite/gcc.misc-tests/gcov-19.c
index 17f1fb4e923..b83a38531ba 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-19.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-19.c
@@ -869,17 +869,6 @@ dest:
 goto * 0;
 }
 
-int __sigsetjmp ();
-
-/* This should compile, but not called. */
-void
-mcdc021c ()
-{
-  while (x) /* conditions(0/2) true(0) false(0)*/
-   /* conditions(end) */
- __sigsetjmp ();
-}
-
 /* If edges are not properly contracted the a && id (b) will be interpreted as
two independent expressions. */
 void
-- 
2.30.2

Re: [PATCH] rtl-optimization/101523 - avoid re-combine after noop 2->2 combination

2024-04-05 Thread Segher Boessenkool

Hi!

On Wed, Apr 03, 2024 at 01:07:41PM +0200, Richard Biener wrote:
> The following avoids re-walking and re-combining the instructions
> between i2 and i3 when the pattern of i2 doesn't change.
> 
> Bootstrap and regtest running ontop of a reversal of 
> r14-9692-g839bc42772ba7a.

Please include that in the patch (or series, preferably).

> It brings down memory use frmo 9GB to 400MB and compile-time from
> 80s to 3.5s.  r14-9692-g839bc42772ba7a does better in both metrics
> but has shown code generation regressions across acrchitectures.
> 
> OK to revert r14-9692-g839bc42772ba7a?

No.

The patch solved a very real problem.  How does your replacement handle
that?  You don't say.  It looks like it only battles symptoms a bit,
instead :-(

We had this before: 3->2 combinations that leave an instruction
identical to what was there before.  This was just a combination with
context as well.  The only reason this wasn't a huge problem then
already was because this is a 3->2 combination, even if it really is a
2->1 one it still is beneficial in all the same cases.  But in the new
case it can iterate indefinitely -- well not quite, but some polynomial
number of times, for a polynomial at least of degree three, possibly
more :-(

With this patch you need to show combine still is linear.  I don't think
it is, but some deeper analysis might show it still is.

  ~ - ~ - ~

What should *really* be done is something that has been on the wish list
for decades: an uncse pass.

The things that combine no longer works on after my patch are actually
1->1 combinations (which we never do currently, although we probably
should); or alternatively, an un-CSE followed by a 2->1 combination.

We can do the latter of course, but we need to do an actual uncse first!
Somewhere before combine, and then redo a CSE after it.  An actual CSE,
not doing ten gazillion other things.

Segher

Re: [PATCH] rtl-optimization/101523 - avoid re-combine after noop 2->2 combination

2024-04-05 Thread Jeff Law





On 4/5/24 3:27 PM, Segher Boessenkool wrote:

Hi!

On Wed, Apr 03, 2024 at 01:07:41PM +0200, Richard Biener wrote:

The following avoids re-walking and re-combining the instructions
between i2 and i3 when the pattern of i2 doesn't change.

Bootstrap and regtest running ontop of a reversal of
r14-9692-g839bc42772ba7a.


Please include that in the patch (or series, preferably).


It brings down memory use frmo 9GB to 400MB and compile-time from
80s to 3.5s.  r14-9692-g839bc42772ba7a does better in both metrics
but has shown code generation regressions across acrchitectures.

OK to revert r14-9692-g839bc42772ba7a?


No.

The patch solved a very real problem.  How does your replacement handle
that?  You don't say.  It looks like it only battles symptoms a bit,
instead :-(
But that patch very clearly created new problems in the process.  I 
would argue that it was a step backwards given that we're in stage4.



Jeff

Re: [PATCH] rtl-optimization/101523 - avoid re-combine after noop 2->2 combination

2024-04-05 Thread Jakub Jelinek

On Fri, Apr 05, 2024 at 03:46:30PM -0600, Jeff Law wrote:
> 
> 
> On 4/5/24 3:27 PM, Segher Boessenkool wrote:
> > Hi!
> > 
> > On Wed, Apr 03, 2024 at 01:07:41PM +0200, Richard Biener wrote:
> > > The following avoids re-walking and re-combining the instructions
> > > between i2 and i3 when the pattern of i2 doesn't change.
> > > 
> > > Bootstrap and regtest running ontop of a reversal of
> > > r14-9692-g839bc42772ba7a.
> > 
> > Please include that in the patch (or series, preferably).
> > 
> > > It brings down memory use frmo 9GB to 400MB and compile-time from
> > > 80s to 3.5s.  r14-9692-g839bc42772ba7a does better in both metrics
> > > but has shown code generation regressions across acrchitectures.
> > > 
> > > OK to revert r14-9692-g839bc42772ba7a?
> > 
> > No.
> > 
> > The patch solved a very real problem.  How does your replacement handle
> > that?  You don't say.  It looks like it only battles symptoms a bit,
> > instead :-(
> But that patch very clearly created new problems in the process.  I would
> argue that it was a step backwards given that we're in stage4.

Yeah.  E.g. PR114518 and PR114522, both P1s.

Jakub

[PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-05 Thread Peter Bergner

This is a cleanup patch in preparation to fixing the real bug in PR101865.
TARGET_DIRECT_MOVE is redundant with TARGET_P8_VECTOR, so alias it to that.
Also replace all usages of OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR
and delete the now dead mask.

This passed bootstrap and retesting on powerpc64le-linux with no regressions.
Ok for trunk?

Eventually we'll want to backport this along with the follow-on patch that
actually fixes PR101865.

Peter


gcc/
PR target/101865
* config/rs6000/rs6000.h (TARGET_DIRECT_MOVE): Define.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete redundant
OPTION_MASK_DIRECT_MOVE usage.  Delete TARGET_DIRECT_MOVE dead code.
(rs6000_opt_masks): Neuter the "direct-move" option.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete useless
comment.
* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Delete
OPTION_MASK_DIRECT_MOVE.
(OTHER_VSX_VECTOR_MASKS): Likewise.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.opt (mno-direct-move): New.
(mdirect-move): Remove Mask and Var.


diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 68bc45d65ba..77d045c9f6e 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -471,6 +471,8 @@ extern int rs6000_vector_align[];
 #define TARGET_EXTSWSLI(TARGET_MODULO && TARGET_POWERPC64)
 #define TARGET_MADDLD  TARGET_MODULO
 
+/* TARGET_DIRECT_MOVE is redundant to TARGET_P8_VECTOR, so alias it to that.  
*/
+#define TARGET_DIRECT_MOVE TARGET_P8_VECTOR
 #define TARGET_XSCVDPSPN   (TARGET_DIRECT_MOVE || TARGET_P8_VECTOR)
 #define TARGET_XSCVSPDPN   (TARGET_DIRECT_MOVE || TARGET_P8_VECTOR)
 #define TARGET_VADDUQM (TARGET_P8_VECTOR && TARGET_POWERPC64)
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 6ba9df4f02e..c241371147c 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3811,7 +3811,7 @@ rs6000_option_override_internal (bool global_init_p)
  Testing for direct_move matches power8 and later.  */
   if (!BYTES_BIG_ENDIAN
   && !(processor_target_table[tune_index].target_enable
-  & OPTION_MASK_DIRECT_MOVE))
+  & OPTION_MASK_P8_VECTOR))
 rs6000_isa_flags |= ~rs6000_isa_flags_explicit & OPTION_MASK_STRICT_ALIGN;
 
   /* Add some warnings for VSX.  */
@@ -3853,8 +3853,7 @@ rs6000_option_override_internal (bool global_init_p)
   && (rs6000_isa_flags_explicit & (OPTION_MASK_SOFT_FLOAT
   | OPTION_MASK_ALTIVEC
   | OPTION_MASK_VSX)) != 0)
-rs6000_isa_flags &= ~((OPTION_MASK_P8_VECTOR | OPTION_MASK_CRYPTO
-  | OPTION_MASK_DIRECT_MOVE)
+rs6000_isa_flags &= ~((OPTION_MASK_P8_VECTOR | OPTION_MASK_CRYPTO)
 & ~rs6000_isa_flags_explicit);
 
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
@@ -3939,13 +3938,6 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_FPRND;
 }
 
-  if (TARGET_DIRECT_MOVE && !TARGET_VSX)
-{
-  if (rs6000_isa_flags_explicit & OPTION_MASK_DIRECT_MOVE)
-   error ("%qs requires %qs", "-mdirect-move", "-mvsx");
-  rs6000_isa_flags &= ~OPTION_MASK_DIRECT_MOVE;
-}
-
   if (TARGET_P8_VECTOR && !TARGET_ALTIVEC)
 rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR;
 
@@ -24429,7 +24421,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
false, true  },
   { "cmpb",OPTION_MASK_CMPB,   false, true  },
   { "crypto",  OPTION_MASK_CRYPTO, false, true  },
-  { "direct-move", OPTION_MASK_DIRECT_MOVE,false, true  },
+  { "direct-move", 0,  false, true  },
   { "dlmzb",   OPTION_MASK_DLMZB,  false, true  },
   { "efficient-unaligned-vsx", OPTION_MASK_EFFICIENT_UNALIGNED_VSX,
false, true  },
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index ce0b14a8d37..647f20de7f2 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -429,19 +429,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
-  /* Note that the OPTION_MASK_DIRECT_MOVE flag is automatically
- turned on in the following condition:
- 1. TARGET_P8_VECTOR is enabled and OPTION_MASK_DIRECT_MOVE is not
-explicitly disabled.
-Hereafter, the

Re: [patch, libgfortran] PR114304 - [13/14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"

2024-04-05 Thread Jerry D


On 4/5/24 10:47 AM, Jerry D wrote:

On 4/4/24 2:41 PM, Tobias Burnus wrote:

Hi Jerry,

I think for the current testcases, I like the patch – the question is 
only what's about:


   ',3' as input for 'comma'   (or '.3' as input for 'point')

For 'point' – 0.3 is read and ios = 0 (as expected)
But for 'comma':
* GCC 12 reads nothing and has ios = 0.
* GCC 13/mainline has an error (ios != 0 – and reads nothing)
* GCC with your patch: Same result: ios != 0 and nothing read.

Expected: Same as with ','/'comma' – namely: read-in value is 0.3.
→ https://godbolt.org/z/4rc8fz4sT for the full example, which works 
with ifort, ifx and flang


* * *

Can you check and fix this? It looks perfectly valid to me to have 
remove the '0' in the floating point numbers '0.3' or '0,3' seems to 
be permitted – and it works for '.' (with 'point') but not for ',' 
(with 'comma').



Yes, I found the spot to fix this.

F2023's "13.10.3.1 List-directed input forms" refers to "13.7.2.3.2 F 
editing", which states:


"The standard form of the input field [...] The form of the mantissa 
is an optional sign, followed by a string of one or more digits 
optionally containing a decimal symbol."


The latter does not require that the digit has to be before the 
decimal sign and as for output, it is optional, it is surely intended 
that ",3" is a valid floating-point number for decimal='comma'.




Agree


* * *

I extended the testcase to check for this – see attached diff. All 
'point' work, all 'comma' fail.


Thanks for working on this!

Tobias


Thanks much. After I fix it, I will use your extended test case in the 
test suite.


Jerry -



See attached updated patch.

Regressions tested on x86-64. OK for trunk and 13 after a bit.

Jerry -
commit 690b9fa57d95796ba0e92a172e1490601f25e03a
Author: Jerry DeLisle 
Date:   Fri Apr 5 19:25:13 2024 -0700

libfortran: Fix handling of formatted separators.

PR libfortran/114304
PR libfortran/105473

libgfortran/ChangeLog:

* io/list_read.c (eat_separator): Add logic to handle spaces
preceding a comma or semicolon such that that a 'null' read
occurs without error at the end of comma or semicolon
terminated input lines. Add check and error message for ';'.
(list_formatted_read_scalar): Treat comma as a decimal point
when specified by the decimal mode on the first item.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr105473.f90: Modify to verify new error message.
* gfortran.dg/pr114304.f90: New test.

diff --git a/gcc/testsuite/gfortran.dg/pr105473.f90 b/gcc/testsuite/gfortran.dg/pr105473.f90
index 2679f6bb447..863a312c794 100644
--- a/gcc/testsuite/gfortran.dg/pr105473.f90
+++ b/gcc/testsuite/gfortran.dg/pr105473.f90
@@ -9,11 +9,11 @@
   n = 999; m = 777; r=1.2345
   z = cmplx(0.0,0.0)
 
-! Check that semi-colon is allowed as separator with decimal=point.
+! Check that semi-colon is not allowed as separator with decimal=point.
   ios=0
   testinput = '1;17;3.14159'
   read(testinput,*,decimal='point',iostat=ios) n, m, r
-  if (ios /= 0) stop 1
+  if (ios /= 5010) stop 1
 
 ! Check that semi-colon allowed as a separator with decimal=point.
   ios=0
diff --git a/gcc/testsuite/gfortran.dg/pr114304.f90 b/gcc/testsuite/gfortran.dg/pr114304.f90
new file mode 100644
index 000..2f913f1ab34
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr114304.f90
@@ -0,0 +1,114 @@
+! { dg-do run }
+!
+! PR fortran/114304
+!
+! See also PR fortran/105473
+!
+! Testing: Does list-directed reading an integer/real allow some non-integer input?
+!
+! Note: GCC result comments before fix of this PR.
+
+  implicit none
+  call t(.true.,  'comma', ';') ! No error shown
+  call t(.false., 'point', ';') ! /!\ gfortran: no error, others: error
+  call t(.false., 'comma', ',') ! Error shown
+  call t(.true.,  'point', ',') ! No error shown
+  call t(.false., 'comma', '.') ! Error shown
+  call t(.false., 'point', '.') ! Error shown
+  call t(.false., 'comma', '5.') ! Error shown
+  call t(.false., 'point', '5.') ! gfortran/flang: Error shown, ifort: no error
+  call t(.false., 'comma', '5,') ! gfortran: error; others: no error
+  call t(.true.,  'point', '5,') ! No error shown
+  call t(.true.,  'comma', '5;') ! No error shown
+  call t(.false., 'point', '5;') ! /!\ gfortran: no error shown, others: error
+  call t(.true.,  'comma', '7 .') ! No error shown
+  call t(.true.,  'point', '7 .') ! No error shown
+  call t(.true.,  'comma', '7 ,') ! /!\ gfortran: error; others: no error
+  call t(.true.,  'point', '7 ,') ! No error shown
+  call t(.true.,  'comma', '7 ;') ! No error shown
+  call t(.true.,  'point', '7 ;') ! No error shown
+
+!  print *, '---'
+
+  call t(.false., 'comma', '8.', .true.) ! Error shown
+  call t(.true.,  'point', '8.', .true.) ! gfortran/flang: Error shown, ifort: no error
+  call t(.true.,  'comma', '8,', .true.) ! gfortran: error; others

Re: [PATCH v3] RISC-V: Replace zero_extendsidi2_shifted with generalized split

2024-04-05 Thread Jeff Law





On 3/27/24 4:55 AM, Philipp Tomsich wrote:

Jeff,

just a heads-up that that trunk (i.e., the soon-to-be GCC14) still
generates the suboptimal sequence:
   https://godbolt.org/z/K9YYEPsvY

Realistically it's too late to get this into gcc-14.

Jeff

Re: [patch, libgfortran] PR114304 - [13/14 Regression] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"

2024-04-05 Thread Tobias Burnus


Hi Jerry, hello world,

Jerry D wrote:

On 4/5/24 10:47 AM, Jerry D wrote:

On 4/4/24 2:41 PM, Tobias Burnus wrote:
I think for the current testcases, I like the patch – the question 
is only what's about:

   ',3' as input for 'comma'   (or '.3' as input for 'point')
[...]
But for 'comma': [...]
* GCC with your patch: Same result: ios != 0 and nothing read.

Expected: [...] read-in value is 0.3. [...]



See attached updated patch.
Regressions tested on x86-64. OK for trunk and 13 after a bit.


OK. Thanks for the patch!

Tobias

Re: [PATCH 0/2] Condition coverage fixes

2024-04-05 Thread Richard Biener




> Am 05.04.2024 um 21:59 schrieb Jørgen Kvalsvik :
> 
> Hi,
> 
> I propose these fixes for the current issues with the condition
> coverage.
> 
> Rainer, I propose to simply delete the test with __sigsetjmp. I don't
> think it actually detects anything reasonable any more, I kept it around
> to prevent a regression. Since then I have built a lot of programs (with
> optimization enabled) and not really seen this problem.
> 
> H.J., the problem you found with -O2 was really a problem of
> tree-inlining, which was actually caught earlier by Jan [1]. It probably
> warrants some more testing, but I could reproduce by tuning your test
> case to use always_inline and not -O2 and trigger the error.
> 
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648785.html

Ok

Thanks,
Richard 


> Thanks,
> Jørgen
> 
> Jørgen Kvalsvik (2):
>  Remove unecessary and broken MC/DC compile test
>  Copy condition->expr map when inlining [PR114599]
> 
> gcc/testsuite/gcc.misc-tests/gcov-19.c   | 11 -
> gcc/testsuite/gcc.misc-tests/gcov-pr114599.c | 25 
> gcc/tree-inline.cc   | 20 +++-
> 3 files changed, 44 insertions(+), 12 deletions(-)
> create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-pr114599.c
> 
> --
> 2.30.2
>

61 matches

Mail list logo