Re: [PATCH] expr.cc: avoid unexpected side effects in expand_expr_divmod optimization

2023-01-09 Thread Richard Biener via Gcc-patches
On Wed, Jan 4, 2023 at 9:54 AM Jose E. Marchesi via Gcc-patches
 wrote:
>
>
> ping.
> Would this be a good approach for fixing the issue?

adding the is_libcall bit enlarges rtx_def by 8 bytes - there's no room for more
bits here.

I really wonder how other targets avoid the issue you are pointing out?
Do their assemblers prune unused (extern) .global?

> > Hi Jakub.
> >
> >> On Thu, Dec 08, 2022 at 02:02:36PM +0100, Jose E. Marchesi wrote:
> >>> So, I guess the right fix would be to call assemble_external_libcall
> >>> during final?  The `.global FOO' directive would be generated
> >>> immediately before the call sequence, but I guess that would be ok.
> >>
> >> During final only if all the targets can deal with the effects of
> >> assemble_external_libcall being done in the middle of emitting assembly
> >> for the function.
> >>
> >> Otherwise, it could be e.g. done in the first loop of shorten_branches.
> >>
> >> Note, in calls.cc it is done only for emit_library_call_value_1
> >> and not for emit_call_1, so if we do it late, we need to be able to find
> >> out what call is to a libcall and what is to a normal call.  If there is
> >> no way to differentiate it right now, perhaps we need some flag somewhere,
> >> say on a SYMBOL_REF.  And then assemble_external_libcall either only
> >> if such a SYMBOL_REF appears in CALL_INSN or sibcall JUMP_INSN, or
> >> perhaps anywhere in the function and its constant pool.
> >
> > All right, the quick-and-dirty patch below seems to DTRT with simple
> > examples.
> >
> > First, when libcalls are generated.  Note only one .global is generated
> > for all calls, and actually it is around the same position as before:
> >
> >   $ cat foo.c
> >   int foo(unsigned int len, int flag)
> >   {
> >     if (flag)
> >       return (((long)len) * 234 / 5);
> >     return (((long)len) * 2 / 5);
> >   }
> >   $ cc1 -O2 foo.c
> >   $ cat foo.s
> >           .file   "foo.c"
> >           .text
> >           .global __divdi3
> >           .align  3
> >           .global foo
> >           .type   foo, @function
> >   foo:
> >           mov32   %r1,%r1
> >           lsh     %r2,32
> >           jne     %r2,0,.L5
> >           mov     %r2,5
> >           lsh     %r1,1
> >           call    __divdi3
> >           lsh     %r0,32
> >           arsh    %r0,32
> >           exit
> >   .L5:
> >           mov     %r2,5
> >           mul     %r1,234
> >           call    __divdi3
> >           lsh     %r0,32
> >           arsh    %r0,32
> >           exit
> >           .size   foo, .-foo
> >           .ident  "GCC: (GNU) 13.0.0 20221207 (experimental)"
> >
> > Second, when libcalls are tried by expand_moddiv in a sequence, but then
> > discarded and not linked in the main sequence:
> >
> >   $ cat foo.c
> >   int foo(unsigned int len, int flag)
> >   {
> >     if (flag)
> >       return (((long)len) * 234 / 5);
> >     return (((long)len) * 2 / 5);
> >   }
> >   $ cc1 -O2 foo.c
> >   $ cat foo.s
> >           .file   "foo.c"
> >           .text
> >           .align  3
> >           .global foo
> >           .type   foo, @function
> >   foo:
> >           mov32   %r0,%r1
> >           lsh     %r2,32
> >           jne     %r2,0,.L5
> >           add     %r0,%r0
> >           div     %r0,5
> >           lsh     %r0,32
> >           arsh    %r0,32
> >           exit
> >   .L5:
> >           mul     %r0,234
> >           div     %r0,5
> >           lsh     %r0,32
> >           arsh    %r0,32
> >           exit
> >           .size   foo, .-foo
> >           .ident  "GCC: (GNU) 13.0.0 20221207 (experimental)"
> >
> > Note the .global now is not generated, as desired.
> >
> > As you can see below, I am adding a new RTX flag `is_libcall', with
> > written form "/l".
> >
> > Before I get into serious testing etc, can you please confirm whether
> > this is the right approach or not?
> >
> > In particular, I am a little bit concerned about the expectation I am
> > using that the target of the `call' instruction emitted by emit_call_1
> > is always a (MEM (SYMBOL_REF ...)) when it is passed a SYMBOL_REF as the
> > first argument (`fun' in emit_library_call_value_1).
> >
> > Thanks.
> >
> > diff --git a/gcc/calls.cc b/gcc/calls.cc
> > index 6dd6f73e978..6c4a3725272 100644
> > --- a/gcc/calls.cc
> > +++ b/gcc/calls.cc
> > @@ -4370,10 +4370,6 @@ emit_library_call_value_1 (int retval, rtx orgfun, 
> > rtx value,
> >   || argvec[i].partial != 0)
> >update_stack_alignment_for_call (&argvec[i].locate);
> >
> > -  /* If this machine requires an external definition for library
> > - functions, write one out.  */
> > -  assemble_external_libcall (fun);
> > -
> >original_args_size = args_size;
> >args_size.constant = (aligned_upper_bound (args_size.constant
> >+ stack_pointer_delta,
> > @@ -4717,6 +4713,9 @@ emit_library_call_value_1 (int retval, rtx orgfun, 
> > rtx value,
> >  valreg,
> >  old_inhibit_defer_pop + 1, call_fusage, flags, args_so_far);
> >
> > +  /* Mark the emitted call as a libcall with the new flag.  */
> > +  RTL_LIBCALL_P (last_call_insn ()) = 1;
> > +
> >if (flag_ipa_ra)

Re: [PATCH] expr.cc: avoid unexpected side effects in expand_expr_divmod optimization

2023-01-09 Thread Jakub Jelinek via Gcc-patches
On Mon, Jan 09, 2023 at 09:05:26AM +0100, Richard Biener wrote:
> On Wed, Jan 4, 2023 at 9:54 AM Jose E. Marchesi via Gcc-patches
>  wrote:
> >
> >
> > ping.
> > Would this be a good approach for fixing the issue?
> 
> adding the is_libcall bit enlarges rtx_def by 8 bytes - there's no room for 
> more
> bits here.

That is obviously not the way to go, sure.

> I really wonder how other targets avoid the issue you are pointing out?
> Do their assemblers prune unused (extern) .global?

I think no target solves this, if they see an extern call during expansion
and emit some directive for those, they emit the global or whatever directive
which remains there.

If all bits for CALL_INSN are taken, can't we add a flag on the CALL
rtx inside of the CALL_INSN pattern?  Or a flag on the SYMBOL_REF inside of
it (libcalls are always direct calls, aren't they) or SYMBOL_REF_FLAGS ?
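
To make the SYMBOL_REF flag idea a bit more concrete, here is a toy model in
plain C.  It is not GCC code: struct toy_symbol, struct toy_call and
toy_emit_libcall_globals are made-up names, and the sketch simply assumes
that the flag is set when the libcall is expanded and that the directive is
written while walking the calls that survive until final.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a SYMBOL_REF: a name plus the proposed libcall flag.  */
struct toy_symbol
{
  const char *name;
  bool is_libcall;   /* Hypothetical flag, set when the call is expanded.  */
  bool announced;    /* True once a .global directive has been written.  */
};

/* Toy stand-in for a call insn that is still in the final insn stream.  */
struct toy_call
{
  struct toy_symbol *target;
};

/* Walk the surviving calls and write one ".global NAME" line per distinct
   libcall target into OUT.  Returns the number of directives written.  */
int
toy_emit_libcall_globals (struct toy_call *calls, int ncalls,
                          char *out, size_t outsize)
{
  int written = 0;
  size_t used = 0;

  assert (outsize > 0);
  out[0] = '\0';
  for (int i = 0; i < ncalls; i++)
    {
      struct toy_symbol *sym = calls[i].target;
      if (!sym->is_libcall || sym->announced)
        continue;
      int n = snprintf (out + used, outsize - used,
                        "\t.global %s\n", sym->name);
      assert (n > 0 && (size_t) n < outsize - used);
      used += (size_t) n;
      sym->announced = true;
      written++;
    }
  return written;
}
```

A libcall that was only tried in a discarded sequence never shows up in the
array of surviving calls, so in this model no directive is written for it.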

Jakub



[PATCH] c++: Only do maybe_init_list_as_range optimization if !processing_template_decl [PR108047]

2023-01-09 Thread Jakub Jelinek via Gcc-patches
Hi!

The last testcase in this patch ICEs, because
maybe_init_list_as_range -> maybe_init_list_as_array
calls maybe_constant_init in:
  /* Don't do this if the conversion would be constant.  */
  first = maybe_constant_init (first);
  if (TREE_CONSTANT (first))
    return NULL_TREE;
but maybe_constant_init shouldn't be called when processing_template_decl.
While we could replace that call with fold_non_dependent_init, my limited
understanding is that this is an optimization and even if we don't optimize
it when processing_template_decl, build_user_type_conversion_1 will be
called again during instantiation with !processing_template_decl if it is
ever instantiated and we can do the optimization only then.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Or do you want fold_non_dependent_init instead?

2023-01-09  Jakub Jelinek  

PR c++/105838
PR c++/108047
PR c++/108266
* call.cc (maybe_init_list_as_range): Always return NULL_TREE if
processing_template_decl.

* g++.dg/tree-ssa/initlist-opt2.C: New test.
* g++.dg/tree-ssa/initlist-opt3.C: New test.

--- gcc/cp/call.cc.jj   2022-12-15 09:24:44.265935297 +0100
+++ gcc/cp/call.cc  2023-01-06 11:24:44.837270905 +0100
@@ -4285,7 +4285,8 @@ maybe_init_list_as_array (tree elttype,
 static tree
 maybe_init_list_as_range (tree fn, tree expr)
 {
-  if (BRACE_ENCLOSED_INITIALIZER_P (expr)
+  if (!processing_template_decl
+  && BRACE_ENCLOSED_INITIALIZER_P (expr)
   && is_list_ctor (fn)
   && decl_in_std_namespace_p (fn))
 {
--- gcc/testsuite/g++.dg/tree-ssa/initlist-opt2.C.jj   2023-01-06 11:53:13.160432870 +0100
+++ gcc/testsuite/g++.dg/tree-ssa/initlist-opt2.C      2023-01-06 11:53:44.561976302 +0100
@@ -0,0 +1,31 @@
+// PR c++/105838
+// { dg-additional-options -fdump-tree-gimple }
+// { dg-do compile { target c++11 } }
+
+// Test that we do range-initialization from const char *.
+// { dg-final { scan-tree-dump {_M_range_initialize} "gimple" } }
+
+#include 
+#include 
+
+void g (const void *);
+
+template 
+void f (const char *p)
+{
+  std::vector lst = {
+  "aahing", "aaliis", "aarrgh", "abacas", "abacus", "abakas", "abamps", "abands", "abased", "abaser", "abases", "abasia",
+  "abated", "abater", "abates", "abatis", "abator", "abattu", "abayas", "abbacy", "abbess", "abbeys", "abbots", "abcees",
+  "abdabs", "abduce", "abduct", "abears", "abeigh", "abeles", "abelia", "abends", "abhors", "abided", "abider", "abides",
+  "abject", "abjure", "ablate", "ablaut", "ablaze", "ablest", "ablets", "abling", "ablins", "abloom", "ablush", "abmhos",
+  "aboard", "aboded", "abodes", "abohms", "abolla", "abomas", "aboral", "abords", "aborne", "aborts", "abound", "abouts",
+  "aboves", "abrade", "abraid", "abrash", "abrays", "abrazo", "abrege", "abrins", "abroad", "abrupt", "abseil", "absent",
+  };
+
+  g(&lst);
+}
+
+void h (const char *p)
+{
+  f<0> (p);
+}
--- gcc/testsuite/g++.dg/tree-ssa/initlist-opt3.C.jj   2023-01-06 11:56:36.981469370 +0100
+++ gcc/testsuite/g++.dg/tree-ssa/initlist-opt3.C      2023-01-06 11:56:09.984861898 +0100
@@ -0,0 +1,21 @@
+// PR c++/108266
+// { dg-do compile { target c++11 } }
+
+#include 
+#include 
+
+struct S { S (const char *); };
+void bar (std::vector);
+
+template 
+void
+foo ()
+{
+  bar ({"", ""});
+}
+
+void
+baz ()
+{
+  foo<0> ();
+}

Jakub



[PATCH] calls: Fix up TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2023-01-09 Thread Jakub Jelinek via Gcc-patches
Hi!

On powerpc64le-linux, the following patch fixes
-FAIL: gcc.dg/c2x-stdarg-4.c execution test
-FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O0  execution test
-FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O1  execution test
-FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O2  execution test
-FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
-FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
-FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O3 -g  execution test
-FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -Os  execution test
The problem is mismatch between the caller and callee side.
On the callee side, we do:
  /* NAMED_ARG is a misnomer.  We really mean 'non-variadic'. */
  if (!cfun->stdarg)
data->arg.named = 1;  /* No variadic parms.  */
  else if (DECL_CHAIN (parm))
data->arg.named = 1;  /* Not the last non-variadic parm. */
  else if (targetm.calls.strict_argument_naming (all->args_so_far))
data->arg.named = 1;  /* Only variadic ones are unnamed.  */
  else
data->arg.named = 0;  /* Treat as variadic.  */
which is later passed to the target hooks to determine if a particular
argument is named or not.  Now, cfun->stdarg is determined from the stdarg_p
call, which for the new C2X TYPE_NO_NAMED_ARGS_STDARG_P function types
(rettype fn (...)) returns true.  Such functions have no named arguments,
so data->arg.named will be 0 in function.cc.  But on the caller side,
as TYPE_NO_NAMED_ARGS_STDARG_P function types have TYPE_ARG_TYPES NULL,
we instead treat those calls as unprototyped even when they are prototyped
- /* If we know nothing, treat all args as named.  */ n_named_args = 
num_actuals;
in 2 spots.  We need to treat the TYPE_NO_NAMED_ARGS_STDARG_P cases as
prototyped with no named arguments.
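
As a rough illustration of the caller-side decision after the patch, here is
a self-contained toy model in C.  It is only a sketch: struct toy_fntype and
toy_initial_n_named_args are hypothetical names, the struct fields merely
stand in for TYPE_ARG_TYPES and TYPE_NO_NAMED_ARGS_STDARG_P, and details such
as structure_value_addr_parm are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy description of the callee's function type.  */
struct toy_fntype
{
  int n_prototype_args;       /* Length of the argument type list.  */
  bool has_arg_types;         /* Models a non-NULL TYPE_ARG_TYPES.  */
  bool no_named_args_stdarg;  /* Models TYPE_NO_NAMED_ARGS_STDARG_P.  */
};

/* Model of the first n_named_args computation in expand_call with the
   new middle branch.  */
int
toy_initial_n_named_args (const struct toy_fntype *fn, int num_actuals)
{
  assert (num_actuals >= 0);
  if (fn->has_arg_types)
    /* Prototyped with an argument list: take the count from it.  */
    return fn->n_prototype_args;
  else if (fn->no_named_args_stdarg)
    /* rettype fn (...): prototyped, but nothing is named.  */
    return 0;
  else
    /* Unprototyped: if we know nothing, treat all args as named.  */
    return num_actuals;
}
```

Without the middle branch the (...) case falls through to the last one and
every actual argument is counted as named, which is the mismatch with the
callee side described above.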

Bootstrapped/regtested on x86_64-linux, i686-linux, powerpc64le-linux (where
it fixes the above failures), aarch64-linux and s390x-linux, ok for trunk?

2023-01-09  Jakub Jelinek  

PR target/107453
* calls.cc (expand_call): For calls with
TYPE_NO_NAMED_ARGS_STDARG_P (funtype) use zero for n_named_args.
Formatting fix.

--- gcc/calls.cc.jj 2023-01-02 09:32:28.834192105 +0100
+++ gcc/calls.cc2023-01-06 14:52:14.740594896 +0100
@@ -2908,8 +2908,8 @@ expand_call (tree exp, rtx target, int i
 }
 
   /* Count the arguments and set NUM_ACTUALS.  */
-  num_actuals =
-call_expr_nargs (exp) + num_complex_actuals + structure_value_addr_parm;
+  num_actuals
+= call_expr_nargs (exp) + num_complex_actuals + structure_value_addr_parm;
 
   /* Compute number of named args.
  First, do a raw count of the args for INIT_CUMULATIVE_ARGS.  */
@@ -2919,6 +2919,8 @@ expand_call (tree exp, rtx target, int i
   = (list_length (type_arg_types)
 /* Count the struct value address, if it is passed as a parm.  */
 + structure_value_addr_parm);
+  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
+n_named_args = 0;
   else
 /* If we know nothing, treat all args as named.  */
 n_named_args = num_actuals;
@@ -2957,6 +2959,8 @@ expand_call (tree exp, rtx target, int i
   && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
 /* Don't include the last named arg.  */
 --n_named_args;
+  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
+n_named_args = 0;
   else
 /* Treat all args as named.  */
 n_named_args = num_actuals;

Jakub



[PATCH] hash: do not insert deleted value to a hash_set

2023-01-09 Thread Martin Liška

Hi.

After the new hash_set checking code, we face an issue where deleted value
is added to a hash_set. Fix it.
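
For readers unfamiliar with the problem, the following toy pointer set in C
shows why a reserved marker value must never be inserted and what the
caller-side guard looks like.  This is a sketch and not GCC's hash_set:
struct toy_ptrset, toy_ptrset_add and toy_should_visit are made-up names, and
the toy assumes NULL is the reserved marker and that add returns true when
the key was already present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS 16  /* Fixed capacity; the toy never resizes.  */

struct toy_ptrset
{
  const void *slot[TOY_SLOTS];  /* NULL marks a slot that holds no key.  */
  size_t count;
};

/* Add KEY; return true if it was already present.  NULL is reserved as
   the marker value, so passing it in is a caller bug.  */
bool
toy_ptrset_add (struct toy_ptrset *set, const void *key)
{
  assert (key != NULL);
  size_t i = ((uintptr_t) key >> 4) % TOY_SLOTS;
  for (size_t probes = 0; probes < TOY_SLOTS; probes++)
    {
      if (set->slot[i] == key)
        return true;
      if (set->slot[i] == NULL)
        {
          set->slot[i] = key;
          set->count++;
          return false;
        }
      i = (i + 1) % TOY_SLOTS;  /* Linear probing.  */
    }
  assert (!"toy_ptrset is full");
  return false;
}

/* Caller-side guard in the style of the patch: skip the marker value
   instead of handing it to the set.  Returns true when KEY is new and
   should be processed.  */
bool
toy_should_visit (struct toy_ptrset *set, const void *key)
{
  return key != NULL && !toy_ptrset_add (set, key);
}
```

With the guard, a NULL token is simply never visited instead of tripping the
assertion inside the set.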

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR lto/108330

gcc/ChangeLog:

* lto-cgraph.cc (compute_ltrans_boundary): Do not insert
NULL (deleted value) to a hash_set.

gcc/testsuite/ChangeLog:

* g++.dg/ipa/pr108830.C: New test.
---
 gcc/lto-cgraph.cc   |  3 ++-
 gcc/testsuite/g++.dg/ipa/pr108830.C | 20 
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr108830.C

diff --git a/gcc/lto-cgraph.cc b/gcc/lto-cgraph.cc
index eef5ea1d061..805c7855eb3 100644
--- a/gcc/lto-cgraph.cc
+++ b/gcc/lto-cgraph.cc
@@ -918,7 +918,8 @@ compute_ltrans_boundary (lto_symtab_encoder_t in_encoder)
  vec targets
= possible_polymorphic_call_targets
(edge, &final, &cache_token);
- if (!reachable_call_targets.add (cache_token))
+ if (cache_token != NULL
+ && !reachable_call_targets.add (cache_token))
{
  for (i = 0; i < targets.length (); i++)
{
diff --git a/gcc/testsuite/g++.dg/ipa/pr108830.C b/gcc/testsuite/g++.dg/ipa/pr108830.C
new file mode 100644
index 000..96656f67e4f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr108830.C
@@ -0,0 +1,20 @@
+// PR lto/108330
+// { dg-do compile }
+
+class A {
+  virtual unsigned long m_fn1() const;
+virtual int &m_fn2(unsigned long) const;
+};
+class C : A {
+public:
+  int &m_fn2(unsigned long) const;
+unsigned long m_fn1() const;
+};
+class B {
+  void m_fn3(const A &, const int &, const C &, int &) const;
+};
+void B::m_fn3(const A &, const int &, const C &, int &) const {
+  C &a(a);
+for (long b = 0; a.m_fn1(); b++)
+ a.m_fn2(0);
+}
--
2.39.0



[PATCH] tree-optimization/101912 - testcase for fixed uninit case

2023-01-09 Thread Richard Biener via Gcc-patches
We now properly optimize this testcase and no longer diagnose
a bogus uninit use.

Tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/101912
* gcc.dg/uninit-pr101912.c: New testcase.
---
 gcc/testsuite/gcc.dg/uninit-pr101912.c | 21 +
 1 file changed, 21 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/uninit-pr101912.c

diff --git a/gcc/testsuite/gcc.dg/uninit-pr101912.c b/gcc/testsuite/gcc.dg/uninit-pr101912.c
new file mode 100644
index 000..1550c03436d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/uninit-pr101912.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wuninitialized" } */
+
+int getint (void);
+int
+tzloadbody (void)
+{
+  int n = getint ();
+  int prevcorr;
+  int leapcnt = 0;
+  for (int i = 0; i < n; i++)
+{
+  int corr = getint ();
+  if (corr < 1 || (corr == 1 && !(leapcnt == 0 || (prevcorr < corr ? corr == prevcorr + 1 : (corr == prevcorr || corr == prevcorr - 1))))) /* { dg-bogus "uninitialized" } */
+   return -1;
+
+  prevcorr = corr;
+  leapcnt++;
+}
+  return leapcnt;
+}
-- 
2.35.3


[PATCH] tree-optimization/107767 - not profitable switch conversion

2023-01-09 Thread Richard Biener via Gcc-patches
When the CFG has not merged equal PHI defs in a switch stmt the
cost model from switch conversion gets off and we prefer a
jump table over branches.  The following fixes that by recording
cases that will be merged later and more appropriately counting
unique values.
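
To sketch the counting idea outside of GCC: model each non-default case by
the value it feeds into the PHI in the final block, and count values that
differ from those already seen, with a cap on how many are remembered so the
search stays cheap.  The names below (toy_count_unique_targets,
TOY_MAX_REMEMBERED) are hypothetical, and plain int comparison merely stands
in for phi_alternatives_equal.

```c
#include <assert.h>
#include <stddef.h>

/* Cap on remembered values, bounding the otherwise quadratic search.  */
#define TOY_MAX_REMEMBERED 10

/* PHI_VALUE[c] is the value case C contributes to the PHI in the final
   block.  Return the number of cases counted as unique targets.  */
size_t
toy_count_unique_targets (const int *phi_value, size_t ncases)
{
  int seen[TOY_MAX_REMEMBERED];
  size_t nseen = 0;
  size_t uniq = 0;

  for (size_t c = 0; c < ncases; c++)
    {
      size_t i;
      for (i = 0; i < nseen; i++)
        if (seen[i] == phi_value[c])
          break;
      if (i == nseen)
        {
          /* Not among the remembered values: count it, and remember it
             only while there is room.  */
          if (nseen < TOY_MAX_REMEMBERED)
            seen[nseen++] = phi_value[c];
          uniq++;
        }
    }
  return uniq;
}
```

For eight cases that all return 1, as in the testcase, the toy counts a
single unique target.  Once the cap is hit, further values may be counted
more than once, which errs on the side of overestimating.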

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Martin, OK with you?

Thanks,
Richard.

PR tree-optimization/107767
* tree-cfgcleanup.cc (phi_alternatives_equal): Export.
* tree-cfgcleanup.h (phi_alternatives_equal): Declare.
* tree-switch-conversion.cc (switch_conversion::collect):
Count unique non-default targets accounting for later
merging opportunities.

* gcc.dg/tree-ssa/pr107767.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr107767.c | 20 ++
 gcc/tree-cfgcleanup.cc   |  2 +-
 gcc/tree-cfgcleanup.h|  1 +
 gcc/tree-switch-conversion.cc| 49 ++--
 4 files changed, 60 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr107767.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr107767.c b/gcc/testsuite/gcc.dg/tree-ssa/pr107767.c
new file mode 100644
index 000..bace8abfd9c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr107767.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-Os -fdump-tree-switchconv" } */
+
+int firewall2(const unsigned char *restrict data)
+{
+  const unsigned short dst_port = *((const unsigned short *)data + 32);
+
+  if (dst_port == 15) return 1;
+  if (dst_port == 23) return 1;
+  if (dst_port == 47) return 1;
+  if (dst_port == 45) return 1;
+  if (dst_port == 42) return 1;
+  if (dst_port == 1) return 1;
+  if (dst_port == 2) return 1;
+  if (dst_port == 3) return 1;
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "CSWTCH" "switchconv" } } */
diff --git a/gcc/tree-cfgcleanup.cc b/gcc/tree-cfgcleanup.cc
index 075b1560cdd..ca0cb633f2c 100644
--- a/gcc/tree-cfgcleanup.cc
+++ b/gcc/tree-cfgcleanup.cc
@@ -450,7 +450,7 @@ tree_forwarder_block_p (basic_block bb, bool phi_wanted)
those alternatives are equal in each of the PHI nodes, then return
true, else return false.  */
 
-static bool
+bool
 phi_alternatives_equal (basic_block dest, edge e1, edge e2)
 {
   int n1 = e1->dest_idx;
diff --git a/gcc/tree-cfgcleanup.h b/gcc/tree-cfgcleanup.h
index c26831915c0..b7c7ff1ebcd 100644
--- a/gcc/tree-cfgcleanup.h
+++ b/gcc/tree-cfgcleanup.h
@@ -27,5 +27,6 @@ extern bool fixup_noreturn_call (gimple *stmt);
 extern bool delete_unreachable_blocks_update_callgraph (cgraph_node *dst_node,
bool update_clones);
 extern unsigned clean_up_loop_closed_phi (function *);
+extern bool phi_alternatives_equal (basic_block, edge, edge);
 
 #endif /* GCC_TREE_CFGCLEANUP_H */
diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc
index 6aeabb96c26..c08c22039c9 100644
--- a/gcc/tree-switch-conversion.cc
+++ b/gcc/tree-switch-conversion.cc
@@ -51,6 +51,7 @@ Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
 #include "tree-into-ssa.h"
 #include "omp-general.h"
 #include "gimple-range.h"
+#include "tree-cfgcleanup.h"
 
 /* ??? For lang_hooks.types.type_for_mode, but is there a word_mode
type in the GIMPLE type system that is language-independent?  */
@@ -132,16 +133,42 @@ switch_conversion::collect (gswitch *swtch)
   /* Require that all switch destinations are either that common
  FINAL_BB or a forwarder to it, except for the default
  case if contiguous range.  */
+  auto_vec fw_edges;
+  m_uniq = 0;
   if (m_final_bb)
 FOR_EACH_EDGE (e, ei, m_switch_bb->succs)
   {
+   edge phi_e = nullptr;
if (e->dest == m_final_bb)
- continue;
-
-   if (single_pred_p (e->dest)
-   && single_succ_p (e->dest)
-   && single_succ (e->dest) == m_final_bb)
- continue;
+ phi_e = e;
+   else if (single_pred_p (e->dest)
+&& single_succ_p (e->dest)
+&& single_succ (e->dest) == m_final_bb)
+ phi_e = single_succ_edge (e->dest);
+   if (phi_e)
+ {
+   if (e == e_default)
+ ;
+   else if (phi_e == e || empty_block_p (e->dest))
+ {
+   /* For empty blocks consider forwarders with equal
+  PHI arguments in m_final_bb as unique.  */
+   unsigned i;
+   for (i = 0; i < fw_edges.length (); ++i)
+ if (phi_alternatives_equal (m_final_bb, fw_edges[i], phi_e))
+   break;
+   if (i == fw_edges.length ())
+ {
+   /* But limit the above possibly quadratic search.  */
+   if (fw_edges.length () < 10)
+ fw_edges.quick_push (phi_e);
+   m_uniq++;
+ }
+ }
+   else
+ m_uniq++;
+   continue;
+  

Re: [PATCH] c++tools: Fix compilation of server.cc on hpux

2023-01-09 Thread Nathan Sidwell via Gcc-patches

On 1/7/23 14:12, John David Anglin wrote:

Tested on trunk and gcc-12 with hppa64-hp-hpux11.11.


ah, I see that is the use that was unprotected, ok.




Okay?

Dave
---

Fix compilation of server.cc on hpux.

Select and FD_ISSET are declared in sys/time.h on most versions
of hpux.  As a result, HAVE_PSELECT and HAVE_SELECT can be 0.

2023-01-07  John David Anglin  

c++tools/ChangeLog:

PR c++tools/107616
* server.cc (server): Don't call FD_ISSET when HAVE_PSELECT
and HAVE_SELECT are zero.

diff --git a/c++tools/server.cc b/c++tools/server.cc
index 00154a05925..693aec6820a 100644
--- a/c++tools/server.cc
+++ b/c++tools/server.cc
@@ -753,8 +753,10 @@ server (bool ipv6, int sock_fd, module_resolver *resolver)
  }
  }
  
+#if defined (HAVE_PSELECT) || defined (HAVE_SELECT)

  if (active < 0 && sock_fd >= 0 && FD_ISSET (sock_fd, &readers))
active = -1;
+#endif
}
  
  	  if (active >= 0)




--
Nathan Sidwell



Re: [PATCH] calls: Fix up TYPE_NO_NAMED_ARGS_STDARG_P handling [PR107453]

2023-01-09 Thread Richard Biener via Gcc-patches
On Mon, 9 Jan 2023, Jakub Jelinek wrote:

> Hi!
> 
> On powerpc64le-linux, the following patch fixes
> -FAIL: gcc.dg/c2x-stdarg-4.c execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O0  execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O1  execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O2  execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -O3 -g  execution test
> -FAIL: gcc.dg/torture/c2x-stdarg-split-1a.c   -Os  execution test
> The problem is mismatch between the caller and callee side.
> On the callee side, we do:
>   /* NAMED_ARG is a misnomer.  We really mean 'non-variadic'. */
>   if (!cfun->stdarg)
> data->arg.named = 1;  /* No variadic parms.  */
>   else if (DECL_CHAIN (parm))
> data->arg.named = 1;  /* Not the last non-variadic parm. */
>   else if (targetm.calls.strict_argument_naming (all->args_so_far))
> data->arg.named = 1;  /* Only variadic ones are unnamed.  */
>   else
> data->arg.named = 0;  /* Treat as variadic.  */
> which is later passed to the target hooks to determine if a particular
> argument is named or not.  Now, cfun->stdarg is determined from the stdarg_p
> call, which for the new C2X TYPE_NO_NAMED_ARGS_STDARG_P function types
> (rettype fn (...)) returns true.  Such functions have no named arguments,
> so data->arg.named will be 0 in function.cc.  But on the caller side,
> as TYPE_NO_NAMED_ARGS_STDARG_P function types have TYPE_ARG_TYPES NULL,
> we instead treat those calls as unprototyped even when they are prototyped
> - /* If we know nothing, treat all args as named.  */ n_named_args = 
> num_actuals;
> in 2 spots.  We need to treat the TYPE_NO_NAMED_ARGS_STDARG_P cases as
> prototyped with no named arguments.
> 
> Bootstrapped/regtested on x86_64-linux, i686-linux, powerpc64le-linux (where
> it fixes the above failures), aarch64-linux and s390x-linux, ok for trunk?

LGTM.

Richard.

> 2023-01-09  Jakub Jelinek  
> 
>   PR target/107453
>   * calls.cc (expand_call): For calls with
>   TYPE_NO_NAMED_ARGS_STDARG_P (funtype) use zero for n_named_args.
>   Formatting fix.
> 
> --- gcc/calls.cc.jj   2023-01-02 09:32:28.834192105 +0100
> +++ gcc/calls.cc  2023-01-06 14:52:14.740594896 +0100
> @@ -2908,8 +2908,8 @@ expand_call (tree exp, rtx target, int i
>  }
>  
>/* Count the arguments and set NUM_ACTUALS.  */
> -  num_actuals =
> -call_expr_nargs (exp) + num_complex_actuals + structure_value_addr_parm;
> +  num_actuals
> += call_expr_nargs (exp) + num_complex_actuals + 
> structure_value_addr_parm;
>  
>/* Compute number of named args.
>   First, do a raw count of the args for INIT_CUMULATIVE_ARGS.  */
> @@ -2919,6 +2919,8 @@ expand_call (tree exp, rtx target, int i
>= (list_length (type_arg_types)
>/* Count the struct value address, if it is passed as a parm.  */
>+ structure_value_addr_parm);
> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> +n_named_args = 0;
>else
>  /* If we know nothing, treat all args as named.  */
>  n_named_args = num_actuals;
> @@ -2957,6 +2959,8 @@ expand_call (tree exp, rtx target, int i
>  && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
>  /* Don't include the last named arg.  */
>  --n_named_args;
> +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
> +n_named_args = 0;
>else
>  /* Treat all args as named.  */
>  n_named_args = num_actuals;
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


Re: [PATCH] hash: do not insert deleted value to a hash_set

2023-01-09 Thread Richard Biener via Gcc-patches
On Mon, Jan 9, 2023 at 11:53 AM Martin Liška  wrote:
>
> Hi.
>
> After the new hash_set checking code, we face an issue where deleted value
> is added to a hash_set. Fix it.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

OK

> Thanks,
> Martin
>
> PR lto/108330
>
> gcc/ChangeLog:
>
> * lto-cgraph.cc (compute_ltrans_boundary): Do not insert
> NULL (deleted value) to a hash_set.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/ipa/pr108830.C: New test.
> ---
>   gcc/lto-cgraph.cc   |  3 ++-
>   gcc/testsuite/g++.dg/ipa/pr108830.C | 20 
>   2 files changed, 22 insertions(+), 1 deletion(-)
>   create mode 100644 gcc/testsuite/g++.dg/ipa/pr108830.C
>
> diff --git a/gcc/lto-cgraph.cc b/gcc/lto-cgraph.cc
> index eef5ea1d061..805c7855eb3 100644
> --- a/gcc/lto-cgraph.cc
> +++ b/gcc/lto-cgraph.cc
> @@ -918,7 +918,8 @@ compute_ltrans_boundary (lto_symtab_encoder_t in_encoder)
>   vec targets
> = possible_polymorphic_call_targets
> (edge, &final, &cache_token);
> - if (!reachable_call_targets.add (cache_token))
> + if (cache_token != NULL
> + && !reachable_call_targets.add (cache_token))
> {
>   for (i = 0; i < targets.length (); i++)
> {
> diff --git a/gcc/testsuite/g++.dg/ipa/pr108830.C b/gcc/testsuite/g++.dg/ipa/pr108830.C
> new file mode 100644
> index 000..96656f67e4f
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ipa/pr108830.C
> @@ -0,0 +1,20 @@
> +// PR lto/108330
> +// { dg-do compile }
> +
> +class A {
> +  virtual unsigned long m_fn1() const;
> +virtual int &m_fn2(unsigned long) const;
> +};
> +class C : A {
> +public:
> +  int &m_fn2(unsigned long) const;
> +unsigned long m_fn1() const;
> +};
> +class B {
> +  void m_fn3(const A &, const int &, const C &, int &) const;
> +};
> +void B::m_fn3(const A &, const int &, const C &, int &) const {
> +  C &a(a);
> +for (long b = 0; a.m_fn1(); b++)
> + a.m_fn2(0);
> +}
> --
> 2.39.0
>


Re: [PATCH v6 10/11] OpenMP: Support OpenMP 5.0 "declare mapper" directives for C

2023-01-09 Thread Julian Brown
On Fri, 23 Dec 2022 04:13:03 -0800
Julian Brown  wrote:

> This patch adds support for "declare mapper" directives (and the
> "mapper" modifier on "map" clauses) for C.  As for C++, arrays of
> custom-mapped objects are not supported yet.

Here's a small follow-up for this one.

Re-tested (with previous patches) with offloading to nvptx.

OK?

Julian
commit be53bd5db61c2e4a093a00c371ba8395737d9be9
Author: Julian Brown 
Date:   Fri Jan 6 12:05:56 2023 +

OpenMP: Use c_parser_check_balanced_raw_token_sequence parsing mapper modifier

This is a small cumulative patch to simplify mapper modifier parsing in
c_parser_omp_clause_map, making an approximately-equivalent change to the
one Jakub suggested in the review for the C++ "declare mapper" patch here:

  https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595523.html

The C version isn't quite so terse, but is still a slight improvement
over the previous code, I think.

2023-01-09  Julian Brown  

gcc/c/
* c-parser.cc (c_parser_omp_clause_map): Use
c_parser_check_balanced_raw_token_sequence to traverse mapper modifier
syntax.

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 5dca50850b38..a1b377edd886 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -17242,20 +17242,18 @@ c_parser_omp_clause_map (c_parser *parser, tree list, enum gomp_map_kind kind)
 
   if (c_parser_peek_nth_token_raw (parser, pos + 1)->type == CPP_COMMA)
 	pos++;
-  else if ((c_parser_peek_nth_token_raw (parser, pos + 1)->type
-		== CPP_OPEN_PAREN)
-	   && ((c_parser_peek_nth_token_raw (parser, pos + 2)->type
-		== CPP_NAME)
-		   || ((c_parser_peek_nth_token_raw (parser, pos + 2)->type
-			== CPP_KEYWORD)
-		   && (c_parser_peek_nth_token_raw (parser,
-			pos + 2)->keyword
-			   == RID_DEFAULT)))
-	   && (c_parser_peek_nth_token_raw (parser, pos + 3)->type
-		   == CPP_CLOSE_PAREN)
-	   && (c_parser_peek_nth_token_raw (parser, pos + 4)->type
-		   == CPP_COMMA))
-	pos += 4;
+  else if (c_parser_peek_nth_token_raw (parser, pos + 1)->type
+	   == CPP_OPEN_PAREN)
+	{
+	  unsigned int npos = pos + 2;
+	  if (c_parser_check_balanced_raw_token_sequence (parser, &npos)
+	  && (c_parser_peek_nth_token_raw (parser, npos)->type
+		  == CPP_CLOSE_PAREN)
+	  && (c_parser_peek_nth_token_raw (parser, npos + 1)->type
+		  == CPP_COMMA))
+	pos = npos + 1;
+	}
+
   pos++;
 }
 


[PATCH] middle-end/69482 - not preserving volatile accesses

2023-01-09 Thread Richard Biener via Gcc-patches
The following addresses a long standing issue with not preserving
accesses to non-volatile objects through volatile qualified
pointers in the case that object gets expanded to a register.  The
fix is to treat accesses to an object with a volatile qualified
access as forcing that object to memory.  This issue got more
exposed recently so it regressed more since GCC 11.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk
so far.

PR middle-end/69482
* cfgexpand.cc (discover_nonconstant_array_refs_r): Volatile
qualified accesses also force objects to memory.

* gcc.target/i386/pr69482-1.c: New testcase.
* gcc.target/i386/pr69482-2.c: Likewise.
---
 gcc/cfgexpand.cc  |  9 +
 gcc/testsuite/gcc.target/i386/pr69482-1.c | 16 
 gcc/testsuite/gcc.target/i386/pr69482-2.c | 10 ++
 3 files changed, 35 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr69482-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr69482-2.c

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 86783a6b661..25b1558dcb9 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -6291,6 +6291,15 @@ discover_nonconstant_array_refs_r (tree * tp, int *walk_subtrees,
 
   if (IS_TYPE_OR_DECL_P (t))
 *walk_subtrees = 0;
+  else if (REFERENCE_CLASS_P (t) && TREE_THIS_VOLATILE (t))
+{
+  t = get_base_address (t);
+  if (t && DECL_P (t)
+ && DECL_MODE (t) != BLKmode
+ && !TREE_ADDRESSABLE (t))
+   bitmap_set_bit (forced_stack_vars, DECL_UID (t));
+  *walk_subtrees = 0;
+}
   else if (TREE_CODE (t) == ARRAY_REF || TREE_CODE (t) == ARRAY_RANGE_REF)
 {
   while (((TREE_CODE (t) == ARRAY_REF || TREE_CODE (t) == ARRAY_RANGE_REF)
diff --git a/gcc/testsuite/gcc.target/i386/pr69482-1.c b/gcc/testsuite/gcc.target/i386/pr69482-1.c
new file mode 100644
index 000..f192261b104
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69482-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+static inline void memset_s(void* s, int n) {
+  volatile unsigned char * p = s;
+  for(int i = 0; i < n; ++i) {
+p[i] = 0;
+  }
+}
+
+void test() {
+  unsigned char x[4];
+  memset_s(x, sizeof x);
+}
+
+/* { dg-final { scan-assembler-times "mov" 4 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr69482-2.c b/gcc/testsuite/gcc.target/i386/pr69482-2.c
new file mode 100644
index 000..58e89a79333
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69482-2.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void bar ()
+{
+  int j;
+  *(volatile int *)&j = 0;
+}
+
+/* { dg-final { scan-assembler-times "mov" 1 } } */
-- 
2.35.3


Re: [patch, fortran] Fix common subexpression elimination with IEEE rounding (PR108329)

2023-01-09 Thread Richard Biener via Gcc-patches
On Sun, Jan 8, 2023 at 5:21 PM Thomas Koenig  wrote:
>
> Hi Richard,
>
> >> Am 08.01.2023 um 14:31 schrieb Paul Richard Thomas via Fortran 
> >> :
> >>
> >> Hi Thomas,
> >>
> >> Following your off-line explanation that the seemingly empty looking
> >> assembly line forces an effective reload from memory, all is now clear.
> >
> > It’s not a full fix (for register vars) and it’s ‚superior‘ to the call 
> > itself only because asm handling is implemented in a rather stupid way in 
> > the Alias oracle.  So I don’t think this is a „fix“ at all.
>
> There are no register variables in Fortran, this is Fortran FE only,
> and it is a fix in the sense that correct code is no longer miscompiled.

It's quite a big hammer, and the fact that it "works" is just luck.
The memory barrier implied by the ieee_set_rounding_mode call does not
help because by-reference passed arguments are marked by the frontend
so that they can be CSEd, since memory barriers may not affect them.

As said, the fact that this "works" is just because we're lazy on GIMPLE:

/* If the statement STMT may clobber the memory reference REF return true,
   otherwise return false.  */

bool
stmt_may_clobber_ref_p_1 (gimple *stmt, ao_ref *ref, bool tbaa_p)
{
...
  else if (gimple_code (stmt) == GIMPLE_ASM)
return true;

> There's a FIXME in the code pointing to the relevant PR precisely
> because I think that this is less than elegant (as do you, obviously).
> Do you have other suggestions how to implement this?  If PR 34678
> is solved, this would probably provide a mechanism that we could
> simply re-use.

There is no reliable way to get this correct at the moment and if there
were good and easy ways to get this working they'd be implemented already.

Richard.

> Best regards
>
> Thomas


Re: [PATCH] expr.cc: avoid unexpected side effects in expand_expr_divmod optimization

2023-01-09 Thread Richard Biener via Gcc-patches
On Mon, Jan 9, 2023 at 10:58 AM Jakub Jelinek  wrote:
>
> On Mon, Jan 09, 2023 at 09:05:26AM +0100, Richard Biener wrote:
> > On Wed, Jan 4, 2023 at 9:54 AM Jose E. Marchesi via Gcc-patches
> >  wrote:
> > >
> > >
> > > ping.
> > > Would this be a good approach for fixing the issue?
> >
> > adding the is_libcall bit enlarges rtx_def by 8 bytes - there's no room for 
> > more
> > bits here.
>
> That is obviously not the way to go, sure.
>
> > I really wonder how other targets avoid the issue you are pointing out?
> > Do their assemblers prune unused (extern) .global?
>
> I think no target solves this, if they see an extern call during expansion
> and emit some directive for those, they emit the global or whatever directive
> which remains there.
>
> If all bits for CALL_INSN are taken, can't we add a flag on the CALL
> rtx inside of the CALL_INSN pattern?  Or a flag on the SYMBOL_REF inside of
> it (libcalls are always direct calls, aren't they) or SYMBOL_REF_FLAGS ?

I suppose the SYMBOL_REF would be what I'd target here.  Note we already
have

/* 1 if RTX is a symbol_ref that has been the library function in
   emit_library_call.  */
#define SYMBOL_REF_USED(RTX)\
  (RTL_FLAG_CHECK1 ("SYMBOL_REF_USED", (RTX), SYMBOL_REF)->used)

so can't we just use that during the final scan for the delayed assembling?

>
> Jakub
>


Re: [PATCH] expr.cc: avoid unexpected side effects in expand_expr_divmod optimization

2023-01-09 Thread Jakub Jelinek via Gcc-patches
On Mon, Jan 09, 2023 at 02:04:48PM +0100, Richard Biener via Gcc-patches wrote:
> On Mon, Jan 9, 2023 at 10:58 AM Jakub Jelinek  wrote:
> >
> > On Mon, Jan 09, 2023 at 09:05:26AM +0100, Richard Biener wrote:
> > > On Wed, Jan 4, 2023 at 9:54 AM Jose E. Marchesi via Gcc-patches
> > >  wrote:
> > > >
> > > >
> > > > ping.
> > > > Would this be a good approach for fixing the issue?
> > >
> > > adding the is_libcall bit enlarges rtx_def by 8 bytes - there's no room 
> > > for more
> > > bits here.
> >
> > That is obviously not the way to go, sure.
> >
> > > I really wonder how other targets avoid the issue you are pointing out?
> > > Do their assemblers prune unused (extern) .global?
> >
> > I think no target solves this, if they see an extern call during expansion
> > and emit some directive for those, they emit the global or whatever 
> > directive
> > which remains there.
> >
> > If all bits for CALL_INSN are taken, can't we add a flag on the CALL
> > rtx inside of the CALL_INSN pattern?  Or a flag on the SYMBOL_REF inside of
> > it (libcalls are always direct calls, aren't they) or SYMBOL_REF_FLAGS ?
> 
> I suppose the SYMBOL_REF would be what I'd target here.  Note we already
> have
> 
> /* 1 if RTX is a symbol_ref that has been the library function in
>emit_library_call.  */
> #define SYMBOL_REF_USED(RTX)\
>   (RTL_FLAG_CHECK1 ("SYMBOL_REF_USED", (RTX), SYMBOL_REF)->used)
> 
> so can't we just use that during the final scan for the delayed assembling?

No, this one can't, it is used to avoid emitting the external directive
multiple times.  We need something next to it to identify for which symbols
that should be done.  Or of course if we are really out of bits that could
be used for it, the above could be repurposed for SYMBOL_REF_LIBCALL and
the current SYMBOL_REF_USED could be handled with a hash set.
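
As a rough illustration of the hash-set idea only (this is not GCC code;
the class name, the method names and the ".global" spelling are all
hypothetical), the "already emitted" bookkeeping could live in a set keyed
by symbol name instead of in a flag bit on the SYMBOL_REF:

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the "directive already emitted" bookkeeping that
// SYMBOL_REF_USED provides today.  A real implementation would sit in the
// assembler output code and key on the SYMBOL_REF, not on a std::string.
class ExternalDirectiveEmitter {
 public:
  // Emit the directive for SYMBOL unless it was emitted before.
  // Returns true if a new directive line was produced.
  bool emit_once(const std::string &symbol) {
    // insert() reports through .second whether the key was newly added.
    if (!emitted_.insert(symbol).second)
      return false;
    lines_.push_back(".global " + symbol);
    return true;
  }

  // The directive lines produced so far, in emission order.
  const std::vector<std::string> &lines() const { return lines_; }

 private:
  std::unordered_set<std::string> emitted_;  // symbols already handled
  std::vector<std::string> lines_;           // collected output
};
```

With something like this, calling emit_once twice for the same libcall
symbol produces a single directive, which would leave the flag bit free to
mean "this symbol is a libcall".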

Jakub



[nvptx PATCH] Correct pattern for popcountdi2 insn in nvptx.md.

2023-01-09 Thread Roger Sayle

The result of a POPCOUNT operation in RTL should have the same mode
as its operand.  This corrects the specification of popcount in
the nvptx backend, splitting the current generic define_insn into
two, one for popcountsi2 and the other for popcountdi2 (the latter
with an explicit truncate).

This patch has been tested on nvptx-none (hosted on x86_64-pc-linux-gnu)
with make and make -k check with no new failures.  This functionality is
already tested by gcc.target/nvptx/popc-[123].c.  Ok for mainline?


2023-01-09  Roger Sayle  

gcc/ChangeLog
* config/nvptx/nvptx.md (popcount2): Split into...
(popcountsi2): define_insn handling SImode popcount.
(popcountdi2): define_insn handling DImode popcount, with an
explicit truncate:SI to produce an SImode result.

Thanks in advance,
Roger
--

diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 740c4de..461540e 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -658,11 +658,18 @@
   DONE;
 })
 
-(define_insn "popcount2"
+(define_insn "popcountsi2"
   [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
-   (popcount:SI (match_operand:SDIM 1 "nvptx_register_operand" "R")))]
+   (popcount:SI (match_operand:SI 1 "nvptx_register_operand" "R")))]
   ""
-  "%.\\tpopc.b%T1\\t%0, %1;")
+  "%.\\tpopc.b32\\t%0, %1;")
+
+(define_insn "popcountdi2"
+  [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
+   (truncate:SI
+ (popcount:DI (match_operand:DI 1 "nvptx_register_operand" "R"))))]
+  ""
+  "%.\\tpopc.b64\\t%0, %1;")
 
 ;; Multiplication variants
 


Re: [PATCH] expr.cc: avoid unexpected side effects in expand_expr_divmod optimization

2023-01-09 Thread Jeff Law via Gcc-patches




On 1/9/23 02:57, Jakub Jelinek via Gcc-patches wrote:

On Mon, Jan 09, 2023 at 09:05:26AM +0100, Richard Biener wrote:

On Wed, Jan 4, 2023 at 9:54 AM Jose E. Marchesi via Gcc-patches
 wrote:



ping.
Would this be a good approach for fixing the issue?


adding the is_libcall bit enlarges rtx_def by 8 bytes - there's no room for more
bits here.


That is obviously not the way to go, sure.


I really wonder how other targets avoid the issue you are pointing out?
Do their assemblers prune unused (extern) .global?


I think no target solves this, if they see an extern call during expansion
and emit some directive for those, they emit the global or whatever directive
which remains there.

If all bits for CALL_INSN are taken, can't we add a flag on the CALL
rtx inside of the CALL_INSN pattern?  Or a flag on the SYMBOL_REF inside of
it (libcalls are always direct calls, aren't they) or SYMBOL_REF_FLAGS ?

You might look at 32bit PA SOM.  It was always a bit odd in this respect.

You had to import every external symbol explicitly and it disliked 
importing something that wasn't used.  I recall some special handling 
for libcalls as well.


Jeff


[PATCH] middle-end/108209 - typo in genmatch.cc:commutative_op

2023-01-09 Thread Richard Biener via Gcc-patches
The early out for user-id handling indicated commutative
rather than not commutative.
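
A minimal standalone sketch of the intended logic, assuming each substitute
has already been reduced to its commutative_op result (a non-negative
operand index, or -1 for "not commutative"); the function name and the
vector-of-int interface are made up for illustration and are not the
genmatch.cc API:

```cpp
#include <cstddef>
#include <vector>

// Combine the per-substitute results of a user-id.  Each entry is assumed to
// be the commutative operand index of one substitute, or -1 if that
// substitute is not commutative.
int common_commutative_index(const std::vector<int> &substitute_results) {
  if (substitute_results.empty())
    return -1;
  int res = substitute_results[0];
  // The early out: a non-commutative first substitute makes the whole
  // user-id non-commutative, so this must be -1 and not 0.
  if (res < 0)
    return -1;
  // All remaining substitutes have to agree with the first one.
  for (std::size_t i = 1; i < substitute_results.size(); ++i)
    if (res != substitute_results[i])
      return -1;
  return res;
}
```

Returning 0 from the early out would have claimed commutativity for a
user-id whose first substitute is not commutative.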

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR middle-end/108209
* genmatch.cc (commutative_op): Fix return value for
user-id with non-commutative first replacement.
---
 gcc/genmatch.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
index fb9e37ae434..d4cb439a851 100644
--- a/gcc/genmatch.cc
+++ b/gcc/genmatch.cc
@@ -496,7 +496,7 @@ commutative_op (id_base *id)
 {
   int res = commutative_op (uid->substitutes[0]);
   if (res < 0)
-   return 0;
+   return -1;
   for (unsigned i = 1; i < uid->substitutes.length (); ++i)
if (res != commutative_op (uid->substitutes[i]))
  return -1;
-- 
2.35.3


[COMMITTED] ada: Simplify finalization of temporaries created for interface objects

2023-01-09 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

The expansion of (class-wide) interface objects generates a temporary that
holds the actual data and the objects are rewritten as the renaming of the
dereference at the interface tag present in it. These temporaries may need
to be finalized and this is currently done through the renamings, by using
pattern matching to recognize the original source constructs.

Now these temporaries may also need to be adjusted and this is currently
done "naturally", i.e. by using the standard machinery for them, so there
is no fundamental reason why the finalization cannot be done this way too.

Therefore this change removes the special machinery implemented for their
finalization and lets them be handled by the standard one instead.

gcc/ada/

* exp_util.ads (Is_Tag_To_Class_Wide_Conversion): Delete.
(Is_Displacement_Of_Object_Or_Function_Result): Likewise.
* exp_util.adb (Is_Tag_To_Class_Wide_Conversion): Rename to...
(Is_Temporary_For_Interface_Object): ...this.
(Is_Finalizable_Transient): Adjust call to above renaming.
(Is_Displacement_Of_Object_Or_Function_Result): Delete.
(Requires_Cleanup_Actions): Remove special handling of the
temporaries created for interface objects.
* exp_ch7.adb (Build_Finalizer): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch7.adb  |  28 +
 gcc/ada/exp_util.adb | 277 +--
 gcc/ada/exp_util.ads |  12 --
 3 files changed, 31 insertions(+), 286 deletions(-)

diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb
index 4cb26890ea2..f29a97a0ceb 100644
--- a/gcc/ada/exp_ch7.adb
+++ b/gcc/ada/exp_ch7.adb
@@ -2264,16 +2264,13 @@ package body Exp_Ch7 is
--  The object is of the form:
--Obj : [constant] Typ [:= Expr];
 
-   --  Do not process tag-to-class-wide conversions because they do
-   --  not yield an object. Do not process the incomplete view of a
-   --  deferred constant. Note that an object initialized by means
-   --  of a build-in-place function call may appear as a deferred
-   --  constant after expansion activities. These kinds of objects
-   --  must be finalized.
+   --  Do not process the incomplete view of a deferred constant.
+   --  Note that an object initialized by means of a BIP function
+   --  call may appear as a deferred constant after expansion
+   --  activities. These kinds of objects must be finalized.
 
elsif not Is_Imported (Obj_Id)
  and then Needs_Finalization (Obj_Typ)
- and then not Is_Tag_To_Class_Wide_Conversion (Obj_Id)
  and then not (Ekind (Obj_Id) = E_Constant
 and then not Has_Completion (Obj_Id)
 and then No (BIP_Initialization_Call (Obj_Id)))
@@ -2388,23 +2385,6 @@ package body Exp_Ch7 is
  and then Present (Status_Flag_Or_Transient_Decl (Obj_Id))
then
   Processing_Actions (Has_No_Init => True);
-
-   --  Detect a case where a source object has been initialized by
-   --  a controlled function call or another object which was later
-   --  rewritten as a class-wide conversion of Ada.Tags.Displace:
-
-   -- Obj1 : CW_Type := Function_Call (...);
-   -- Obj2 : CW_Type := Src_Obj;
-
-   -- Tmp  : ... := Function_Call (...)'reference;
-   -- Rnn  : access CW_Type := (... Ada.Tags.Displace (Tmp));
-   -- Obj1 : CW_Type renames Rnn.all;
-
-   -- Rnn : access CW_Type := (...Ada.Tags.Displace (Src_Obj));
-   -- Obj2 : CW_Type renames Rnn.all;
-
-   elsif Is_Displacement_Of_Object_Or_Function_Result (Obj_Id) then
-  Processing_Actions (Has_No_Init => True);
end if;
 
 --  Inspect the freeze node of an access-to-controlled type and
diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
index ab4b18da538..e89c6a91e60 100644
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -168,9 +168,10 @@ package body Exp_Util is
--  Force evaluation of bounds of a slice, which may be given by a range
--  or by a subtype indication with or without a constraint.
 
-   function Is_Verifiable_DIC_Pragma (Prag : Node_Id) return Boolean;
-   --  Determine whether pragma Default_Initial_Condition denoted by Prag has
-   --  an assertion expression that should be verified at run time.
+   function Is_Temporary_For_Interface_Object
+ (Obj_Id : Entity_Id) return Boolean;
+   --  Determine whether Obj_Id is a temporary created for the handling of a
+   --  (class-wide) interface object.
 
function Is_Uninitialized_Aggregate
  (Exp : Node_Id;
@@ -182,6 +183,10 @@ pa

[COMMITTED] ada: Remove a couple of unreachable statements

2023-01-09 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

The "then" arm of these nested if-statements is trivially unreachable.

gcc/ada/

* exp_ch7.adb (Make_Adjust_Call): Remove unreachable statement.
(Make_Final_Call): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch7.adb | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb
index f29a97a0ceb..86878616f6a 100644
--- a/gcc/ada/exp_ch7.adb
+++ b/gcc/ada/exp_ch7.adb
@@ -6043,11 +6043,7 @@ package body Exp_Ch7 is
   --  Derivations from [Limited_]Controlled
 
   elsif Is_Controlled (Utyp) then
- if Has_Controlled_Component (Utyp) then
-Adj_Id := Find_Optional_Prim_Op (Utyp, TSS_Deep_Adjust);
- else
-Adj_Id := Find_Optional_Prim_Op (Utyp, Name_Of (Adjust_Case));
- end if;
+ Adj_Id := Find_Optional_Prim_Op (Utyp, Name_Of (Adjust_Case));
 
   --  Tagged types
 
@@ -8396,11 +8392,7 @@ package body Exp_Ch7 is
   --  Derivations from [Limited_]Controlled
 
   elsif Is_Controlled (Utyp) then
- if Has_Controlled_Component (Utyp) then
-Fin_Id := Find_Optional_Prim_Op (Utyp, TSS_Deep_Finalize);
- else
-Fin_Id := Find_Optional_Prim_Op (Utyp, Name_Of (Finalize_Case));
- end if;
+ Fin_Id := Find_Optional_Prim_Op (Utyp, Name_Of (Finalize_Case));
 
   --  Tagged types
 
-- 
2.34.1



Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping if necessary

2023-01-09 Thread Andrea Corallo via Gcc-patches
Andrea Corallo via Gcc-patches  writes:

> Richard Earnshaw  writes:
>
>> On 27/09/2022 16:24, Kyrylo Tkachov via Gcc-patches wrote:
>>> 
 -Original Message-
 From: Andrea Corallo 
 Sent: Tuesday, September 27, 2022 11:06 AM
 To: Kyrylo Tkachov 
 Cc: Andrea Corallo via Gcc-patches ; Richard
 Earnshaw ; nd 
 Subject: Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when
 popping if necessary

 Kyrylo Tkachov  writes:

> Hi Andrea,
>
>> -Original Message-
>> From: Gcc-patches > bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Andrea
>> Corallo via Gcc-patches
>> Sent: Friday, August 12, 2022 4:34 PM
>> To: Andrea Corallo via Gcc-patches 
>> Cc: Richard Earnshaw ; nd 
>> Subject: [PATCH 9/15] arm: Set again stack pointer as CFA reg when
 popping
>> if necessary
>>
>> Hi all,
>>
>> this patch enables 'arm_emit_multi_reg_pop' to set again the stack
>> pointer as CFA reg when popping if this is necessary.
>>
>
>  From what I can tell from similar functions this is correct, but could 
> you
 elaborate on why this change is needed for my understanding please?
> Thanks,
> Kyrill

 Hi Kyrill,

 sure, if the frame pointer was set, then it is the current CFA register.
 If we request to adjust the current CFA register offset indicating it
 being SP (while it's actually FP) that is indeed not correct and the
 incoherence will be detected by an assertion in the dwarf emission
 machinery.
>>> Thanks,  the patch is ok
>>> Kyrill
>>> 

 Best Regards

Andrea
>>
>> Hmm, wait.  Why would a multi-reg pop be updating the stack pointer?
>
> Hi Richard,
>
> not sure I understand, isn't any pop updating SP by definition?


Back on this,

compiling:

===
int i;

void foo (int);

int bar()
{
  foo (i);
  return 0;
}
===

With -march=armv8.1-m.main+fp -mbranch-protection=pac-ret+leaf -mthumb -O0 -g

Produces the following asm for bar.

bar:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, uses_anonymous_args = 0
pac ip, lr, sp
push {r3, r7, ip, lr}
add r7, sp, #0
ldr r3, .L3
ldr r3, [r3]
mov r0, r3
bl  foo
movs r3, #0
mov r0, r3
pop {r3, r7, ip, lr}
aut ip, lr, sp
bx  lr

The offending instruction causing the ICE (without this patch) when
emitting dwarf is "pop {r3, r7, ip, lr}".

The current CFA reg when emitting the multipop is R7 (the frame
pointer).  If it is not the multipop that has the duty to restore SP as
the current CFA here, which other instruction should do it?

Best Regards

  Andrea


[x86 PATCH] PR rtl-optimization/107991: peephole2 to tweak register allocation.

2023-01-09 Thread Roger Sayle

This patch addresses PR rtl-optimization/107991, which is a P2 regression
where GCC currently requires more "mov" instructions than GCC 7.

The x86's two-address ISA creates some interesting challenges for reload.
For example, the tricky "x = y - x" usually needs to be implemented on x86
as

tmp = x
x = y
x -= tmp

where a scratch register and two mov's are required to work around
the lack of a subf (subtract from) or rsub (reverse subtract) insn.

Not uncommonly, if y is dead after this subtraction, register allocation
can be improved by clobbering y.

y -= x
x = y

For the testcase in PR 107991, things are slightly more complicated,
where y is not itself dead, but is assigned from (i.e. equivalent to)
a value that is dead.  Hence we have something like:

y = z
x = y - x

so, GCC's reload currently generates the expected shuffle (as y is live):

y = z
tmp = x
x = y
x -= tmp

but we can use a peephole2 that understands that y and z are equivalent,
and that z is dead, to produce the shorter sequence:

y = z
z -= x
x = z
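
The equivalence of the two sequences can be checked with a small
simulation.  This is only a sketch: the Regs struct and the function names
are invented for illustration, unsigned arithmetic is used so wraparound is
well defined, and z is treated as dead afterwards, so only x and y are
compared.

```cpp
// Three pseudo registers, named as in the sequences above.
struct Regs {
  unsigned x, y, z;
};

// The shuffle that reload currently generates:
//   y = z; tmp = x; x = y; x -= tmp
Regs reload_sequence(Regs r) {
  r.y = r.z;
  unsigned tmp = r.x;
  r.x = r.y;
  r.x -= tmp;
  return r;
}

// The shorter sequence the peephole2 aims for:
//   y = z; z -= x; x = z   (z is clobbered, which is fine if z is dead)
Regs peephole_sequence(Regs r) {
  r.y = r.z;
  r.z -= r.x;
  r.x = r.z;
  return r;
}

// True if both sequences leave the same values in the live registers x and y.
bool same_live_values(Regs in) {
  Regs a = reload_sequence(in);
  Regs b = peephole_sequence(in);
  return a.x == b.x && a.y == b.y;
}
```

Starting from x = 5 and z = 12, both sequences end with x = 7 and y = 12;
they differ only in z, which is why z has to be dead for the transformation
to be valid.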

In practice, for the new testcase from PR 107991, which before produced:

foo:    movl    %edx, %ecx
        movl    %esi, %edx
        movl    %esi, %eax
        subl    %ecx, %edx
        testb   %dil, %dil
        cmovne  %edx, %eax
        ret

with this patch/peephole2 we now produce the much improved:

foo:    movl    %esi, %eax
        subl    %edx, %esi
        testb   %dil, %dil
        cmovne  %esi, %eax
        ret


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2023-01-09  Roger Sayle  

gcc/ChangeLog
PR rtl-optimization/107991
* config/i386/i386.md (peephole2): New peephole2 to avoid register
shuffling before a subtraction, after a register-to-register move.

gcc/testsuite/ChangeLog
PR rtl-optimization/107991
* gcc.target/i386/pr107991.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 76f55ec..3090cea 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -7603,6 +7603,31 @@
   "sub{l}\t{%2, %1|%1, %2}"
   [(set_attr "type" "alu")
(set_attr "mode" "SI")])
+
+;; PR 107991: Use peephole2 to avoid shuffling before subtraction.
+;; ax = si; cx = dx; dx = ax; dx -= cx where both si and cx
+;; are dead becomes ax = si; si -= dx; dx = si.
+(define_peephole2
+  [(set (match_operand:SWI 0 "general_reg_operand")
+   (match_operand:SWI 1 "general_reg_operand"))
+   (set (match_operand:SWI 2 "general_reg_operand")
+   (match_operand:SWI 3 "general_reg_operand"))
+   (set (match_dup 3) (match_dup 0))
+   (parallel
+ [(set (match_dup 3) (minus:SWI (match_dup 3) (match_dup 2)))
+  (clobber (reg:CC FLAGS_REG))])]
+  "REGNO (operands[0]) != REGNO (operands[1])
+   && REGNO (operands[0]) != REGNO (operands[2])
+   && REGNO (operands[0]) != REGNO (operands[3])
+   && REGNO (operands[1]) != REGNO (operands[2])
+   && REGNO (operands[1]) != REGNO (operands[3])
+   && REGNO (operands[2]) != REGNO (operands[3])
+   && peep2_reg_dead_p (1, operands[1])
+   && peep2_reg_dead_p (4, operands[2])"
+  [(set (match_dup 0) (match_dup 1))
+   (parallel [(set (match_dup 1) (minus:SWI (match_dup 1) (match_dup 3)))
+ (clobber (reg:CC FLAGS_REG))])
+   (set (match_dup 3) (match_dup 1))])
 
 ;; Add with carry and subtract with borrow
 
diff --git a/gcc/testsuite/gcc.target/i386/pr107991.c 
b/gcc/testsuite/gcc.target/i386/pr107991.c
new file mode 100644
index 000..9d0d9b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr107991.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2" } */
+
+int foo(_Bool b, int i, int j) {
+return b ? i - j : i;
+}
+
+int bar(_Bool b, int i, int j) {
+return i + (b ? -j : 0);
+}
+
+int baz(_Bool b, int i, int j) {
+return i - (b ? j : 0);
+}
+
+/* { dg-final { scan-assembler-times "movl" 3 } } */


Re: gcc-13/changes.html: Mention -fstrict-flex-arrays and its impact

2023-01-09 Thread Qing Zhao via Gcc-patches


> On Jan 9, 2023, at 2:11 AM, Richard Biener  wrote:
> 
> On Thu, 22 Dec 2022, Qing Zhao wrote:
> 
>> 
>> 
>>> On Dec 22, 2022, at 2:09 AM, Richard Biener  wrote:
>>> 
>>> On Wed, 21 Dec 2022, Qing Zhao wrote:
>>> 
 Hi, Richard,
 
 Thanks a lot for your comments.
 
> On Dec 21, 2022, at 2:12 AM, Richard Biener  wrote:
> 
> On Tue, 20 Dec 2022, Qing Zhao wrote:
> 
>> Hi,
>> 
>> This is the patch for mentioning -fstrict-flex-arrays and 
>> -Warray-bounds=2 changes in gcc-13/changes.html.
>> 
>> Let me know if you have any comment or suggestions.
> 
> Some copy editing below
> 
>> Thanks.
>> 
>> Qing.
>> 
>> ===
>> From c022076169b4f1990b91f7daf4cc52c6c5535228 Mon Sep 17 00:00:00 2001
>> From: Qing Zhao 
>> Date: Tue, 20 Dec 2022 16:13:04 +
>> Subject: [PATCH] gcc-13/changes: Mention -fstrict-flex-arrays and its 
>> impact.
>> 
>> ---
>> htdocs/gcc-13/changes.html | 15 +++
>> 1 file changed, 15 insertions(+)
>> 
>> diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
>> index 689178f9..47b3d40f 100644
>> --- a/htdocs/gcc-13/changes.html
>> +++ b/htdocs/gcc-13/changes.html
>> @@ -39,6 +39,10 @@ a work-in-progress.
>>   Legacy debug info compression option -gz=zlib-gnu was 
>> removed
>> and the option is ignored right now.
>>   New debug info compression option value -gz=zstd has 
>> been added.
>> +-Warray-bounds=2 will no longer issue warnings for 
>> out of bounds
>> +  accesses to trailing struct members of one-element array type 
>> anymore. Please
>> +  add -fstrict-flex-arrays=level to control how the 
>> compiler treat
>> +  trailing arrays of structures as flexible array members. 
> 
> "Instead it diagnoses accesses to trailing arrays according to 
> -fstrict-flex-arrays."
 
 Okay.
> 
>> 
>> 
>> 
>> @@ -409,6 +413,17 @@ a work-in-progress.
>> Other significant improvements
>> 
>> 
>> +Treating trailing arrays as flexible array 
>> members
>> +
>> +
>> + GCC can now control when to treat the trailing array of a 
>> structure as a 
>> + flexible array member for the purpose of accessing the elements of 
>> such
>> + an array. By default, all trailing arrays of structures are 
>> treated as
> 
> all trailing arrays in aggregates are treated
 Okay.
> 
>> + flexible array members. Use the new command-line option
>> + -fstrict-flex-array=level to control how GCC treats 
>> the trailing
>> + array of a structure as a flexible array member at different 
>> levels.
> 
> -fstrict-flex-arrays to control which trailing array
> members are treated as flexible arrays.
 
 Okay.
 
> 
> I've also just now noticed that there's now a flag_strict_flex_arrays
> check in the middle-end (in array bound diagnostics) but this option
> isn't streamed or handled with LTO.  I think you want to replace that
> with the appropriate DECL_NOT_FLEXARRAY check.
 
 We need to know the level value of the strict_flex_arrays on the struct 
 field to issue proper warnings at different levels. DECL_NOT_FLEXARRAY 
 does not include such info. So, what should I do? Streaming the 
 flag_strict_flex_arrays with LTO?
>>> 
>>> But you do
>>> 
>>> if (compref)
>>>   {
>>> /* Try to determine special array member type for this 
>>> COMPONENT_REF.  */
>>> sam = component_ref_sam_type (arg);
>>> /* Get the level of strict_flex_array for this array field.  */
>>> tree afield_decl = TREE_OPERAND (arg, 1);
>>> strict_flex_array_level = strict_flex_array_level_of (afield_decl);
>>> 
>>> I see that function doesn't look at DECL_NOT_FLEXARRAY but just
>>> checks attributes (those are streamed in LTO).
>> 
>> Yes, checked both flag_strict_flex_arrays and attributes. 
>> 
>> There are two places in middle end calling “strict_flex_array_level_of” 
>> function, 
>> one inside “array_bounds_checker::check_array_ref”, another one inside 
>> “component_ref_size”.
>> Shall we check DECL_NOT_FLEXARRAY field instead of calling 
>> “strict_flex_array_level_of” in both places?
> 
> I wonder if that function should check DECL_NOT_FLEXARRAY?

The function “strict_flex_array_level_of” is intended to query the LEVEL of
strict_flex_array; checking only DECL_NOT_FLEXARRAY is not enough.

So, I think the major question here is:

Do we need the LEVEL of strict_flex_array information in the middle end?

The current major use of the LEVEL of strict_flex_array in the middle end is in
two places:

1. In the routine “component_ref_size”: to determine the size of the 
trailing array based on the level of the strict_flex_array.
2. In the routine “array_bounds_chec

Re: Java front-end and library patches v2

2023-01-09 Thread Max Downey Twiss via Gcc-patches
The astute among you may have noticed that I in fact sent no patches.

This is for two reasons--

1. Google did something behind the scenes and broke git send-email for me again

2. Thanks to Iain Sandoe, the build now completes, and he provided a
couple other fixes as well. There are still a couple issues
post-build, though, and it makes more sense to work through those
before posting a v2.


Re: [PATCH] Remove legacy pre-C++ 11 definitions

2023-01-09 Thread Martin Liška

On 1/6/23 19:23, Jonathan Wakely wrote:

Seems to me that GCC code should just use nullptr directly not redefine NULL.


Sure, but that would lead to a huge patch which would rename NULL to nullptr, 
right?

Martin


Re: [PATCH] ipa: Sort ipa_param_body_adjustments::m_replacements (PR 108110)

2023-01-09 Thread Martin Liška

On 1/6/23 17:50, Martin Jambor wrote:

Hi,

On Fri, Jan 06 2023, Martin Liška wrote:

Hi Martin


+  key.unit_offset = unit_offset;
+  ipa_param_body_replacement *res
+= std::lower_bound (m_replacements.begin (), m_replacements.end (), key,
+   [] (const ipa_param_body_replacement &elt,
+   const ipa_param_body_replacement &val)
+   {
+ if (DECL_UID (elt.base) < DECL_UID (val.base))
+   return true;
+ if (DECL_UID (elt.base) > DECL_UID (val.base))
+   return false;
+ if (elt.unit_offset < val.unit_offset)
+   return true;
+ return false;
+   });


I'm curious if we can re-use compare_param_body_replacement as the introduced
lambda does a very similar thing, right?



Not directly, the qsort callback returns an integer that can be either
negative, positive or zero but the lower_bound comparator returns only
true or false (the semantics is that lower_bound returns the first element
for which the comparator returns false).  Plus one takes parameters which
are pointers and the other needs references.


Hi.

I see, so leaving that up to you if you want to adjust it or not.

OK for both versions of the patch.

Cheers,
Martin



So I was lazy and just came up with a similar comparator lambda.  But
sure, I can call the qsort comparator from it, which I guess makes sense
at least for consistency.  I'll adjust the patch.
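
A sketch of what calling a qsort-style comparator from the lower_bound
lambda could look like.  The Replacement struct, compare_replacements and
lookup are simplified stand-ins invented for illustration, not the actual
ipa_param_body_replacement code:

```cpp
#include <algorithm>
#include <vector>

// Simplified stand-in for a replacement record: keyed by the DECL_UID of the
// base and by the unit offset.
struct Replacement {
  unsigned base_uid;
  unsigned unit_offset;
};

// qsort-style three-way comparator: negative, zero or positive.
int compare_replacements(const void *a, const void *b) {
  const Replacement *ra = static_cast<const Replacement *>(a);
  const Replacement *rb = static_cast<const Replacement *>(b);
  if (ra->base_uid != rb->base_uid)
    return ra->base_uid < rb->base_uid ? -1 : 1;
  if (ra->unit_offset != rb->unit_offset)
    return ra->unit_offset < rb->unit_offset ? -1 : 1;
  return 0;
}

// Binary search in a vector sorted with compare_replacements.  The lambda
// adapts the three-way result to the "less than" predicate that
// std::lower_bound expects, and takes references where qsort takes pointers.
const Replacement *lookup(const std::vector<Replacement> &sorted,
                          unsigned base_uid, unsigned unit_offset) {
  Replacement key{base_uid, unit_offset};
  auto it = std::lower_bound(
      sorted.begin(), sorted.end(), key,
      [](const Replacement &elt, const Replacement &val) {
        return compare_replacements(&elt, &val) < 0;
      });
  // lower_bound only finds the first element not less than the key, so an
  // exact match still has to be confirmed.
  if (it == sorted.end() || compare_replacements(&*it, &key) != 0)
    return nullptr;
  return &*it;
}
```

Sharing the comparator this way keeps the sort order and the search order
defined in one place.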

Thanks,

Martin




Re: [PATCH] Remove legacy pre-C++ 11 definitions

2023-01-09 Thread Jonathan Wakely via Gcc-patches
On Mon, 9 Jan 2023 at 15:17, Martin Liška  wrote:
>
> On 1/6/23 19:23, Jonathan Wakely wrote:
> > Seems to me that GCC code should just use nullptr directly not redefine 
> > NULL.
>
> Sure, but that would lead to a huge patch which would rename NULL to nullptr, 
> right?


Yeah, which can probably be done separately (or not done at all). I
was just commenting on the comment that Andrew showed. That comment
explains that nullptr is better than 0 as a null pointer constant,
which is a good reason to prefer nullptr. But not a good reason to
redefine NULL; in code with a minimum requirement of C++11 you can
just use nullptr directly.
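
One standard illustration of why nullptr is the better null pointer
constant is overload resolution.  The enum and the function name below are
invented for the example:

```cpp
// Records which overload was selected.
enum class Picked { Int, Pointer };

// Two overloads that differ only in taking an integer or a pointer.
Picked which(int) { return Picked::Int; }
Picked which(const char *) { return Picked::Pointer; }

// The literal 0 has type int, so it selects the int overload even when a
// null pointer was intended.
Picked pass_zero() { return which(0); }

// nullptr has type std::nullptr_t, which converts to any pointer type but
// not to int, so it selects the pointer overload.
Picked pass_nullptr() { return which(nullptr); }
```

Passing NULL here is left out on purpose: depending on how the
implementation defines it, the call may pick the int overload or be
rejected as ambiguous.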


ada: Update copyright notice

2023-01-09 Thread Marc Poulhiès via Gcc-patches


Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* accessibility.adb, accessibility.ads, ada_get_targ.adb: Update 
copyright year.
* adabkend.adb, adabkend.ads, adadecode.c, adadecode.h, adaint.c: 
Likewise.
* adaint.h, affinity.c, ali-util.adb, ali-util.ads, ali.adb: Likewise.
* ali.ads, alloc.ads, argv-lynxos178-raven-cert.c, argv.c: Likewise.
* aspects.adb, aspects.ads, atree.adb, atree.ads, atree.h: Likewise.
* aux-io.c, back_end.adb, back_end.ads, backend_utils.adb: Likewise.
* backend_utils.ads, bcheck.adb, bcheck.ads, binde.adb, binde.ads: 
Likewise.
* binderr.adb, binderr.ads, bindgen.adb, bindgen.ads: Likewise.
* bindo-augmentors.adb, bindo-augmentors.ads, bindo-builders.adb: 
Likewise.
* bindo-builders.ads, bindo-diagnostics.adb: Likewise.
* bindo-diagnostics.ads, bindo-elaborators.adb: Likewise.
* bindo-elaborators.ads, bindo-graphs.adb, bindo-graphs.ads: Likewise.
* bindo-units.adb, bindo-units.ads, bindo-validators.adb: Likewise.
* bindo-validators.ads, bindo-writers.adb, bindo-writers.ads: Likewise.
* bindo.adb, bindo.ads, bindusg.adb, bindusg.ads, butil.adb: Likewise.
* butil.ads, cal.c, casing.adb, casing.ads, checks.adb: Likewise.
* checks.ads, cio.c, clean.adb, clean.ads: Likewise.
* comperr.adb, comperr.ads, contracts.adb, contracts.ads: Likewise.
* csets.adb, csets.ads, cstand.adb: Likewise.
* cstand.ads, cstreams.c, ctrl_c.c, debug.adb, debug.ads: Likewise.
* debug_a.adb, debug_a.ads, einfo-utils.adb, einfo-utils.ads: Likewise.
* einfo.adb, einfo.ads, elists.adb, elists.ads, elists.h, env.c: 
Likewise.
* env.h, err_vars.ads, errno.c, errout.adb, errout.ads: Likewise.
* erroutc.adb, erroutc.ads, errutil.adb, errutil.ads: Likewise.
* eval_fat.adb, eval_fat.ads, exit.c, exp_aggr.adb, exp_aggr.ads: 
Likewise.
* exp_atag.adb, exp_atag.ads, exp_attr.adb, exp_attr.ads: Likewise.
* exp_cg.adb, exp_cg.ads, exp_ch10.ads, exp_ch11.adb: Likewise.
* exp_ch11.ads, exp_ch12.adb, exp_ch12.ads, exp_ch13.adb: Likewise.
* exp_ch13.ads, exp_ch2.adb, exp_ch2.ads, exp_ch3.adb: Likewise.
* exp_ch3.ads, exp_ch4.adb, exp_ch4.ads, exp_ch5.adb, exp_ch5.ads: 
Likewise.
* exp_ch6.adb, exp_ch6.ads, exp_ch7.adb, exp_ch7.ads, exp_ch8.adb: 
Likewise.
* exp_ch8.ads, exp_ch9.adb, exp_ch9.ads, exp_code.adb: Likewise.
* exp_code.ads, exp_dbug.adb, exp_dbug.ads, exp_disp.adb: Likewise.
* exp_disp.ads, exp_dist.adb, exp_dist.ads, exp_fixd.adb: Likewise.
* exp_fixd.ads, exp_imgv.adb, exp_imgv.ads, exp_intr.adb: Likewise.
* exp_intr.ads, exp_pakd.adb, exp_pakd.ads, exp_prag.adb: Likewise.
* exp_prag.ads, exp_put_image.adb, exp_put_image.ads, exp_sel.adb: 
Likewise.
* exp_sel.ads, exp_smem.adb, exp_smem.ads, exp_spark.adb: Likewise.
* exp_spark.ads, exp_strm.adb, exp_strm.ads, exp_tss.adb: Likewise.
* exp_tss.ads, exp_unst.adb, exp_unst.ads, exp_util.adb: Likewise.
* exp_util.ads, expander.adb, expander.ads, expect.c, fe.h: Likewise.
* final.c, fmap.adb, fmap.ads, fname-sf.adb, fname-sf.ads: Likewise.
* fname-uf.adb, fname-uf.ads, fname.adb, fname.ads, freeze.adb: 
Likewise.
* freeze.ads, frontend.adb, frontend.ads, gen_il-fields.ads: Likewise.
* gen_il-gen-gen_entities.adb, gen_il-gen-gen_nodes.adb: Likewise.
* gen_il-gen.adb, gen_il-gen.ads, gen_il-internals.adb: Likewise.
* gen_il-internals.ads, gen_il-main.adb, gen_il-types.ads: Likewise.
* gen_il.adb, gen_il.ads, get_scos.adb, get_scos.ads: Likewise.
* get_targ.adb, get_targ.ads, ghost.adb, ghost.ads, gnat1drv.adb: 
Likewise.
* gnat1drv.ads, gnat_cuda.adb, gnat_cuda.ads: Likewise.
* gnatbind.adb, gnatbind.ads, gnatchop.adb: Likewise.
* gnatclean.adb, gnatcmd.adb, gnatcmd.ads, gnatdll.adb: Likewise.
* gnatkr.adb, gnatkr.ads, gnatlink.adb, gnatlink.ads, gnatls.adb: 
Likewise.
* gnatls.ads, gnatmake.adb, gnatmake.ads, gnatname.adb: Likewise.
* gnatname.ads, gnatprep.adb, gnatprep.ads: Likewise.
* gprep.adb, gprep.ads, gsocket.h: Likewise.
* hostparm.ads: Likewise.
* impunit.adb, impunit.ads, indepsw-aix.adb, indepsw-darwin.adb: 
Likewise.
* indepsw-gnu.adb, indepsw.adb, indepsw.ads, init.c: Likewise.
* initialize.c, inline.adb, inline.ads, itypes.adb, itypes.ads: 
Likewise.
* krunch.adb, krunch.ads, layout.adb, layout.ads: Likewise.
* lib-list.adb, lib-load.adb, lib-load.ads, lib-sort.adb: Likewise.
* lib-util.adb, lib-util.ads, lib-writ.adb, lib-writ.ads: Likewise.
* lib-xref-spark_specific.adb, lib-xref.adb, lib-xref.ads: Likewise.
* lib.adb, lib.ads, libgnarl/a-astaco.adb, libgnarl/a-dispat.adb: 
Likewise.
* libgnarl/a-dynpri.adb, libgnarl/

Re: [patch, fortran] Fix common subexpression elimination with IEEE rounding (PR108329)

2023-01-09 Thread Thomas Koenig via Gcc-patches

Hi Richard,


There is no reliable way to get this correct at the moment and if there
were good and easy ways to get this working they'd be implemented already.


OK, I then withdraw the patch (and have unassigned myself from the PR).

Best regards

Thomas


Re: [PATCH] c++: Define built-in for std::tuple_element [PR100157]

2023-01-09 Thread Patrick Palka via Gcc-patches
On Wed, 5 Oct 2022, Patrick Palka wrote:

> On Thu, 7 Jul 2022, Jonathan Wakely via Gcc-patches wrote:
> 
> > This adds a new built-in to replace the recursive class template
> > instantiations done by traits such as std::tuple_element and
> > std::variant_alternative. The purpose is to select the Nth type from a
> > list of types, e.g. __builtin_type_pack_element(1, char, int, float) is
> > int.
> > 
> > For a pathological example tuple_element_t<1000, tuple<2000 types...>>
> > the compilation time is reduced by more than 90% and the memory  used by
> > the compiler is reduced by 97%. In realistic examples the gains will be
> > much smaller, but still relevant.
> > 
> > Clang has a similar built-in, __type_pack_element, but that's a
> > "magic template" built-in using <> syntax, which GCC doesn't support. So
> > this provides an equivalent feature, but as a built-in function using
> > parens instead of <>. I don't really like the name "type pack element"
> > (it gives you an element from a pack of types) but the semi-consistency
> > with Clang seems like a reasonable argument in favour of keeping the
> > name. I'd be open to alternative names though, e.g. __builtin_nth_type
> > or __builtin_type_at_index.
> 
> Rather than giving the trait a different name from __type_pack_element,
> I wonder if we could just special case cp_parser_trait to expect <>
> instead of parens for this trait?
> 
> Btw the frontend recently got a generic TRAIT_TYPE tree code, which gets
> rid of much of the boilerplate of adding a new type-yielding built-in
> trait, see e.g. cp-trait.def.

Here's a tested patch based on Jonathan's original patch that implements
the built-in in terms of TRAIT_TYPE, names it __type_pack_element
instead of __builtin_type_pack_element, and treats invocations of it
like a template-id instead of a call (to match Clang).
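
For readers following along, here is a minimal sketch of what the built-in
replaces. The names nth_type, nth_type_t and nth_type_recursive are made up
for this illustration and are not part of the patch. The recursive fallback
stands in for the class template instantiations that traits like
std::tuple_element perform today. The built-in path is only taken when the
compiler reports __type_pack_element through __has_builtin, and it wraps the
built-in in a struct, on the assumption that this keeps the unmangleable
trait out of function signatures.

```cpp
#include <cstddef>
#include <type_traits>

// Portable fallback: select the Nth type by recursive inheritance.
// Every step instantiates one more class template, which is the cost
// the built-in is meant to avoid.
template<std::size_t N, typename First, typename... Rest>
struct nth_type_recursive : nth_type_recursive<N - 1, Rest...> {};

template<typename First, typename... Rest>
struct nth_type_recursive<0, First, Rest...> { using type = First; };

#ifndef __has_builtin
# define __has_builtin(x) 0   // older compilers: always use the fallback
#endif

#if __has_builtin(__type_pack_element)
// Built-in path (hypothetical wrapper name): one lookup, no recursion.
template<std::size_t N, typename... Ts>
struct nth_type { using type = __type_pack_element<N, Ts...>; };
#else
template<std::size_t N, typename... Ts>
struct nth_type : nth_type_recursive<N, Ts...> {};
#endif

template<std::size_t N, typename... Ts>
using nth_type_t = typename nth_type<N, Ts...>::type;

// Same example as in the commit message: element 1 of (char, int, float).
static_assert(std::is_same<nth_type_t<1, char, int, float>, int>::value,
              "element 1 should be int");
```

Both paths give the same type; only the amount of work done by the
compiler differs.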

-- >8 --

Subject: [PATCH] c++: Define built-in for std::tuple_element [PR100157]

This adds a new built-in to replace the recursive class template
instantiations done by traits such as std::tuple_element and
std::variant_alternative.  The purpose is to select the Nth type from a
list of types, e.g. __type_pack_element<1, char, int, float> is int.
We implement it as a special kind of TRAIT_TYPE.

For a pathological example tuple_element_t<1000, tuple<2000 types...>>
the compilation time is reduced by more than 90% and the memory  used by
the compiler is reduced by 97%.  In realistic examples the gains will be
much smaller, but still relevant.

Unlike the other built-in traits, __type_pack_element uses template-id
syntax instead of call syntax and is SFINAE-enabled, matching Clang's
implementation.  And like the other built-in traits, it's not mangleable
so we can't use it directly in function signatures.

Some caveats:

  * Clang's version of the built-in seems to act like a "magic template"
that can e.g. be used as a template template argument.  For simplicity
we implement it in a more ad-hoc way.
  * Our parsing of the <>'s in __type_pack_element<...> is currently
rudimentary and doesn't try to disambiguate a trailing >> vs > >
as cp_parser_enclosed_template_argument_list does.

Co-authored-by: Jonathan Wakely 

PR c++/100157

gcc/cp/ChangeLog:

* cp-trait.def (TYPE_PACK_ELEMENT): Define.
* cp-tree.h (finish_trait_type): Add complain parameter.
* cxx-pretty-print.cc (pp_cxx_trait): Handle
CPTK_TYPE_PACK_ELEMENT.
* parser.cc (cp_parser_constant_expression): Document default
arguments.
(cp_parser_trait): Handle CPTK_TYPE_PACK_ELEMENT.  Pass
tf_warning_or_error to finish_trait_type.
* pt.cc (tsubst) : Handle
CPTK_TYPE_PACK_ELEMENT.
* semantics.cc (finish_type_pack_element): Define.
(finish_trait_type): Add complain parameter.  Handle
CPTK_TYPE_PACK_ELEMENT.
* tree.cc (strip_typedefs): Pass tf_warning_or_error to
finish_trait_type.
* typeck.cc (structural_comptypes) : Use
cp_tree_equal instead of same_type_p for TRAIT_TYPE_TYPE.

libstdc++-v3/ChangeLog:

* include/bits/utility.h (_Nth_type): Conditionally define using
__type_pack_element.

gcc/testsuite/ChangeLog:

* g++.dg/ext/type_pack_element1.C: New test.
* g++.dg/ext/type_pack_element2.C: New test.
* g++.dg/ext/type_pack_element3.C: New test.
---
 gcc/cp/cp-trait.def                           |  1 +
 gcc/cp/cp-tree.h                              |  2 +-
 gcc/cp/cxx-pretty-print.cc                    | 17 ++--
 gcc/cp/parser.cc                              | 36 -
 gcc/cp/pt.cc                                  |  8 +++-
 gcc/cp/semantics.cc                           | 39 ++-
 gcc/cp/tree.cc                                |  3 +-
 gcc/cp/typeck.cc                              |  2 +-
 gcc/testsuite/g++.dg/ext/type_pack_element1.C | 19 +
 gcc/testsuite/g++.dg/ext/type_pack_element2.C | 14 +++
 

Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping if necessary

2023-01-09 Thread Richard Earnshaw via Gcc-patches




On 09/01/2023 14:58, Andrea Corallo via Gcc-patches wrote:

Andrea Corallo via Gcc-patches  writes:


Richard Earnshaw  writes:


On 27/09/2022 16:24, Kyrylo Tkachov via Gcc-patches wrote:



-Original Message-
From: Andrea Corallo 
Sent: Tuesday, September 27, 2022 11:06 AM
To: Kyrylo Tkachov 
Cc: Andrea Corallo via Gcc-patches ; Richard
Earnshaw ; nd 
Subject: Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when
popping if necessary

Kyrylo Tkachov  writes:


Hi Andrea,


-Original Message-
From: Gcc-patches  On Behalf Of Andrea
Corallo via Gcc-patches
Sent: Friday, August 12, 2022 4:34 PM
To: Andrea Corallo via Gcc-patches 
Cc: Richard Earnshaw ; nd 
Subject: [PATCH 9/15] arm: Set again stack pointer as CFA reg when

popping

if necessary

Hi all,

this patch enables 'arm_emit_multi_reg_pop' to set again the stack
pointer as CFA reg when popping if this is necessary.



  From what I can tell from similar functions this is correct, but could you

elaborate on why this change is needed for my understanding please?

Thanks,
Kyrill


Hi Kyrill,

sure, if the frame pointer was set, then it is the current CFA register.
If we request to adjust the current CFA register offset indicating it
being SP (while it's actually FP) that is indeed not correct and the
incoherence will be detected by an assertion in the dwarf emission
machinery.

Thanks,  the patch is ok
Kyrill



Best Regards

Andrea


Hmm, wait.  Why would a multi-reg pop be updating the stack pointer?


Hi Richard,

not sure I understand, isn't any pop updating SP by definition?



Back on this,

compiling:

===
int i;

void foo (int);

int bar()
{
   foo (i);
   return 0;
}
===

With -march=armv8.1-m.main+fp -mbranch-protection=pac-ret+leaf -mthumb -O0 -g

Produces the following asm for bar.

bar:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, uses_anonymous_args = 0
        pac     ip, lr, sp
        push    {r3, r7, ip, lr}
        add     r7, sp, #0
        ldr     r3, .L3
        ldr     r3, [r3]
        mov     r0, r3
        bl      foo
        movs    r3, #0
        mov     r0, r3
        pop     {r3, r7, ip, lr}
        aut     ip, lr, sp
        bx      lr

The offending instruction causing the ICE (without this patch) when
emitting dwarf is "pop {r3, r7, ip, lr}".

The current CFA reg when emitting the multipop is R7 (the frame
pointer).  If it is not the multipop that has the duty to restore SP as
the current CFA here, which other instruction should do it?


Ah, OK.  I think this is a special case, though, because in this 
specific case the frame pointer (r7) and the stack pointer point to the 
same place.  This means that in the epilogue we don't start by restoring 
SP from FP (at which point we tell the dwarf code that the frame is back 
in SP again).


For example, if I have:


int i;

void foo (int, int*);

int bar()
{
  int j[10];
  foo (i,j);
  return 0;
}


then the epilogue sequence starts with:

        adds    r7, r7, #40
        .cfi_def_cfa_offset 8
        mov     sp, r7
        .cfi_def_cfa_register 13

And then the pop works correctly as-is.

But I'm not convinced that this is enough anyway: you cause the compiler 
to output a directive that changes the CFA pointer back to r13, but you 
don't output anything that changes the CFA offset.  So I think this 
means that the CFA state machine ends up pointing to the wrong location, 
but it's hard to tell as you haven't shown the CFA directives in your 
example above.




Best Regards

   Andrea


R.


Re: [x86 PATCH] PR rtl-optimization/107991: peephole2 to tweak register allocation.

2023-01-09 Thread Uros Bizjak via Gcc-patches
On Mon, Jan 9, 2023 at 4:01 PM Roger Sayle  wrote:
>
>
> This patch addresses PR rtl-optimization/107991, which is a P2 regression
> where GCC currently requires more "mov" instructions than GCC 7.
>
> The x86's two-address ISA creates some interesting challenges for reload.
> For example, the tricky "x = y - x" usually needs to be implemented on x86
> as
>
> tmp = x
> x = y
> x -= tmp
>
> where a scratch register and two mov's are required to work around
> the lack of a subf (subtract from) or rsub (reverse subtract) insn.
>
> Not uncommonly, if y is dead after this subtraction, register allocation
> can be improved by clobbering y.
>
> y -= x
> x = y
>
> For the testcase in PR 107991, things are slightly more complicated,
> where y is not itself dead, but is assigned from (i.e. equivalent to)
> a value that is dead.  Hence we have something like:
>
> y = z
> x = y - x
>
> so, GCC's reload currently generates the expected shuffle (as y is live):
>
> y = z
> tmp = x
> x = y
> x -= tmp
>
> but we can use a peephole2 that understands that y and z are equivalent,
> and that z is dead, to produce the shorter sequence:
>
> y = z
> z -= x
> x = z
>
> In practice, for the new testcase from PR 107991, which before produced:
>
> foo:    movl    %edx, %ecx
>         movl    %esi, %edx
>         movl    %esi, %eax
>         subl    %ecx, %edx
>         testb   %dil, %dil
>         cmovne  %edx, %eax
>         ret
>
> with this patch/peephole2 we now produce the much improved:
>
> foo:    movl    %esi, %eax
>         subl    %edx, %esi
>         testb   %dil, %dil
>         cmovne  %esi, %eax
>         ret
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?

Looking at the PR, it looks to me that Richard S (CC'd) wants to solve
this issue in the register allocator. This would be preferred
(compared to a very specialized peephole2), since the peephole2 pass comes
very late in the game, so one freed register does not contribute to
lowering the register pressure at all.

Peephole2 should be used to clean up after reload only in rare cases when
the target ISA prevents a generic solution. From your description, a generic
solution would benefit all targets with destructive subtraction (or
perhaps also other noncommutative operations).
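
To make the sequences quoted above easier to compare, here is a small
sketch that models them as plain assignments on a toy register file. The
struct Regs and the function names are invented for this illustration and
say nothing about how reload or peephole2 are implemented. It only checks
that the shorter sequence leaves x and y with the same values, and that the
only difference is in z, which is assumed dead.

```cpp
// Toy register file; x, y and z stand for the pseudo registers in the
// quoted description (needs C++14 for the constexpr function bodies).
struct Regs { int x, y, z; };

// Generic two-address lowering of "x = y - x": needs a scratch register.
constexpr Regs sub_with_scratch(Regs r) {
  int tmp = r.x;   // tmp = x
  r.x = r.y;       // x = y
  r.x -= tmp;      // x -= tmp
  return r;
}

// If y is dead afterwards, it can be clobbered instead.
constexpr Regs sub_clobbering_y(Regs r) {
  r.y -= r.x;      // y -= x
  r.x = r.y;       // x = y
  return r;
}

// Shape from PR 107991 as reload emits it: y = z; x = y - x.
constexpr Regs reload_sequence(Regs r) {
  r.y = r.z;       // y = z
  int tmp = r.x;   // tmp = x
  r.x = r.y;       // x = y
  r.x -= tmp;      // x -= tmp
  return r;
}

// Proposed shorter sequence: clobber z, which is equivalent to y and dead.
constexpr Regs peephole_sequence(Regs r) {
  r.y = r.z;       // y = z
  r.z -= r.x;      // z -= x
  r.x = r.z;       // x = z
  return r;
}

// Both PR 107991 sequences agree on the live registers x and y.
static_assert(reload_sequence(Regs{3, 10, 7}).x
              == peephole_sequence(Regs{3, 10, 7}).x, "x must match");
static_assert(reload_sequence(Regs{3, 10, 7}).y
              == peephole_sequence(Regs{3, 10, 7}).y, "y must match");
```

The model ignores flags and register classes; it is only meant to show why
a destructive subtraction on the dead copy saves the scratch register and
one move.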

So, please coordinate with Richard S regarding this issue.

Thanks,
Uros.

>
>
> 2023-01-09  Roger Sayle  
>
> gcc/ChangeLog
> PR rtl-optimization/107991
> * config/i386/i386.md (peephole2): New peephole2 to avoid register
> shuffling before a subtraction, after a register-to-register move.
>
> gcc/testsuite/ChangeLog
> PR rtl-optimization/107991
> * gcc.target/i386/pr107991.c: New test case.
>
>
> Thanks in advance,
> Roger
> --
>


Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping if necessary

2023-01-09 Thread Richard Earnshaw via Gcc-patches




On 09/01/2023 14:58, Andrea Corallo via Gcc-patches wrote:

Andrea Corallo via Gcc-patches  writes:


Richard Earnshaw  writes:


On 27/09/2022 16:24, Kyrylo Tkachov via Gcc-patches wrote:



-Original Message-
From: Andrea Corallo 
Sent: Tuesday, September 27, 2022 11:06 AM
To: Kyrylo Tkachov 
Cc: Andrea Corallo via Gcc-patches ; Richard
Earnshaw ; nd 
Subject: Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when
popping if necessary

Kyrylo Tkachov  writes:


Hi Andrea,


-Original Message-
From: Gcc-patches  On Behalf Of Andrea
Corallo via Gcc-patches
Sent: Friday, August 12, 2022 4:34 PM
To: Andrea Corallo via Gcc-patches 
Cc: Richard Earnshaw ; nd 
Subject: [PATCH 9/15] arm: Set again stack pointer as CFA reg when

popping

if necessary

Hi all,

this patch enables 'arm_emit_multi_reg_pop' to set again the stack
pointer as CFA reg when popping if this is necessary.



  From what I can tell from similar functions this is correct, but could you

elaborate on why this change is needed for my understanding please?

Thanks,
Kyrill


Hi Kyrill,

sure, if the frame pointer was set, then it is the current CFA register.
If we request to adjust the current CFA register offset indicating it
being SP (while it's actually FP) that is indeed not correct and the
incoherence will be detected by an assertion in the dwarf emission
machinery.

Thanks,  the patch is ok
Kyrill



Best Regards

Andrea


Hmm, wait.  Why would a multi-reg pop be updating the stack pointer?


Hi Richard,

not sure I understand, isn't any pop updating SP by definition?



Back on this,

compiling:

===
int i;

void foo (int);

int bar()
{
   foo (i);
   return 0;
}
===

With -march=armv8.1-m.main+fp -mbranch-protection=pac-ret+leaf -mthumb -O0 -g

Produces the following asm for bar.

bar:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, uses_anonymous_args = 0
        pac     ip, lr, sp
        push    {r3, r7, ip, lr}
        add     r7, sp, #0
        ldr     r3, .L3
        ldr     r3, [r3]
        mov     r0, r3
        bl      foo
        movs    r3, #0
        mov     r0, r3
        pop     {r3, r7, ip, lr}
        aut     ip, lr, sp
        bx      lr

The offending instruction causing the ICE (without this patch) when
emitting dwarf is "pop {r3, r7, ip, lr}".

The current CFA reg when emitting the multipop is R7 (the frame
pointer).  If it is not the multipop that has the duty to restore SP as
the current CFA here, which other instruction should do it?



Digging a bit deeper, I'm now even more confused.  arm_expand_epilogue 
contains (paraphrasing the code):


  if frame_pointer_needed
    {
      if arm
        {}
      else
        {
          if adjust
            r7 += adjust
          mov sp, r7    // Reset CFA to SP
        }
    }

so there should always be a move of r7 into SP, even if this is strictly 
redundant.  I don't understand why this doesn't happen for your 
testcase.  Can you dig a bit deeper?  I wonder if we've (probably 
incorrectly) assumed that this function doesn't need an epilogue but can 
use a simple return?  I don't think we should do that when 
authentication is needed: a simple return should really be one instruction.



Best Regards

   Andrea


R.


Patch ping

2023-01-09 Thread Jakub Jelinek via Gcc-patches
Hi!

I'd like to ping a few pending patches:

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606973.html
  - PR107465 - c-family: Fix up -Wsign-compare BIT_NOT_EXPR handling

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607104.html
  - PR107465 - c-family: Incremental fix for -Wsign-compare BIT_NOT_EXPR 
handling

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607145.html
  - PR107558 - c++: Don't clear TREE_READONLY for -fmerge-all-constants for 
non-aggregates

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607534.html
  - PR107846 - c-family: Account for integral promotions of left shifts for 
-Wshift-overflow warning

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606382.html
  - PR107703 - libgcc, i386: Add __fix{,uns}bfti and __float{,un}tibf

https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608932.html
  - PR108079 - c, c++, cgraphunit: Prevent duplicated -Wunused-value warnings

Thanks

Jakub



Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping if necessary

2023-01-09 Thread Richard Earnshaw via Gcc-patches




On 09/01/2023 16:48, Richard Earnshaw via Gcc-patches wrote:



On 09/01/2023 14:58, Andrea Corallo via Gcc-patches wrote:

Andrea Corallo via Gcc-patches  writes:


Richard Earnshaw  writes:


On 27/09/2022 16:24, Kyrylo Tkachov via Gcc-patches wrote:



-Original Message-
From: Andrea Corallo 
Sent: Tuesday, September 27, 2022 11:06 AM
To: Kyrylo Tkachov 
Cc: Andrea Corallo via Gcc-patches ; Richard
Earnshaw ; nd 
Subject: Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg 
when

popping if necessary

Kyrylo Tkachov  writes:


Hi Andrea,


-Original Message-
From: Gcc-patches  On Behalf Of Andrea
Corallo via Gcc-patches
Sent: Friday, August 12, 2022 4:34 PM
To: Andrea Corallo via Gcc-patches 
Cc: Richard Earnshaw ; nd 
Subject: [PATCH 9/15] arm: Set again stack pointer as CFA reg when

popping

if necessary

Hi all,

this patch enables 'arm_emit_multi_reg_pop' to set again the stack
pointer as CFA reg when popping if this is necessary.



  From what I can tell from similar functions this is correct, 
but could you

elaborate on why this change is needed for my understanding please?

Thanks,
Kyrill


Hi Kyrill,

sure, if the frame pointer was set, then it is the current CFA register.
If we request to adjust the current CFA register offset indicating it
being SP (while it's actually FP) that is indeed not correct and the
incoherence will be detected by an assertion in the dwarf emission
machinery.

Thanks,  the patch is ok
Kyrill



Best Regards

    Andrea


Hmm, wait.  Why would a multi-reg pop be updating the stack pointer?


Hi Richard,

not sure I understand, isn't any pop updating SP by definition?



Back on this,

compiling:

===
int i;

void foo (int);

int bar()
{
   foo (i);
   return 0;
}
===

With -march=armv8.1-m.main+fp -mbranch-protection=pac-ret+leaf -mthumb 
-O0 -g


Produces the following asm for bar.

bar:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, uses_anonymous_args = 0
pac    ip, lr, sp
push    {r3, r7, ip, lr}
add    r7, sp, #0
ldr    r3, .L3
ldr    r3, [r3]
mov    r0, r3
bl    foo
movs    r3, #0
mov    r0, r3
pop    {r3, r7, ip, lr}
aut    ip, lr, sp
bx    lr

The offending instruction causing the ICE (without this patch) when
emitting dwarf is "pop {r3, r7, ip, lr}".

The current CFA reg when emitting the multipop is R7 (the frame
pointer).  If it is not the multipop that has the duty to restore SP as
the current CFA here, which other instruction should do it?



Digging a bit deeper, I'm now even more confused.  arm_expand_epilogue 
contains (paraphrasing the code):


  if frame_pointer_needed
    {
      if arm
        {}
      else
        {
          if adjust
            r7 += adjust
          mov sp, r7    // Reset CFA to SP
        }
    }

so there should always be a move of r7 into SP, even if this is strictly 
redundant.  I don't understand why this doesn't happen for your 
testcase.  Can you dig a bit deeper?  I wonder if we've (probably 
incorrectly) assumed that this function doesn't need an epilogue but can 
use a simple return?  I don't think we should do that when 
authentication is needed: a simple return should really be one instruction.




So I strongly suspect the real problem here is that use_return_insn () 
in arm.cc needs to be updated to return false when using pointer 
authentication.  The specification for this function says that a return 
can be done in one instruction; and clearly when we need authentication 
more than one is needed.


R.


Best Regards

   Andrea


R.


Re: [PATCH] Add support for x86_64-*-gnu-* targets to build x86_64 gnumach/hurd

2023-01-09 Thread Flávio Cruz via Gcc-patches
Friendly ping

On Mon, Dec 26, 2022 at 12:34 PM Flavio Cruz  wrote:

> Tested by building a toolchain and compiling gnumach for x86_64 [1].
> This is the basic version without unwind support which I think is only
> required to
> implement exceptions.
>
> [1] https://github.com/flavioc/cross-hurd/blob/master/bootstrap-kernel.sh.
>
> ---
>  gcc/config.gcc  |  5 -
>  gcc/config/i386/gnu64.h | 40 +
>  libgcc/config.host  |  8 ++-
>  libgcc/config/i386/gnu-unwind.h | 10 +
>  4 files changed, 61 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/config/i386/gnu64.h
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 95190233820..0e2b15768bf 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1955,7 +1955,7 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu |
> i[34567]86-*-gnu* | i[34567]8
> ;;
> esac
> ;;
> -x86_64-*-linux* | x86_64-*-kfreebsd*-gnu)
> +x86_64-*-linux* | x86_64-*-kfreebsd*-gnu | x86_64-*-gnu*)
> tm_file="${tm_file} i386/unix.h i386/att.h elfos.h gnu-user.h
> glibc-stdint.h \
>  i386/x86-64.h i386/gnu-user-common.h i386/gnu-user64.h"
> case ${target} in
> @@ -1966,6 +1966,9 @@ x86_64-*-linux* | x86_64-*-kfreebsd*-gnu)
> x86_64-*-kfreebsd*-gnu)
> tm_file="${tm_file} kfreebsd-gnu.h i386/kfreebsd-gnu64.h"
> ;;
> +   x86_64-*-gnu*)
> +   tm_file="${tm_file} gnu.h i386/gnu64.h"
> +   ;;
> esac
> tmake_file="${tmake_file} i386/t-linux64"
> x86_multilibs="${with_multilib_list}"
> diff --git a/gcc/config/i386/gnu64.h b/gcc/config/i386/gnu64.h
> new file mode 100644
> index 000..a1ecfaa1cdb
> --- /dev/null
> +++ b/gcc/config/i386/gnu64.h
> @@ -0,0 +1,40 @@
> +/* Configuration for an x86_64 running GNU with ELF as the target
> machine.  */
> +
> +/*
> +Copyright (C) 2022 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software: you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation, either version 3 of the License, or
> +(at your option) any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC.  If not, see .
> +*/
> +
> +#define GNU_USER_LINK_EMULATION32 "elf_i386"
> +#define GNU_USER_LINK_EMULATION64 "elf_x86_64"
> +#define GNU_USER_LINK_EMULATIONX32 "elf32_x86_64"
> +
> +#undef GNU_USER_DYNAMIC_LINKER
> +#define GNU_USER_DYNAMIC_LINKER32 "/lib/ld.so.1"
> +#define GNU_USER_DYNAMIC_LINKER64 "/lib/ld-x86-64.so.1"
> +#define GNU_USER_DYNAMIC_LINKERX32 "/lib/ld-x32.so.1"
> +
> +#undef STARTFILE_SPEC
> +#if defined HAVE_LD_PIE
> +#define STARTFILE_SPEC \
> +  "%{!shared:
> %{pg|p|profile:%{static:gcrt0.o%s;:gcrt1.o%s};pie:Scrt1.o%s;static:crt0.o%s;:crt1.o%s}}
> \
> +   crti.o%s
> %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
> +#else
> +#define STARTFILE_SPEC \
> +  "%{!shared:
> %{pg|p|profile:%{static:gcrt0.o%s;:gcrt1.o%s};static:crt0.o%s;:crt1.o%s}} \
> +   crti.o%s
> %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
> +#endif
> diff --git a/libgcc/config.host b/libgcc/config.host
> index eb23abe89f5..75fd1b778fe 100644
> --- a/libgcc/config.host
> +++ b/libgcc/config.host
> @@ -751,6 +751,12 @@ x86_64-*-kfreebsd*-gnu)
> tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff
> t-dfprules"
> tm_file="${tm_file} i386/elf-lib.h"
> ;;
> +x86_64-*-gnu*)
> +   extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o
> crtfastmath.o"
> +   tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff
> t-dfprules"
> +   tm_file="${tm_file} i386/elf-lib.h"
> +   md_unwind_header=i386/gnu-unwind.h
> +   ;;
>  i[34567]86-pc-msdosdjgpp*)
> ;;
>  i[34567]86-*-lynxos*)
> @@ -1523,7 +1529,7 @@ esac
>  case ${host} in
>  i[34567]86-*-linux* | x86_64-*-linux* | \
>i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu | \
> -  i[34567]86-*-gnu*)
> +  i[34567]86-*-gnu* | x86_64-*-gnu*)
> tmake_file="${tmake_file} t-tls i386/t-linux i386/t-msabi
> t-slibgcc-libgcc"
> if test "$libgcc_cv_cfi" = "yes"; then
> tmake_file="${tmake_file} t-stack i386/t-stack-i386"
> diff --git a/libgcc/config/i386/gnu-unwind.h
> b/libgcc/config/i386/gnu-unwind.h
> index 25eb690e370..2cbfc40ea7e 100644
> --- a/libgcc/config/i386/gnu-unwind.h
> +++ b/libgcc/config/i386/gnu-unwind.h
> @@ -30,6 +30,14 @@ see the files COPYING3 and COPYING.RUNTIME
> respectively.  If not, see
>
>  #include 
>
> +#ifdef __x86_64__
> +
> +/*

Re: [PATCH] c++: Define built-in for std::tuple_element [PR100157]

2023-01-09 Thread Patrick Palka via Gcc-patches
On Mon, 9 Jan 2023, Patrick Palka wrote:

> On Wed, 5 Oct 2022, Patrick Palka wrote:
> 
> > On Thu, 7 Jul 2022, Jonathan Wakely via Gcc-patches wrote:
> > 
> > > This adds a new built-in to replace the recursive class template
> > > instantiations done by traits such as std::tuple_element and
> > > std::variant_alternative. The purpose is to select the Nth type from a
> > > list of types, e.g. __builtin_type_pack_element(1, char, int, float) is
> > > int.
> > > 
> > > For a pathological example tuple_element_t<1000, tuple<2000 types...>>
> > > the compilation time is reduced by more than 90% and the memory  used by
> > > the compiler is reduced by 97%. In realistic examples the gains will be
> > > much smaller, but still relevant.
> > > 
> > > Clang has a similar built-in, __type_pack_element, but that's a
> > > "magic template" built-in using <> syntax, which GCC doesn't support. So
> > > this provides an equivalent feature, but as a built-in function using
> > > parens instead of <>. I don't really like the name "type pack element"
> > > (it gives you an element from a pack of types) but the semi-consistency
> > > with Clang seems like a reasonable argument in favour of keeping the
> > > name. I'd be open to alternative names though, e.g. __builtin_nth_type
> > > or __builtin_type_at_index.
> > 
> > Rather than giving the trait a different name from __type_pack_element,
> > I wonder if we could just special case cp_parser_trait to expect <>
> > instead of parens for this trait?
> > 
> > Btw the frontend recently got a generic TRAIT_TYPE tree code, which gets
> > rid of much of the boilerplate of adding a new type-yielding built-in
> > trait, see e.g. cp-trait.def.
> 
> Here's a tested patch based on Jonathan's original patch that implements
> the built-in in terms of TRAIT_TYPE, names it __type_pack_element
> instead of __builtin_type_pack_element, and treats invocations of it
> like a template-id instead of a call (to match Clang).
> 
> -- >8 --
> 
> Subject: [PATCH] c++: Define built-in for std::tuple_element [PR100157]
> 
> This adds a new built-in to replace the recursive class template
> instantiations done by traits such as std::tuple_element and
> std::variant_alternative.  The purpose is to select the Nth type from a
> list of types, e.g. __type_pack_element<1, char, int, float> is int.
> We implement it as a special kind of TRAIT_TYPE.
> 
> For a pathological example tuple_element_t<1000, tuple<2000 types...>>
> the compilation time is reduced by more than 90% and the memory  used by
> the compiler is reduced by 97%.  In realistic examples the gains will be
> much smaller, but still relevant.
> 
> Unlike the other built-in traits, __type_pack_element uses template-id
> syntax instead of call syntax and is SFINAE-enabled, matching Clang's
> implementation.  And like the other built-in traits, it's not mangleable
> so we can't use it directly in function signatures.
> 
> Some caveats:
> 
>   * Clang's version of the built-in seems to act like a "magic template"
> that can e.g. be used as a template template argument.  For simplicity
> we implement it in a more ad-hoc way.
>   * Our parsing of the <>'s in __type_pack_element<...> is currently
> rudimentary and doesn't try to disambiguate a trailing >> vs > >
> as cp_parser_enclosed_template_argument_list does.

Hmm, this latter caveat turns out to be inconvenient (for code such as
type_pack_element3.C) and admits an easy workaround inspired by what
cp_parser_enclosed_template_argument_list does.

v2: Consider the >> in __type_pack_element<0, int, char>> to be two >'s.
Handle non-type TRAIT_TYPE_TYPE1 in strip_typedefs (for sake of
CPTK_TYPE_PACK_ELEMENT).

-- >8 --

Subject: [PATCH] c++: Define built-in for std::tuple_element [PR100157]

This adds a new built-in to replace the recursive class template
instantiations done by traits such as std::tuple_element and
std::variant_alternative.  The purpose is to select the Nth type from a
list of types, e.g. __type_pack_element<1, char, int, float> is int.
We implement it as a special kind of TRAIT_TYPE.

For a pathological example tuple_element_t<1000, tuple<2000 types...>>
the compilation time is reduced by more than 90% and the memory  used by
the compiler is reduced by 97%.  In realistic examples the gains will be
much smaller, but still relevant.

Unlike the other built-in traits, __type_pack_element uses template-id
syntax instead of call syntax and is SFINAE-enabled, matching Clang's
implementation.  And like the other built-in traits, it's not mangleable
so we can't use it directly in function signatures.

N.B. Clang seems to implement __type_pack_element as a first-class
template that can e.g. be used as a template template argument.  For
simplicity we implement it in a more ad-hoc way.

Co-authored-by: Jonathan Wakely 

PR c++/100157

gcc/cp/ChangeLog:

* cp-trait.def (TYPE_PACK_ELEMENT): Define.
* cp-tree.h (finish_trait_type): Add complain parame

[PATCH] PR rtl-optimization/106421: ICE in bypass_block from non-local goto.

2023-01-09 Thread Roger Sayle

This patch fixes PR rtl-optimization/106421, an ICE-on-valid (but
undefined) regression.  The fix, as proposed by Richard Biener, is to
defend against BLOCK_FOR_INSN returning NULL in cprop's bypass_block.
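
As a rough illustration of the guard, the sketch below uses stand-in types
rather than GCC's own. Block, Edge, find_edge_sketch and bypass_target are
hypothetical names. It assumes, as the fix suggests, that the edge lookup
must not be handed a null destination, which is what a label belonging to
another function would produce.

```cpp
// Stand-ins for basic_block and edge; not GCC's real data structures.
struct Block { int index; };
struct Edge { const Block *src; const Block *dest; bool has_pending_insns; };

// Stand-in for find_edge: expects both blocks to be non-null.
constexpr const Edge *find_edge_sketch(const Edge *edges, int n,
                                       const Block *src, const Block *dest) {
  for (int i = 0; i < n; ++i)
    if (edges[i].src == src && edges[i].dest == dest)
      return &edges[i];
  return nullptr;
}

// Mirrors the shape of the patched code: only look up the edge when the
// destination is local to this function (non-null), and refuse to bypass
// edges that carry pending instructions.
constexpr const Block *bypass_target(const Edge *edges, int n,
                                     const Block *bb, const Block *dest) {
  if (dest) {
    const Edge *edest = find_edge_sketch(edges, n, bb, dest);
    if (edest && edest->has_pending_insns)
      dest = nullptr;
  }
  return dest;
}

// Tiny CFG used to exercise the three cases.
constexpr Block entry{0}, target{1};
constexpr Edge clean_edges[] = {{&entry, &target, false}};
constexpr Edge busy_edges[] = {{&entry, &target, true}};

static_assert(bypass_target(clean_edges, 1, &entry, nullptr) == nullptr,
              "a non-local destination is never bypassed to");
```

With a null destination the lookup is skipped entirely, so the function
simply reports that there is nothing to bypass to.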

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2023-01-09  Roger Sayle  

gcc/ChangeLog
PR rtl-optimization/106421
* cprop.cc (bypass_block): Check that DEST is local to this
function (non-NULL) before calling find_edge.

gcc/testsuite/ChangeLog
PR rtl-optimization/106421
* gcc.dg/pr106421.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/cprop.cc b/gcc/cprop.cc
index 5b203ec..6ec0bda 100644
--- a/gcc/cprop.cc
+++ b/gcc/cprop.cc
@@ -1622,9 +1622,12 @@ bypass_block (basic_block bb, rtx_insn *setcc, rtx_insn 
*jump)
{
  dest = BLOCK_FOR_INSN (XEXP (new_rtx, 0));
  /* Don't bypass edges containing instructions.  */
- edest = find_edge (bb, dest);
- if (edest && edest->insns.r)
-   dest = NULL;
+ if (dest)
+   {
+ edest = find_edge (bb, dest);
+ if (edest && edest->insns.r)
+   dest = NULL;
+   }
}
  else
dest = NULL;
diff --git a/gcc/testsuite/gcc.dg/pr106421.c b/gcc/testsuite/gcc.dg/pr106421.c
new file mode 100644
index 000..73e522a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr106421.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int main(int argc, char **argv)
+{
+   __label__ loop, end;
+   void jmp(int c) { goto *(c ? &&loop : &&end); }
+loop:
+   jmp(argc < 0);
+end:
+   return 0;
+}
+


Re: Adding a new thread model to GCC

2023-01-09 Thread Eric Botcazou via Gcc-patches
> fixed now.
> bootstrapped successfully!

Thanks for fixing it.  Another way out is to hide the Win32 API by defining  
__GTHREAD_HIDE_WIN32API like libstdc++ does in its header files.

-- 
Eric Botcazou




[committed] c: Check for modifiable static compound literals in inline definitions

2023-01-09 Thread Joseph Myers
The C rule against modifiable objects with static storage duration in
inline definitions should apply to compound literals (using the C2x
feature of storage-class specifiers for compound literals) just as to
variables.  Add a call to record_inline_static for compound literals
to make sure this case is detected.
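
The condition added by the patch can be read as a simple predicate.  The
sketch below restates it with plain booleans; the struct and field names
are invented for illustration and only loosely mirror the tree macros used
in the diff (TREE_STATIC, TREE_READONLY, DECL_DECLARED_INLINE_P,
DECL_EXTERNAL).

```cpp
#include <cassert>

// Hypothetical flags describing the compound literal's object.
struct LiteralFlags
{
  bool at_file_scope;  // models current_scope == file_scope
  bool is_static;      // models TREE_STATIC (decl)
  bool is_readonly;    // models TREE_READONLY (decl)
};

// Hypothetical flags describing the enclosing function.
struct FunctionFlags
{
  bool declared_inline;  // models DECL_DECLARED_INLINE_P
  bool is_external;      // models DECL_EXTERNAL
};

// True when the literal should be recorded as a modifiable static object
// inside an inline definition, i.e. when the patch would call
// record_inline_static.
bool should_record_inline_static (const LiteralFlags &lit,
                                  const FunctionFlags &fn)
{
  return !lit.at_file_scope
         && lit.is_static
         && !lit.is_readonly
         && fn.declared_inline
         && fn.is_external;
}
```

Under this reading, (static int) { 123 } in a plain inline function is
recorded, while a const-qualified literal or one inside a static inline
function is not.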

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c/
* c-decl.cc (build_compound_literal): Call record_inline_static.

gcc/testsuite/
* gcc.dg/c2x-complit-8.c: New test.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index e47ca6718b3..d76ffb3380d 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -6260,6 +6260,13 @@ build_compound_literal (location_t loc, tree type, tree init, bool non_const,
   DECL_USER_ALIGN (decl) = 1;
 }
   store_init_value (loc, decl, init, NULL_TREE);
+  if (current_scope != file_scope
+  && TREE_STATIC (decl)
+  && !TREE_READONLY (decl)
+  && DECL_DECLARED_INLINE_P (current_function_decl)
+  && DECL_EXTERNAL (current_function_decl))
+record_inline_static (input_location, current_function_decl,
+ decl, csi_modifiable);
 
   if (TREE_CODE (type) == ARRAY_TYPE && !COMPLETE_TYPE_P (type))
 {
diff --git a/gcc/testsuite/gcc.dg/c2x-complit-8.c b/gcc/testsuite/gcc.dg/c2x-complit-8.c
new file mode 100644
index 000..fb614ab7802
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-complit-8.c
@@ -0,0 +1,70 @@
+/* Test C2x storage class specifiers in compound literals: inline function
+   constraints.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+
+inline void
+f1 ()
+{
+  (static int) { 123 }; /* { dg-error "static but declared in inline function 'f1' which is not static" } */
+  (static thread_local int) { 456 } ; /* { dg-error "static but declared in inline function 'f1' which is not static" } */
+  (int) { 789 };
+  (register int) { 1234 };
+}
+
+inline void
+f1e ()
+{
+  (static int) { 123 };
+  (static thread_local int) { 456 } ;
+}
+
+static inline void
+f1s ()
+{
+  (static int) { 123 };
+  (static thread_local int) { 456 } ;
+}
+
+inline void
+f2 ()
+{
+  (static const int) { 123 };
+  (static thread_local const int) { 456 };
+}
+
+inline void
+f2e ()
+{
+  (static const int) { 123 };
+  (static thread_local const int) { 456 };
+}
+
+static inline void
+f2s ()
+{
+  (static const int) { 123 };
+  (static thread_local const int) { 456 };
+}
+
+inline void
+f3 ()
+{
+  (static constexpr int) { 123 };
+}
+
+inline void
+f3e ()
+{
+  (static constexpr int) { 123 };
+}
+
+static inline void
+f3s ()
+{
+  (static constexpr int) { 123 };
+}
+
+extern void f1e ();
+extern void f2e ();
+extern void f3e ();

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Modula-2: fix documentation layout

2023-01-09 Thread Eric Botcazou via Gcc-patches
Hi,

the Modula-2 documentation is rejected by older versions of Makeinfo because 
the web of @node markers is fairly broken: apparently some subsections were 
moved around, most notably between the Overview and Using sections, and the 
@node markers were not (properly) adjusted.

This patch allows me to build it with these older versions, as well as with 
modern versions.  OK for mainline?


2023-01-09  Eric Botcazou  

* doc/gm2.texi (Overview): Fix @node markers.
(Using): Likewise.  Remove subsections that were moved to
Overview from the menu and move others around.

-- 
Eric Botcazou

diff --git a/gcc/doc/gm2.texi b/gcc/doc/gm2.texi
index 18cb798c6cd..35e0f5ef622 100644
--- a/gcc/doc/gm2.texi
+++ b/gcc/doc/gm2.texi
@@ -89,7 +89,7 @@ Boston, MA 02110-1301, USA@*
 * Features::  GNU Modula-2 Features
 @end menu
 
-@node What is GNU Modula-2, Why use GNU Modula-2, , Using
+@node What is GNU Modula-2, Why use GNU Modula-2, , Overview
 @section What is GNU Modula-2
 
 GNU Modula-2 is a @uref{http://gcc.gnu.org/frontends.html, front end}
@@ -115,7 +115,7 @@ technology - programming languages - part 1: Modula-2 Language,
 ISO/IEC 10514-1 (1996)'
 }
 
-@node Why use GNU Modula-2, Release map, What is GNU Modula-2, Using
+@node Why use GNU Modula-2, Development, What is GNU Modula-2, Overview
 @section Why use GNU Modula-2
 
 There are a number of advantages of using GNU Modula-2 rather than
@@ -149,25 +149,13 @@ directory for a sub directory @code{foo} containing the library
 contents.  The library module search path is altered accordingly
 for compile and link.
 
-@node Release map, Development, Why use GNU Modula-2, Using
-@section Release map
-
-GNU Modula-2 is now part of GCC and therefore will adopt the GCC
-release schedule.  It is intended that GNU Modula-2 implement more of
-the GCC builtins (vararg access) and GCC features.
-
-There is an intention to implement the ISO generics and the M2R10
-dialect of Modula-2.  It will also implement all language changes.  If
-you wish to see something different please email
-@email{gm2@@nongnu.org} with your ideas.
-
-@node Development, Features, Release map, Using
+@node Development, Features, Why use GNU Modula-2, Overview
 @section How to get source code using git
 
 GNU Modula-2 is now in the @url{https://gcc.gnu.org/git.html, GCC git
 tree}.
 
-@node Features, Documentation, Development, Using
+@node Features, , Development, Overview
 @section GNU Modula-2 Features
 
 @itemize @bullet
@@ -230,99 +218,7 @@ such as the AVR and the ARM).
 
 @end itemize
 
-@node Documentation, Regression tests, Features, Using
-@section Documentation
-
-The GNU Modula-2 documentation is available on line
-@url{https://www.nongnu.org/gm2/homepage.html,at the gm2 homepage}
-or in the pdf, info, html file format.
-
-@node Regression tests, Limitations, Documentation, Using
-@section Regression tests for gm2 in the repository
-
-The regression testsuite can be run from the gcc build directory:
-
-@example
-$ cd build-gcc
-$ make check -j 24
-@end example
-
-which runs the complete testsuite for all compilers using 24 parallel
-invocations of the compiler.  Individual language testsuites can be
-run by specifying the language, for example the Modula-2 testsuite can
-be run using:
-
-@example
-$ cd build-gcc
-$ make check-m2 -j 24
-@end example
-
-Finally the results of the testsuite can be emailed to the
-@url{https://gcc.gnu.org/lists.html, gcc-testresults} list using the
-@file{test_summary} script found in the gcc source tree:
-
-@example
-$ @samp{directory to the sources}/contrib/test_summary
-@end example
-
-@node Limitations, Objectives, Regression tests, Using
-@section Limitations
-
-Logitech compatibility library is incomplete.  The principle modules
-for this platform exist however for a comprehensive list of completed
-modules please check the documentation
-@url{gm2.html}.
-
-@node Objectives, FAQ, , Using
-@section Objectives
-
-@itemize @bullet
-
-@item
-The intention of GNU Modula-2 is to provide a production Modula-2
-front end to GCC.
-
-@item
-It should support all Niklaus Wirth PIM Dialects [234] and also ISO
-Modula-2 including a re-implementation of all the ISO modules.
-
-@item
-There should be an easy interface to C.
-
-@item
-Exploit the features of GCC.
-
-@item
-Listen to the requests of the users.
-@end itemize
-
-@node FAQ, Community, Objectives, Using
-@section FAQ
-
-@subsection Why use the C++ exception mechanism in GCC, rather than a bespoke Modula-2 mechanism?
-
-The C++ mechanism is tried and tested, it also provides GNU Modula-2
-with the ability to link with C++ modules and via swig it can raise
-Python exceptions.
-
-@node Community, Other languages, FAQ, Using
-@section Community
-
-You can subscribe to the GNU Modula-2 mailing by sending an
-email to:
-@email{gm2-subscribe@@nongnu.org}
-or by
-@url{http://lists.nongnu.org/mailman/listinfo/gm2}.
-The mailing list contents can be viewed
-@url{http://lists.gnu.org/a

[PATCH] RISC-V: Cleanup the codes of bitmap create and free [NFC]

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

This is an NFC (no functional change) patch that moves the bitmap creation
and freeing code into wrapper functions so that it can be reused.  I will
reuse these functions in follow-up patches.
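
As a rough sketch of the pattern (not the GCC code: BitmapVectors and its
members are hypothetical, and std::vector<bool> stands in for sbitmap),
the idea is to pair one "create" wrapper with one "free" wrapper that
resets every pointer, so the pair can be called repeatedly:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Bitmap = std::vector<bool>;  // stand-in for one sbitmap

// Hypothetical manager owning one bitmap per basic block for each set.
class BitmapVectors
{
public:
  BitmapVectors () = default;
  BitmapVectors (const BitmapVectors &) = delete;
  BitmapVectors &operator= (const BitmapVectors &) = delete;
  ~BitmapVectors () { release (); }

  // Allocate all vectors in one place; transp starts as all ones and
  // antic as all zeros, mirroring bitmap_vector_ones / bitmap_vector_clear.
  void create (std::size_t n_blocks, std::size_t n_exprs)
  {
    release ();
    antic_ = alloc (n_blocks, n_exprs, false);
    transp_ = alloc (n_blocks, n_exprs, true);
  }

  // Free everything and reset the pointers so a second call is harmless.
  void release ()
  {
    delete[] antic_;
    delete[] transp_;
    antic_ = nullptr;
    transp_ = nullptr;
  }

  bool allocated () const { return antic_ != nullptr && transp_ != nullptr; }
  const Bitmap &antic (std::size_t bb) const { return antic_[bb]; }
  const Bitmap &transp (std::size_t bb) const { return transp_[bb]; }

private:
  static Bitmap *alloc (std::size_t n_blocks, std::size_t n_exprs, bool fill)
  {
    Bitmap *v = new Bitmap[n_blocks];
    for (std::size_t i = 0; i < n_blocks; ++i)
      v[i].assign (n_exprs, fill);
    return v;
  }

  Bitmap *antic_ = nullptr;
  Bitmap *transp_ = nullptr;
};
```

Resetting the pointers in release () is what makes it safe to call the
wrapper from more than one place, which seems to be the point of the
null checks and nullptr assignments in free_bitmap_vectors.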

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc 
(vector_infos_manager::create_bitmap_vectors): New function.
(vector_infos_manager::free_bitmap_vectors): Ditto.
(pass_vsetvl::pre_vsetvl): Adjust codes.
* config/riscv/riscv-vsetvl.h: New function declaration.

---
 gcc/config/riscv/riscv-vsetvl.cc | 95 +++-
 gcc/config/riscv/riscv-vsetvl.h  |  2 +
 2 files changed, 59 insertions(+), 38 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index d42cfa91d63..7800c2ee509 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1569,18 +1569,62 @@ vector_infos_manager::release (void)
 vector_exprs.release ();
 
   if (optimize > 0)
-{
-  /* Finished. Free up all the things we've allocated.  */
-  free_edge_list (vector_edge_list);
-  sbitmap_vector_free (vector_del);
-  sbitmap_vector_free (vector_insert);
-  sbitmap_vector_free (vector_kill);
-  sbitmap_vector_free (vector_antic);
-  sbitmap_vector_free (vector_transp);
-  sbitmap_vector_free (vector_comp);
-  sbitmap_vector_free (vector_avin);
-  sbitmap_vector_free (vector_avout);
-}
+free_bitmap_vectors ();
+}
+
+void
+vector_infos_manager::create_bitmap_vectors (void)
+{
+  /* Create the bitmap vectors.  */
+  vector_antic = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+  vector_exprs.length ());
+  vector_transp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+   vector_exprs.length ());
+  vector_comp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+ vector_exprs.length ());
+  vector_avin = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+ vector_exprs.length ());
+  vector_avout = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+  vector_exprs.length ());
+  vector_kill = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+ vector_exprs.length ());
+
+  bitmap_vector_ones (vector_transp, last_basic_block_for_fn (cfun));
+  bitmap_vector_clear (vector_antic, last_basic_block_for_fn (cfun));
+  bitmap_vector_clear (vector_comp, last_basic_block_for_fn (cfun));
+}
+
+void
+vector_infos_manager::free_bitmap_vectors (void)
+{
+  /* Finished. Free up all the things we've allocated.  */
+  free_edge_list (vector_edge_list);
+  if (vector_del)
+sbitmap_vector_free (vector_del);
+  if (vector_insert)
+sbitmap_vector_free (vector_insert);
+  if (vector_kill)
+sbitmap_vector_free (vector_kill);
+  if (vector_antic)
+sbitmap_vector_free (vector_antic);
+  if (vector_transp)
+sbitmap_vector_free (vector_transp);
+  if (vector_comp)
+sbitmap_vector_free (vector_comp);
+  if (vector_avin)
+sbitmap_vector_free (vector_avin);
+  if (vector_avout)
+sbitmap_vector_free (vector_avout);
+
+  vector_edge_list = nullptr;
+  vector_kill = nullptr;
+  vector_del = nullptr;
+  vector_insert = nullptr;
+  vector_antic = nullptr;
+  vector_transp = nullptr;
+  vector_comp = nullptr;
+  vector_avin = nullptr;
+  vector_avout = nullptr;
 }
 
 void
@@ -2480,32 +2524,7 @@ pass_vsetvl::pre_vsetvl (void)
   /* Compute entity list.  */
   prune_expressions ();
 
-  /* Create the bitmap vectors.  */
-  m_vector_manager->vector_antic
-= sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
-   m_vector_manager->vector_exprs.length ());
-  m_vector_manager->vector_transp
-= sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
-   m_vector_manager->vector_exprs.length ());
-  m_vector_manager->vector_comp
-= sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
-   m_vector_manager->vector_exprs.length ());
-  m_vector_manager->vector_avin
-= sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
-   m_vector_manager->vector_exprs.length ());
-  m_vector_manager->vector_avout
-= sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
-   m_vector_manager->vector_exprs.length ());
-  m_vector_manager->vector_kill
-= sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
-   m_vector_manager->vector_exprs.length ());
-
-  bitmap_vector_ones (m_vector_manager->vector_transp,
- last_basic_block_for_fn (cfun));
-  bitmap_vector_clear (m_vector_manager->vector_antic,
-  last_basic_block_for_fn (cfun));
-  bitmap_vector_clear (m_vector_manager->vector_comp,
-  last_basic_block_for_fn (cfun));
+  m_vector_manager->create_bitmap_vectors ();
   compute

Re: [PATCH] c++: Only do maybe_init_list_as_range optimization if !processing_template_decl [PR108047]

2023-01-09 Thread Jason Merrill via Gcc-patches

On 1/9/23 05:19, Jakub Jelinek wrote:

Hi!

The last testcase in this patch ICEs, because
maybe_init_list_as_range -> maybe_init_list_as_array
calls maybe_constant_init in:
   /* Don't do this if the conversion would be constant.  */
   first = maybe_constant_init (first);
   if (TREE_CONSTANT (first))
 return NULL_TREE;
but maybe_constant_init shouldn't be called when processing_template_decl.
While we could replace that call with fold_non_dependent_init, my limited
understanding is that this is an optimization and even if we don't optimize
it when processing_template_decl, build_user_type_conversion_1 will be
called again during instantiation with !processing_template_decl if it is
ever instantiated and we can do the optimization only then.
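
For reference, user code of roughly this shape reaches the path in
question (a simplified sketch modelled on the new tests; count_words and
count_words_instantiated are made-up names).  The front end sees the
braced initializer once while parsing the template, with
processing_template_decl set, and again when count_words<0> is
instantiated, which is where the optimization can now happen:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// The braced list of string literals initializes a std::vector inside a
// template, so it is first processed in a template-parsing context.
template <int N>
std::size_t count_words ()
{
  std::vector<std::string> words = { "abacus", "abated", "aboard" };
  return words.size ();
}

// Instantiation: the body is processed again outside template-parsing
// mode.
std::size_t count_words_instantiated ()
{
  return count_words<0> ();
}
```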

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


Or do you want fold_non_dependent_init instead?

2023-01-09  Jakub Jelinek  

PR c++/105838
PR c++/108047
PR c++/108266
* call.cc (maybe_init_list_as_range): Always return NULL_TREE if
processing_template_decl.

* g++.dg/tree-ssa/initlist-opt2.C: New test.
* g++.dg/tree-ssa/initlist-opt3.C: New test.

--- gcc/cp/call.cc.jj   2022-12-15 09:24:44.265935297 +0100
+++ gcc/cp/call.cc  2023-01-06 11:24:44.837270905 +0100
@@ -4285,7 +4285,8 @@ maybe_init_list_as_array (tree elttype,
  static tree
  maybe_init_list_as_range (tree fn, tree expr)
  {
-  if (BRACE_ENCLOSED_INITIALIZER_P (expr)
+  if (!processing_template_decl
+  && BRACE_ENCLOSED_INITIALIZER_P (expr)
&& is_list_ctor (fn)
&& decl_in_std_namespace_p (fn))
  {
--- gcc/testsuite/g++.dg/tree-ssa/initlist-opt2.C.jj   2023-01-06 11:53:13.160432870 +0100
+++ gcc/testsuite/g++.dg/tree-ssa/initlist-opt2.C      2023-01-06 11:53:44.561976302 +0100
@@ -0,0 +1,31 @@
+// PR c++/105838
+// { dg-additional-options -fdump-tree-gimple }
+// { dg-do compile { target c++11 } }
+
+// Test that we do range-initialization from const char *.
+// { dg-final { scan-tree-dump {_M_range_initialize} 
"gimple" } }
+
+#include 
+#include 
+
+void g (const void *);
+
+template 
+void f (const char *p)
+{
+  std::vector lst = {
+  "aahing", "aaliis", "aarrgh", "abacas", "abacus", "abakas", "abamps", "abands", "abased", 
"abaser", "abases", "abasia",
+  "abated", "abater", "abates", "abatis", "abator", "abattu", "abayas", "abbacy", "abbess", 
"abbeys", "abbots", "abcees",
+  "abdabs", "abduce", "abduct", "abears", "abeigh", "abeles", "abelia", "abends", "abhors", 
"abided", "abider", "abides",
+  "abject", "abjure", "ablate", "ablaut", "ablaze", "ablest", "ablets", "abling", "ablins", 
"abloom", "ablush", "abmhos",
+  "aboard", "aboded", "abodes", "abohms", "abolla", "abomas", "aboral", "abords", "aborne", 
"aborts", "abound", "abouts",
+  "aboves", "abrade", "abraid", "abrash", "abrays", "abrazo", "abrege", "abrins", "abroad", 
"abrupt", "abseil", "absent",
+  };
+
+  g(&lst);
+}
+
+void h (const char *p)
+{
+  f<0> (p);
+}
--- gcc/testsuite/g++.dg/tree-ssa/initlist-opt3.C.jj   2023-01-06 11:56:36.981469370 +0100
+++ gcc/testsuite/g++.dg/tree-ssa/initlist-opt3.C      2023-01-06 11:56:09.984861898 +0100
@@ -0,0 +1,21 @@
+// PR c++/108266
+// { dg-do compile { target c++11 } }
+
+#include 
+#include 
+
+struct S { S (const char *); };
+void bar (std::vector);
+
+template 
+void
+foo ()
+{
+  bar ({"", ""});
+}
+
+void
+baz ()
+{
+  foo<0> ();
+}

Jakub





[PATCH] RISC-V: Avoid redundant flow in forward fusion

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::forward_demand_fusion): 
Add pre-check for redundant flow.

---
 gcc/config/riscv/riscv-vsetvl.cc | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 7800c2ee509..18c6f437db6 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2140,6 +2140,9 @@ pass_vsetvl::forward_demand_fusion (void)
   if (!prop.valid_or_dirty_p ())
continue;
 
+  if (cfg_bb == ENTRY_BLOCK_PTR_FOR_FN (cfun))
+   continue;
+
   edge e;
   edge_iterator ei;
   /* Forward propagate to each successor.  */
@@ -2153,6 +2156,11 @@ pass_vsetvl::forward_demand_fusion (void)
  /* It's quite obvious, we don't need to propagate itself.  */
  if (e->dest->index == cfg_bb->index)
continue;
+ /* We don't propagate through critical edges.  */
+ if (e->flags & EDGE_COMPLEX)
+   continue;
+ if (e->dest->index == EXIT_BLOCK_PTR_FOR_FN (cfun)->index)
+   continue;
 
  /* If there is nothing to propagate, just skip it.  */
  if (!local_dem.valid_or_dirty_p ())
-- 
2.36.1



Re: [PATCH] Modula-2: fix documentation layout

2023-01-09 Thread Gaius Mulley via Gcc-patches
Eric Botcazou  writes:

> Hi,
>
> the Modula-2 documentation is rejected by older versions of Makeinfo because 
> the web of @node markers is fairly broken, apparently some subsections were 
> moved around, most notably between the Overview and Using sections, and the 
> @node markers were not (properly) adjusted.
>
> This patch allows me to build it with these older versions, as well as with 
> modern versions.  OK for mainline?
>
>
> 2023-01-09  Eric Botcazou  
>
>   * doc/gm2.texi (Overview): Fix @node markers.
>   (Using): Likewise.  Remove subsections that were moved to
>   Overview from the menu and move others around.

Hi Eric,

yes indeed and thanks for the patch!

regards,
Gaius


[PATCH] bpf: correct bpf_print_operand for floats [PR108293]

2023-01-09 Thread David Faust via Gcc-patches
The existing logic in bpf_print_operand was only correct for integral
CONST_DOUBLEs, and emitted garbage for floating point modes. Fix it so
floating point mode operands are correctly handled.
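
To make the intended output concrete, here is a host-side sketch of the
same formatting.  It is not the bpf.cc code: hex_of_float and
hex_of_double are hypothetical helpers, and the sketch assumes IEEE-754
float and double of 32 and 64 bits on the host.  The words_big_endian
parameter imitates the target word order that real_to_target is described
as using in the patch comment.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Format the bit pattern of a single-precision value as 0xXXXXXXXX.
std::string hex_of_float (float v)
{
  std::uint32_t bits;
  std::memcpy (&bits, &v, sizeof bits);
  char buf[24];
  std::snprintf (buf, sizeof buf, "0x%08lx", (unsigned long) bits);
  return buf;
}

// Format the bit pattern of a double as one 64-bit hex immediate.
// vals[] is filled in "target word order" and then printed most
// significant word first, whichever order that is.
std::string hex_of_double (double v, bool words_big_endian)
{
  std::uint64_t bits;
  std::memcpy (&bits, &v, sizeof bits);
  unsigned long hi = (unsigned long) (bits >> 32);
  unsigned long lo = (unsigned long) (bits & 0xffffffffu);

  unsigned long vals[2];
  vals[0] = words_big_endian ? hi : lo;
  vals[1] = words_big_endian ? lo : hi;

  char buf[24];
  if (words_big_endian)
    std::snprintf (buf, sizeof buf, "0x%08lx%08lx", vals[0], vals[1]);
  else
    std::snprintf (buf, sizeof buf, "0x%08lx%08lx", vals[1], vals[0]);
  return buf;
}
```

Either word order yields the same immediate, for example
0x3ff0000000000000 for 1.0, which is why the little-endian and big-endian
tests can scan for the same prefixes.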

Tested on bpf-unknown-none, no known regressions.
OK to check-in?

Thanks.


PR target/108293

gcc/

* config/bpf/bpf.cc (bpf_print_operand): Correct handling for
floating point modes.

gcc/testsuite/

* gcc.target/bpf/double-1.c: New test.
* gcc.target/bpf/double-2.c: New test.
* gcc.target/bpf/float-1.c: New test.
---
 gcc/config/bpf/bpf.cc   | 21 ++---
 gcc/testsuite/gcc.target/bpf/double-1.c | 12 
 gcc/testsuite/gcc.target/bpf/double-2.c | 12 
 gcc/testsuite/gcc.target/bpf/float-1.c  | 12 
 4 files changed, 50 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/bpf/double-1.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/double-2.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/float-1.c

diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
index 2aeaeaf309b..9dde3944e9c 100644
--- a/gcc/config/bpf/bpf.cc
+++ b/gcc/config/bpf/bpf.cc
@@ -880,13 +880,20 @@ bpf_print_operand (FILE *file, rtx op, int code ATTRIBUTE_UNUSED)
   output_address (GET_MODE (op), XEXP (op, 0));
   break;
 case CONST_DOUBLE:
-  if (CONST_DOUBLE_HIGH (op))
-   fprintf (file, HOST_WIDE_INT_PRINT_DOUBLE_HEX,
-CONST_DOUBLE_HIGH (op), CONST_DOUBLE_LOW (op));
-  else if (CONST_DOUBLE_LOW (op) < 0)
-   fprintf (file, HOST_WIDE_INT_PRINT_HEX, CONST_DOUBLE_LOW (op));
-  else
-   fprintf (file, HOST_WIDE_INT_PRINT_DEC, CONST_DOUBLE_LOW (op));
+  long vals[2];
+  real_to_target (vals, CONST_DOUBLE_REAL_VALUE (op), GET_MODE (op));
+  vals[0] &= 0x;
+  vals[1] &= 0x;
+  if (GET_MODE (op) == SFmode)
+   fprintf (file, "0x%08lx", vals[0]);
+  else if (GET_MODE (op) == DFmode)
+   {
+ /* Note: real_to_target puts vals in target word order.  */
+ if (WORDS_BIG_ENDIAN)
+   fprintf (file, "0x%08lx%08lx", vals[0], vals[1]);
+ else
+   fprintf (file, "0x%08lx%08lx", vals[1], vals[0]);
+   }
   break;
 default:
   output_addr_const (file, op);
diff --git a/gcc/testsuite/gcc.target/bpf/double-1.c b/gcc/testsuite/gcc.target/bpf/double-1.c
new file mode 100644
index 000..200f1bd18f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/double-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-mlittle-endian" } */
+
+double f;
+double a() { f = 1.0; return 1.0; }
+double b() { f = 2.0; return 2.0; }
+double c() { f = 2.0; return 3.0; }
+double d() { f = 3.0; return 3.0; }
+
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x3ff0" 2 } } */
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x4000" 3 } } */
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x4008" 3 } } */
diff --git a/gcc/testsuite/gcc.target/bpf/double-2.c b/gcc/testsuite/gcc.target/bpf/double-2.c
new file mode 100644
index 000..d04ddd0c575
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/double-2.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-mbig-endian" } */
+
+double f;
+double a() { f = 1.0; return 1.0; }
+double b() { f = 2.0; return 2.0; }
+double c() { f = 2.0; return 3.0; }
+double d() { f = 3.0; return 3.0; }
+
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x3ff0" 2 } } */
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x4000" 3 } } */
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x4008" 3 } } */
diff --git a/gcc/testsuite/gcc.target/bpf/float-1.c b/gcc/testsuite/gcc.target/bpf/float-1.c
new file mode 100644
index 000..05ed7bb651d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/float-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-mlittle-endian" } */
+
+float f;
+float a() { f = 1.0; return 1.0; }
+float b() { f = 2.0; return 2.0; }
+float c() { f = 2.0; return 3.0; }
+float d() { f = 3.0; return 3.0; }
+
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x3f80" 2 } } */
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x4000" 3 } } */
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x4040" 3 } } */
-- 
2.39.0



[PATCH] RISC-V: Refine codes in backward fusion

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

This NFC patch prepares for the following patches.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::backward_demand_fusion): 
Refine codes.

---
 gcc/config/riscv/riscv-vsetvl.cc | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 18c6f437db6..0ca1c51d121 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2010,15 +2010,16 @@ pass_vsetvl::backward_demand_fusion (void)
   for (const bb_info *bb : crtl->ssa->reverse_bbs ())
 {
   basic_block cfg_bb = bb->cfg_bb ();
-  const auto &prop
-   = m_vector_manager->vector_block_infos[cfg_bb->index].local_dem;
+  const auto &curr_block_info
+   = m_vector_manager->vector_block_infos[cfg_bb->index];
+  const auto &prop = curr_block_info.local_dem;
 
   /* If there is nothing to propagate, just skip it.  */
   if (!prop.valid_or_dirty_p ())
continue;
 
-  if (!backward_propagate_worthwhile_p (
-   cfg_bb, m_vector_manager->vector_block_infos[cfg_bb->index]))
+  if (!backward_propagate_worthwhile_p (cfg_bb, curr_block_info))
+   continue;
continue;
 
   edge e;
-- 
2.36.1



[PATCH] RISC-V: Avoid redundant flow in backward fusion

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::backward_demand_fusion): 
Add pre-check to avoid redundant flow.

---
 gcc/config/riscv/riscv-vsetvl.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 0ca1c51d121..a972c49561a 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2020,6 +2020,8 @@ pass_vsetvl::backward_demand_fusion (void)
 
   if (!backward_propagate_worthwhile_p (cfg_bb, curr_block_info))
continue;
+
+  if (cfg_bb == EXIT_BLOCK_PTR_FOR_FN (cfun))
continue;
 
   edge e;
-- 
2.36.1



[PATCH] RISC-V: Rename insn into rinsn for rtx_insn *

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

This pass is implemented on top of the RTL_SSA framework.  RTL_SSA names
insn_info * variables insn and rtx_insn * variables rinsn.  I follow this
convention in the pass but missed this function, so rename its parameter
to keep the code consistent with the RTL_SSA framework.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (add_label_notes):

---
 gcc/config/riscv/riscv-vsetvl.cc | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index a972c49561a..3c920779992 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -682,7 +682,7 @@ insert_vsetvl (enum emit_type emit_type, rtx_insn *rinsn,
necessary REG_LABEL_OPERAND and REG_LABEL_TARGET notes.  */
 
 static void
-add_label_notes (rtx x, rtx_insn *insn)
+add_label_notes (rtx x, rtx_insn *rinsn)
 {
   enum rtx_code code = GET_CODE (x);
   int i, j;
@@ -699,8 +699,8 @@ add_label_notes (rtx x, rtx_insn *insn)
   /* There's no reason for current users to emit jump-insns with
 such a LABEL_REF, so we don't have to handle REG_LABEL_TARGET
 notes.  */
-  gcc_assert (!JUMP_P (insn));
-  add_reg_note (insn, REG_LABEL_OPERAND, label_ref_label (x));
+  gcc_assert (!JUMP_P (rinsn));
+  add_reg_note (rinsn, REG_LABEL_OPERAND, label_ref_label (x));
 
   if (LABEL_P (label_ref_label (x)))
LABEL_NUSES (label_ref_label (x))++;
@@ -711,10 +711,10 @@ add_label_notes (rtx x, rtx_insn *insn)
   for (i = GET_RTX_LENGTH (code) - 1, fmt = GET_RTX_FORMAT (code); i >= 0; i--)
 {
   if (fmt[i] == 'e')
-   add_label_notes (XEXP (x, i), insn);
+   add_label_notes (XEXP (x, i), rinsn);
   else if (fmt[i] == 'E')
for (j = XVECLEN (x, i) - 1; j >= 0; j--)
- add_label_notes (XVECEXP (x, i, j), insn);
+ add_label_notes (XVECEXP (x, i, j), rinsn);
 }
 }
 
-- 
2.36.1



[PATCH] RISC-V: Remove dirty_pat since it is redundant

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (vector_insn_info::operator==): Remove 
dirty_pat.
(vector_insn_info::merge): Ditto.
(vector_insn_info::dump): Ditto.
(pass_vsetvl::merge_successors): Ditto.
(pass_vsetvl::backward_demand_fusion): Ditto.
(pass_vsetvl::forward_demand_fusion): Ditto.
(pass_vsetvl::commit_vsetvls): Ditto.
* config/riscv/riscv-vsetvl.h: Ditto.

---
 gcc/config/riscv/riscv-vsetvl.cc | 28 
 gcc/config/riscv/riscv-vsetvl.h  | 11 +--
 2 files changed, 13 insertions(+), 26 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 3c920779992..0f12d4ddb23 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1083,10 +1083,10 @@ vector_insn_info::operator== (const vector_insn_info &other) const
 if (m_demands[i] != other.demand_p ((enum demand_type) i))
   return false;
 
-  if (m_insn != other.get_insn ())
-return false;
-  if (m_dirty_pat != other.get_dirty_pat ())
-return false;
+  if (vector_config_insn_p (m_insn->rtl ())
+  || vector_config_insn_p (other.get_insn ()->rtl ()))
+if (m_insn != other.get_insn ())
+  return false;
 
   if (!same_avl_p (other))
 return false;
@@ -1317,8 +1317,6 @@ vector_insn_info::merge (const vector_insn_info &merge_info,
new_info.set_insn (merge_info.get_insn ());
 }
 
-  new_info.set_dirty_pat (merge_info.get_dirty_pat ());
-
   if (!demand_p (DEMAND_AVL) && !merge_info.demand_p (DEMAND_AVL))
 new_info.undemand (DEMAND_AVL);
   if (!demand_p (DEMAND_SEW) && !merge_info.demand_p (DEMAND_SEW))
@@ -1431,11 +1429,6 @@ vector_insn_info::dump (FILE *file) const
  fprintf (file, "The real INSN=");
  print_rtl_single (file, get_insn ()->rtl ());
}
-  if (get_dirty_pat ())
-   {
- fprintf (file, "Dirty RTL Pattern=");
- print_rtl_single (file, get_dirty_pat ());
-   }
 }
 }
 
@@ -1967,7 +1960,6 @@ pass_vsetvl::merge_successors (const basic_block father,
 
   new_info.set_dirty ();
   rtx new_pat = gen_vsetvl_pat (new_info.get_insn ()->rtl (), new_info);
-  new_info.set_dirty_pat (new_pat);
 
   father_info.local_dem = new_info;
   father_info.reaching_out = new_info;
@@ -2051,7 +2043,6 @@ pass_vsetvl::backward_demand_fusion (void)
 
  block_info.reaching_out = prop;
  block_info.reaching_out.set_dirty ();
- block_info.reaching_out.set_dirty_pat (new_pat);
  block_info.local_dem = block_info.reaching_out;
  changed_p = true;
}
@@ -2080,7 +2071,6 @@ pass_vsetvl::backward_demand_fusion (void)
  rtx new_pat
= gen_vsetvl_pat (new_info.get_insn ()->rtl (), new_info);
  new_info.set_dirty ();
- new_info.set_dirty_pat (new_pat);
  block_info.local_dem = new_info;
  block_info.reaching_out = new_info;
  changed_p = true;
@@ -2178,7 +2168,6 @@ pass_vsetvl::forward_demand_fusion (void)
= gen_vsetvl_pat (prop.get_insn ()->rtl (), prop);
  local_dem = prop;
  local_dem.set_dirty ();
- local_dem.set_dirty_pat (dirty_pat);
  reaching_out = local_dem;
}
  else
@@ -2507,10 +2496,17 @@ pass_vsetvl::commit_vsetvls (void)
   if (!reaching_out.dirty_p ())
continue;
 
-  rtx new_pat = reaching_out.get_dirty_pat ();
+
+  rtx new_pat;
   if (can_refine_vsetvl_p (cfg_bb, reaching_out.get_ratio ()))
new_pat
  = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY, reaching_out, NULL_RTX);
+  else if (vlmax_avl_p (reaching_out.get_avl ()))
+   new_pat = gen_vsetvl_pat (VSETVL_NORMAL, reaching_out,
+ get_vl (reaching_out.get_insn ()->rtl ()));
+  else
+   new_pat
+ = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, reaching_out, NULL_RTX);
 
   start_sequence ();
   emit_insn (new_pat);
diff --git a/gcc/config/riscv/riscv-vsetvl.h b/gcc/config/riscv/riscv-vsetvl.h
index dc16c55b918..563ad3084ed 100644
--- a/gcc/config/riscv/riscv-vsetvl.h
+++ b/gcc/config/riscv/riscv-vsetvl.h
@@ -220,13 +220,6 @@ private:
  (with AVL included) before vmv.x.s, but vmv.x.s is not the INSN holding the
  definition of AVL.  */
   rtl_ssa::insn_info *m_insn;
-  /* Save instruction pattern for Dirty block.
- Since empty block may be polluted as a dirty block during dem backward
- propagation (phase 3) which is intending to cheat LCM there is a VSETVL
- instruction here to gain better LCM optimization. Such instruction is not
- emit yet, we save this here and then emit it in the 4th phase if it is
- necessary.  */
-  rtx m_dirty_pat;
 
   /* Parse the instruction to get VL/VTYPE information an

[PATCH] RISC-V: Add probability model of each block to prevent endless loop of Phase 3

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

Notice that the pass simply uses a probability >= 50% as the condition
for doing the backward fusion, which can create an endless loop in Phase 3.

Add a per-block probability to fix this bug.
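
The propagation in compute_probabilities can be sketched as follows.  This
is a simplification under stated assumptions: plain doubles replace
profile_probability, blocks are assumed to be numbered so that every
predecessor is visited before its successors (no back edges), and the
names Edge and compute_block_probabilities are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CFG edge: destination block index and branch probability.
struct Edge { int dest; double probability; };

// succs[bb] lists the outgoing edges of block bb.  Returns the estimated
// probability of reaching each block from the entry block.
std::vector<double>
compute_block_probabilities (const std::vector<std::vector<Edge>> &succs,
                             int entry)
{
  std::vector<double> prob (succs.size (), 0.0);
  prob[entry] = 1.0;  // the entry block is always executed

  for (std::size_t bb = 0; bb < succs.size (); ++bb)
    for (const Edge &e : succs[bb])
      {
        // Already certain to be reached: nothing to add.
        if (prob[e.dest] >= 1.0)
          continue;
        // Accumulate the contribution of this incoming path.
        prob[e.dest] += prob[bb] * e.probability;
      }
  return prob;
}
```

For a diamond-shaped CFG (block 0 branches 50/50 to blocks 1 and 2, which
both fall through to block 3), the two arms get 0.5 each and the join
block gets 1.0.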
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc 
(vector_infos_manager::vector_infos_manager): Add probability.
(vector_infos_manager::dump): Ditto.
(pass_vsetvl::compute_probabilities): Ditto.
* config/riscv/riscv-vsetvl.h (struct vector_block_info): Ditto.

---
 gcc/config/riscv/riscv-vsetvl.cc | 39 
 gcc/config/riscv/riscv-vsetvl.h  |  3 +++
 2 files changed, 42 insertions(+)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 0f12d4ddb23..7d8c3a32aaa 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1465,6 +1465,7 @@ vector_infos_manager::vector_infos_manager ()
  vector_block_infos[bb->index ()].reaching_out = vector_insn_info ();
  for (insn_info *insn : bb->real_insns ())
vector_insn_infos[insn->uid ()].parse_insn (insn);
+ vector_block_infos[bb->index ()].probability = profile_probability ();
}
 }
 }
@@ -1642,6 +1643,8 @@ vector_infos_manager::dump (FILE *file) const
}
   fprintf (file, "=");
   vector_block_infos[cfg_bb->index].reaching_out.dump (file);
+  fprintf (file, "=");
+  vector_block_infos[cfg_bb->index].probability.dump (file);
   fprintf (file, "\n\n");
 }
 
@@ -1764,6 +1767,7 @@ private:
 
   void init (void);
   void done (void);
+  void compute_probabilities (void);
 
 public:
   pass_vsetvl (gcc::context *ctxt) : rtl_opt_pass (pass_data_vsetvl, ctxt) {}
@@ -2629,6 +2633,41 @@ pass_vsetvl::done (void)
   m_vector_manager = nullptr;
 }
 
+/* Compute probability for each block.  */
+void
+pass_vsetvl::compute_probabilities (void)
+{
+  /* Don't compute it in -O0 since we don't need it.  */
+  if (!optimize)
+return;
+  edge e;
+  edge_iterator ei;
+
+  for (const bb_info *bb : crtl->ssa->bbs ())
+{
+  basic_block cfg_bb = bb->cfg_bb ();
+  auto &curr_prob
+   = m_vector_manager->vector_block_infos[cfg_bb->index].probability;
+  if (ENTRY_BLOCK_PTR_FOR_FN (cfun) == cfg_bb)
+   curr_prob = profile_probability::always ();
+  gcc_assert (curr_prob.initialized_p ());
+  FOR_EACH_EDGE (e, ei, cfg_bb->succs)
+   {
+ auto &new_prob
+   = m_vector_manager->vector_block_infos[e->dest->index].probability;
+ if (!new_prob.initialized_p ())
+   new_prob = curr_prob * e->probability;
+ else if (new_prob == profile_probability::always ())
+   continue;
+ else
+   new_prob += curr_prob * e->probability;
+   }
+}
+  auto &exit_block
+= m_vector_manager->vector_block_infos[EXIT_BLOCK_PTR_FOR_FN (cfun)->index];
+  exit_block.probability = profile_probability::always ();
+}
+
 /* Lazy vsetvl insertion for optimize > 0. */
 void
 pass_vsetvl::lazy_vsetvl (void)
diff --git a/gcc/config/riscv/riscv-vsetvl.h b/gcc/config/riscv/riscv-vsetvl.h
index 563ad3084ed..fb3ebb9db79 100644
--- a/gcc/config/riscv/riscv-vsetvl.h
+++ b/gcc/config/riscv/riscv-vsetvl.h
@@ -291,6 +291,9 @@ struct vector_block_info
   /* The reaching_out vector insn_info of the block.  */
   vector_insn_info reaching_out;
 
+  /* The static execute probability of the demand info.  */
+  profile_probability probability;
+
   vector_block_info () = default;
 };
 
-- 
2.36.1
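
The propagation done by compute_probabilities in the patch above can be
sketched outside of GCC roughly as follows.  This is only an illustration
under stated assumptions: Edge, the plain double probabilities and the
compute_probabilities signature here are hypothetical stand-ins for GCC's
edge, profile_probability and the pass member function, and the blocks are
assumed to be numbered so that every block is reached from an earlier one.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a CFG edge: destination block index plus the
// probability that control leaves the source block through this edge.
struct Edge
{
  std::size_t dest;
  double probability;
};

// succs[i] lists the outgoing edges of block i.  An empty optional plays
// the role of an uninitialized profile_probability.
std::vector<double>
compute_probabilities (const std::vector<std::vector<Edge>> &succs,
                       std::size_t entry, std::size_t exit)
{
  std::vector<std::optional<double>> prob (succs.size ());

  for (std::size_t bb = 0; bb < succs.size (); ++bb)
    {
      if (bb == entry)
        prob[bb] = 1.0; // The entry block always executes.
      // Mirrors the gcc_assert in the patch: a predecessor must already
      // have given this block a probability.
      assert (prob[bb].has_value ());

      for (const Edge &e : succs[bb])
        {
          std::optional<double> &dest = prob[e.dest];
          if (!dest)
            dest = *prob[bb] * e.probability;  // First incoming edge seen.
          else if (*dest == 1.0)
            continue;                          // Already "always".
          else
            *dest += *prob[bb] * e.probability; // Accumulate other paths.
        }
    }

  prob[exit] = 1.0; // The exit block is forced to "always" at the end.

  std::vector<double> result;
  for (const std::optional<double> &p : prob)
    result.push_back (*p);
  return result;
}
```

Calling compute_probabilities (succs, 0, succs.size () - 1) on a small
diamond-shaped graph gives each join block the sum of the probabilities
flowing into it, which is the per-block value the pass stores in
vector_block_info::probability.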



[PATCH] RISC-V: Call DCE to remove redundant instructions created by the PASS

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::done): Add DCE.
* config/riscv/t-riscv: Add DCE.

---
 gcc/config/riscv/riscv-vsetvl.cc | 2 ++
 gcc/config/riscv/t-riscv | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 7d8c3a32aaa..7aa2852b456 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "predict.h"
 #include "profile-count.h"
 #include "riscv-vsetvl.h"
+#include "dce.h"
 
 using namespace rtl_ssa;
 using namespace riscv_vector;
@@ -2627,6 +2628,7 @@ pass_vsetvl::done (void)
cleanup_cfg (0);
   delete crtl->ssa;
   crtl->ssa = nullptr;
+  run_fast_dce ();
 }
   m_vector_manager->release ();
   delete m_vector_manager;
diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv
index d30e0235356..c95f4aff358 100644
--- a/gcc/config/riscv/t-riscv
+++ b/gcc/config/riscv/t-riscv
@@ -54,7 +54,7 @@ riscv-c.o: $(srcdir)/config/riscv/riscv-c.cc $(CONFIG_H) $(SYSTEM_H) \
 riscv-vsetvl.o: $(srcdir)/config/riscv/riscv-vsetvl.cc \
   $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(REGS_H) \
   $(TARGET_H) tree-pass.h df.h rtl-ssa.h cfgcleanup.h insn-config.h \
-  insn-attr.h insn-opinit.h tm-constrs.h cfgrtl.h cfganal.h lcm.h \
+  insn-attr.h insn-opinit.h tm-constrs.h cfgrtl.h cfganal.h lcm.h dce.h \
   predict.h profile-count.h $(srcdir)/config/riscv/riscv-vsetvl.h
$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
$(srcdir)/config/riscv/riscv-vsetvl.cc
-- 
2.36.1



[PATCH] RISC-V: Fix bugs of supporting AVL=REG (single-real-def) in VSETVL PASS

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (same_bb_and_before_p): Remove it.
(real_insn_and_same_bb_p): New function.
(same_bb_and_after_or_equal_p): Remove it.
(before_p): New function.
(reg_killed_by_bb_p): Ditto.
(has_vsetvl_killed_avl_p): Ditto.
(get_vl): Move location so that we can call it.
(anticipatable_occurrence_p): Fix issue of AVL=REG support.
(available_occurrence_p): Ditto.
(dominate_probability_p): Remove it.
(can_backward_propagate_p): Remove it.
(get_all_nonphi_defs): New function.
(get_all_predecessors): Ditto.
(any_insn_in_bb_p): Ditto.
(insert_vsetvl): Adjust AVL REG.
(source_equal_p): New function.
(extract_single_source): Ditto.
(avl_info::single_source_equal_p): Ditto.
(avl_info::operator==): Adjust for AVL=REG.
(vl_vtype_info::same_avl_p): Ditto.
(vector_insn_info::set_demand_info): Remove it.
(vector_insn_info::compatible_p): Adjust for AVL=REG.
(vector_insn_info::compatible_avl_p): New function.
(vector_insn_info::merge): Adjust AVL=REG.
(vector_insn_info::dump): Ditto.
(pass_vsetvl::merge_successors): Remove it.
(enum fusion_type): New enum.
(pass_vsetvl::get_backward_fusion_type): New function.
(pass_vsetvl::backward_demand_fusion): Adjust for AVL=REG.
(pass_vsetvl::forward_demand_fusion): Ditto.
(pass_vsetvl::demand_fusion): Ditto.
(pass_vsetvl::prune_expressions): Ditto.
(pass_vsetvl::compute_local_properties): Ditto.
(pass_vsetvl::cleanup_vsetvls): Ditto.
(pass_vsetvl::commit_vsetvls): Ditto.
(pass_vsetvl::init): Ditto.
* config/riscv/riscv-vsetvl.h (enum fusion_type): New enum.
(enum merge_type): New enum.

---
 gcc/config/riscv/riscv-vsetvl.cc | 928 +--
 gcc/config/riscv/riscv-vsetvl.h  |  68 ++-
 2 files changed, 710 insertions(+), 286 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 7aa2852b456..0245124e28f 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -178,34 +178,97 @@ vsetvl_insn_p (rtx_insn *rinsn)
 || INSN_CODE (rinsn) == CODE_FOR_vsetvlsi);
 }
 
-/* Return true if INSN1 comes befeore INSN2 in the same block.  */
 static bool
-same_bb_and_before_p (const insn_info *insn1, const insn_info *insn2)
+real_insn_and_same_bb_p (const insn_info *insn, const bb_info *bb)
 {
-  return ((insn1->bb ()->index () == insn2->bb ()->index ())
-&& (*insn1 < *insn2));
+  return insn != nullptr && insn->is_real () && insn->bb () == bb;
 }
 
-/* Return true if INSN1 comes after or equal INSN2 in the same block.  */
 static bool
-same_bb_and_after_or_equal_p (const insn_info *insn1, const insn_info *insn2)
+before_p (const insn_info *insn1, const insn_info *insn2)
 {
-  return ((insn1->bb ()->index () == insn2->bb ()->index ())
-&& (*insn1 >= *insn2));
+  return insn1->compare_with (insn2) == -1;
+}
+
+static bool
+reg_killed_by_bb_p (const bb_info *bb, rtx x)
+{
+  if (!x || vlmax_avl_p (x))
+return false;
+  for (const insn_info *insn : bb->real_nondebug_insns ())
+if (find_access (insn->defs (), REGNO (x)))
+  return true;
+  return false;
+}
+
+static bool
+has_vsetvl_killed_avl_p (const bb_info *bb, const vector_insn_info &info)
+{
+  if (info.dirty_with_killed_avl_p ())
+{
+  rtx avl = info.get_avl ();
+  for (const insn_info *insn : bb->reverse_real_nondebug_insns ())
+   {
+ def_info *def = find_access (insn->defs (), REGNO (avl));
+ if (def)
+   {
+ set_info *set = safe_dyn_cast<set_info *> (def);
+ if (!set)
+   return false;
+
+ rtx new_avl = gen_rtx_REG (GET_MODE (avl), REGNO (avl));
+ gcc_assert (new_avl != avl);
+ if (!info.compatible_avl_p (avl_info (new_avl, set)))
+   return false;
+
+ return true;
+   }
+   }
+}
+  return false;
+}
+
+/* Helper function to get VL operand.  */
+static rtx
+get_vl (rtx_insn *rinsn)
+{
+  if (has_vl_op (rinsn))
+{
+  extract_insn_cached (rinsn);
+  return recog_data.operand[get_attr_vl_op_idx (rinsn)];
+}
+  return SET_DEST (XVECEXP (PATTERN (rinsn), 0, 0));
 }
 
 /* An "anticipatable occurrence" is one that is the first occurrence in the
basic block, the operands are not modified in the basic block prior
to the occurrence and the output is not used between the start of
-   the block and the occurrence.  */
+   the block and the occurrence.
+
+   For VSETVL instruction, we have these following formats:
+ 1. vsetvl zero, rs1.
+ 2. vsetvl zero, imm.
+ 3. vsetvl rd, rs1.
+
+   So base on these circumstances, a DEM is considered as a local anticipatable
+   occurrence should satisfy these foll

[PATCH] RISC-V: Adjust testcases for AVL=REG support

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-3.c: Adjust testcase.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-4.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/imm_conflict-4.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/imm_conflict-5.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-17.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-27.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-28.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-45.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-25.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-26.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-27.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-28.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-7.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-12.c: Ditto.

---
 .../gcc.target/riscv/rvv/vsetvl/imm_bb_prop-3.c|  2 +-
 .../gcc.target/riscv/rvv/vsetvl/imm_bb_prop-4.c|  2 +-
 .../gcc.target/riscv/rvv/vsetvl/imm_conflict-4.c   | 12 ++--
 .../gcc.target/riscv/rvv/vsetvl/imm_conflict-5.c   | 12 ++--
 .../riscv/rvv/vsetvl/imm_loop_invariant-17.c   |  3 +--
 .../riscv/rvv/vsetvl/vlmax_back_prop-27.c  |  4 ++--
 .../riscv/rvv/vsetvl/vlmax_back_prop-28.c  |  4 ++--
 .../riscv/rvv/vsetvl/vlmax_back_prop-45.c  |  2 +-
 .../gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-25.c | 14 +++---
 .../gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-26.c | 12 ++--
 .../gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-27.c | 12 ++--
 .../gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-28.c |  2 +-
 .../gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c  |  2 +-
 .../gcc.target/riscv/rvv/vsetvl/vlmax_conflict-7.c |  1 -
 .../riscv/rvv/vsetvl/vlmax_switch_vtype-12.c   |  2 +-
 15 files changed, 42 insertions(+), 44 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-3.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-3.c
index 3da7b8722c2..20a1cd27c43 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-3.c
@@ -19,4 +19,4 @@ void f(void *base, void *out, void *mask_in, size_t vl, size_t m) {
   }
 }
 
-/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*4,\s*e8,\s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-g" no-opts "-funroll-loops" } } } } */
+/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*4,\s*e16,\s*mf4,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-4.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-4.c
index 2a9616eb7ea..58aecb0a219 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-4.c
@@ -21,5 +21,5 @@ void f(void *base, void *out, void *mask_in, size_t vl, size_t m, size_t n) {
   }
 }
 
-/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*4,\s*e8,\s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-g" no-opts "-funroll-loops" } } } } */
+/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*4,\s*e16,\s*mf4,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-g" no-opts "-funroll-loops" } } } } */
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_conflict-4.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_conflict-4.c
index f24e129b4dc..fdfcb07a63d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_conflict-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_conflict-4.c
@@ -30,9 +30,9 @@ void f (void * restrict in, void * restrict out, int n, int cond)
   }
 }
 
-/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*5,\s*e8,\s*mf8,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
-/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*19,\s*e32,\s*m1,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
-/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*8,\s*e32,\s*m1,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-g" no-opts "-funroll-loops" } } } } */
-/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e8,\s*mf8,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0"  no-opts "-funroll-loops" no-opts "-g" } } } } */
-/* { dg-final { scan-assembler-times {vsetivli} 3 { target { no-opts "-O0" no-opts "-O1" no-opts "-funroll-loops" no-opts "-g" } } } } */
-/* { dg-final { scan-assembler-times {vsetvli} 1 { target { no-opts "-O0"  no-opts "-funroll-loops" no-opts "-g" } } } } */
+/* { dg-fin

[PATCH] RISC-V: Add testcases for AVL=REG support

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/avl_single-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-20.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-21.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-22.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-23.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-24.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-25.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-26.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-27.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-28.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-29.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-30.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-31.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-32.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-33.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-34.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-35.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-36.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-37.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-38.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-39.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-4.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-40.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-41.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-42.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-43.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-44.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-45.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-46.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-47.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-48.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-49.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-5.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-50.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-51.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-52.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-53.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-54.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-55.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-56.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-57.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-58.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-59.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-6.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-60.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-61.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-62.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-63.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-64.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-65.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-66.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-67.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-68.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-69.c: New test.

---
 .../riscv/rvv/vsetvl/avl_single-2.c   | 18 ++
 .../riscv/rvv/vsetvl/avl_single-20.c  | 40 +
 .../riscv/rvv/vsetvl/avl_single-21.c  | 32 +++
 .../riscv/rvv/vsetvl/avl_single-22.c  | 42 ++
 .../riscv/rvv/vsetvl/avl_single-23.c  | 34 +++
 .../riscv/rvv/vsetvl/avl_single-24.c  | 36 
 .../riscv/rvv/vsetvl/avl_single-25.c  | 38 +
 .../riscv/rvv/vsetvl/avl_single-26.c  | 35 
 .../riscv/rvv/vsetvl/avl_single-27.c  | 36 
 .../riscv/rvv/vsetvl/avl_single-28.c  | 30 ++
 .../riscv/rvv/vsetvl/avl_single-29.c  | 31 ++
 .../riscv/rvv/vsetvl/avl_single-3.c   | 19 +++
 .../riscv/rvv/vsetvl/avl_single-30.c  | 29 ++
 .../riscv/rvv/vsetvl/avl_single-31.c  | 27 +
 .../riscv/rvv/vsetvl/avl_single-32.c  | 27 +
 .../riscv/rvv/vsetvl/avl_single-33.c  | 29 ++
 .../riscv/rvv/vsetvl/avl_single-34.c  | 28 +
 .../riscv/rvv/vsetvl/avl_single-35.c  | 27 +
 .../riscv/rvv/vsetvl/avl_single-36.c  | 25 
 .../riscv/rvv/vsetvl/avl_single-37.c  | 29 ++
 .../riscv/rvv/vsetvl/avl_single-38.c  | 57 +++
 .../riscv/rvv/vsetvl/avl_single-39.c   

[PATCH] RISC-V: Add the rest testcases of AVL=REG support

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/avl_single-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-10.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-11.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-12.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-13.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-14.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-15.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-16.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-17.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-18.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-19.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-7.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-70.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-71.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-72.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-8.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-9.c: New test.

---
 .../riscv/rvv/vsetvl/avl_single-1.c   | 17 ++
 .../riscv/rvv/vsetvl/avl_single-10.c  | 21 +++
 .../riscv/rvv/vsetvl/avl_single-11.c  | 21 +++
 .../riscv/rvv/vsetvl/avl_single-12.c  | 19 +++
 .../riscv/rvv/vsetvl/avl_single-13.c  | 28 ++
 .../riscv/rvv/vsetvl/avl_single-14.c  | 27 +
 .../riscv/rvv/vsetvl/avl_single-15.c  | 27 +
 .../riscv/rvv/vsetvl/avl_single-16.c  | 32 +++
 .../riscv/rvv/vsetvl/avl_single-17.c  | 29 ++
 .../riscv/rvv/vsetvl/avl_single-18.c  | 29 ++
 .../riscv/rvv/vsetvl/avl_single-19.c  | 40 +
 .../riscv/rvv/vsetvl/avl_single-7.c   | 17 ++
 .../riscv/rvv/vsetvl/avl_single-70.c  | 41 ++
 .../riscv/rvv/vsetvl/avl_single-71.c  | 54 ++
 .../riscv/rvv/vsetvl/avl_single-72.c  | 46 +++
 .../riscv/rvv/vsetvl/avl_single-8.c   | 18 ++
 .../riscv/rvv/vsetvl/avl_single-9.c   | 56 +++
 17 files changed, 522 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-13.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-14.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-15.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-17.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-18.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-19.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-70.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-71.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-72.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-9.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-1.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-1.c
new file mode 100644
index 000..84225dbe7d2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (void * restrict in, void * restrict out, int n, int vl)
+{
+  for (int i = 0; i < n; i++)
+{
+  vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+  __riscv_vse8_v_i8mf8 (out + i, v, vl);
+}
+}
+
+/* { dg-final { scan-assembler-times {\.L[0-9]+\:\s+vle8\.v\s+v[0-9]+,\s*0\s*\([a-x0-9]+\)} 1 { target { no-opts "-O0" no-opts "-Os" no-opts "-g" no-opts "-funroll-loops" } } } } */
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
+/* { dg-final { scan-assembler-times {vsetvli} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-10.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-10.c
new file mode 100644
index 000..f64d1c3680f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-10.c
@@ -0,0 +1,21
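
The dg-final directives in avl_single-1.c above use scan-assembler-times
to check how many times a regular expression matches the generated
assembly.  A rough standalone sketch of that counting is shown below.  It
assumes that std::regex in its default ECMAScript mode is close enough to
the Tcl regular expressions DejaGnu uses for these simple patterns, and
count_matches is a hypothetical helper, not part of the testsuite.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Count the non-overlapping matches of PATTERN in ASM_TEXT.  This is the
// number a scan-assembler-times directive compares against its expected
// count.
std::size_t
count_matches (const std::string &asm_text, const std::string &pattern)
{
  const std::regex re (pattern);
  auto first = std::sregex_iterator (asm_text.begin (), asm_text.end (), re);
  auto last = std::sregex_iterator ();
  return static_cast<std::size_t> (std::distance (first, last));
}
```

Note that the plain pattern {vsetvli} does not match a vsetivli
instruction, so the tests can count the register-AVL and immediate-AVL
forms separately.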

Re: [PATCH v3 17/19] modula2 front end: dejagnu expect library scripts

2023-01-09 Thread Jason Merrill via Gcc-patches

On 12/6/22 09:47, Gaius Mulley via Gcc-patches wrote:

Here are the dejagnu expect library scripts for the gm2
testsuite.


A couple of weeks ago I noticed on a testrun that the modula tests 
didn't seem to be timing out properly, so I made this change.  It looks 
like they didn't run at all in the bootstrap/test I did just now, so I 
don't know if this change is actually helpful, but here it is if you 
think it makes sense:


From 6c9007800b8793c68921ee3d24f3a5000b44a100 Mon Sep 17 00:00:00 2001
From: Jason Merrill 
Date: Wed, 21 Dec 2022 17:01:50 -0500
Subject: [PATCH] testsuite: use same timeout for gm2 as other front-ends
To: gcc-patches@gcc.gnu.org

I noticed Modula tests running forever in a regression test run, and then
that its .exp wasn't using timeout.exp like the other front-ends.

gcc/testsuite/ChangeLog:

	* lib/gm2.exp: Use timeout.exp.
---
 gcc/testsuite/lib/gm2.exp | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/lib/gm2.exp b/gcc/testsuite/lib/gm2.exp
index 9eba195291a..1fa62d8e6ea 100644
--- a/gcc/testsuite/lib/gm2.exp
+++ b/gcc/testsuite/lib/gm2.exp
@@ -22,7 +22,7 @@ load_lib libgloss.exp
 load_lib prune.exp
 load_lib gcc-defs.exp
 load_lib target-libpath.exp
-
+load_lib timeout.exp
 
 #
 # GCC_UNDER_TEST is the compiler under test.
@@ -183,9 +183,7 @@ proc gm2_target_compile_default { source dest type options } {
 if [info exists TOOL_OPTIONS] {
 	lappend options "additional_flags=$TOOL_OPTIONS"
 }
-if [target_info exists gcc,timeout] {
-	lappend options "timeout=[target_info gcc,timeout]"
-}
+lappend options "timeout=[timeout_value]"
 lappend options "compiler=$GCC_UNDER_TEST"
 # puts stderr "options = $options\n"
 # puts stderr "* target_compile: $source $dest $type $options\n"
-- 
2.31.1



[PATCH, Modula2] PR-108142 Many empty directories created in the build directory

2023-01-09 Thread Gaius Mulley via Gcc-patches


PR-108142 Modula-2 configure generates many subdirectories in the top
build directory.  This patch dynamically creates subdirectories under
gcc/m2 if and when required.

Bootstrapped on x86_64 gnu/linux, ok for master?

regards,
Gaius


gcc/m2/ChangeLog:

* Make-lang.in (GM2_1): Change -B path to m2/stage1.
($(objdir)/m2/images/gnu.eps): Check and create dest dir
if necessary.
(gm2-libs.texi-check): Check and create dir m2/gm2-libs-pim,
m2/gm2-libs-iso and m2/gm2-libs if necessary.
($(objdir)/m2/gm2-compiler-boot): Remove.
($(objdir)/m2/gm2-libs-boot): Remove.
($(objdir)/m2/gm2-libs-libiberty): Remove.
($(objdir)/m2/gm2-libiberty): Remove.
($(objdir)/m2/gm2-gcc): Remove.
($(objdir)/m2/gm2-compiler): Remove.
($(objdir)/m2/gm2-libs): Remove.
($(objdir)/m2/gm2-libs-iso): Remove.
($(objdir)/m2/gm2-libs-min): Remove.
($(objdir)/m2/gm2-compiler-paranoid): Remove.
($(objdir)/m2/gm2-libs-paranoid): Remove.
($(objdir)/m2/gm2-compiler-verify): Remove.
($(objdir)/m2/boot-bin): Remove.
($(objdir)/m2/gm2-libs-pim): Remove.
($(objdir)/m2/gm2-libs-coroutines): Remove.
(stage1/m2): Remove.
(stage2/m2): Remove.
(stage3/m2): Remove.
(m2.stageprofile): New rule.
(m2.stagefeedback): New rule.
(cc1gm2$(exeext)): Change dependent name.
(m2/stage2/cc1gm2$(exeext)): Change dependent name.
Check and create dest dir.
(m2/stage1/cc1gm2$(exeext)): Check and create dest dir
if necessary.
(m2/gm2-gcc/%.o): Ditto.
(m2/gm2-gcc/rtegraph.o): Ditto.
(m2/gm2-gcc/$(SRC_PREFIX)%.h): Ditto.
(m2/gm2-gcc/$(SRC_PREFIX)%.h): Ditto.
(m2/mc-boot): Ditto.
(m2/mc-boot-ch): Ditto.
(m2/gm2-libs-boot): Ditto.
(m2/gm2-compiler-boot): Ditto.
(m2/gm2-compiler): Ditto.
(m2/gm2-libiberty): Ditto.
(m2/gm2-compiler): Ditto.
(m2/gm2-libs-iso): Ditto.
(m2/gm2-libs): Ditto.
(m2/gm2-libs-min): Ditto.
(m2/gm2-libs-coroutines): Ditto.
(m2/boot-bin): Ditto.
(m2/pge-boot): Ditto.
(m2/pge-boot): Ditto.
* Make-maintainer.in (m2/gm2-ppg-boot): Check and create
dest dir if necessary.
(m2): Ditto.
(m2/gm2-ppg-boot): Ditto.
(m2/gm2-pg-boot): Ditto.
(m2/gm2-auto): Ditto.
(m2/gm2-pg-boot): Ditto.
(m2/gm2-pge-boot): Ditto.
($(objdir)/plugin): Ditto.
($(objdir)/m2/mc-boot-ch): Ditto.
($(objdir)/m2/mc-boot-gen): Ditto.
(m2/boot-bin): Ditto.
(m2/mc): Ditto.
(m2/mc-obj): Ditto.
($(objdir)/m2/gm2-ppg-boot): Ditto.
($(objdir)/m2/gm2-pg-boot): Ditto.
($(objdir)/m2/gm2-pge-boot): Ditto.
(m2/mc-boot-gen): Ditto.
(m2/m2obj3): Ditto.
(m2/gm2-libs-paranoid): Ditto.
(m2/gm2-compiler-paranoid): Ditto.
(m2/gm2-libs-paranoid): Ditto.
(m2/gm2-compiler-paranoid): Ditto.
(m2/gm2-libs-paranoid): Ditto.
(m2/gm2-compiler-paranoid): Ditto.
* config-lang.in (m2/gm2-compiler-boot): Remove mkdir.
(m2/gm2-libs-boot): Ditto.
(m2/gm2-ici-boot): Ditto.
(m2/gm2-libiberty): Ditto.
(m2/gm2-gcc): Ditto.
(m2/gm2-compiler): Ditto.
(m2/gm2-libs): Ditto.
(m2/gm2-libs-iso): Ditto.
(m2/gm2-compiler-paranoid): Ditto.
(m2/gm2-libs-paranoid): Ditto.
(m2/gm2-compiler-verify): Ditto.
(m2/boot-bin): Ditto.
(m2/gm2-libs-pim): Ditto.
(m2/gm2-libs-coroutines): Ditto.
(m2/gm2-libs-min): Ditto.
(m2/pge-boot): Ditto.
(plugin): Ditto.
(stage1/m2): Ditto.
(stage2/m2): Ditto.
(stage3/m2): Ditto.
(stage4/m2): Ditto.
(m2/gm2-auto): Ditto.
(m2/gm2-pg-boot): Ditto.
(m2/gm2-pge-boot): Ditto.
(m2/gm2-ppg-boot): Ditto.
(m2/mc-boot): Ditto.
(m2/mc-boot-ch): Ditto.
(m2/mc-boot-gen): Ditto.

-- o< -- o< -- o< -- o< -- o< -- o< -- o<
diff --git a/gcc/m2/Make-lang.in b/gcc/m2/Make-lang.in
index 08d0f3b963f..a3751109481 100644
--- a/gcc/m2/Make-lang.in
+++ b/gcc/m2/Make-lang.in
@@ -27,7 +27,7 @@ GM2_CROSS_NAME = `echo gm2|sed '$(program_transform_cross_name)'`

 M2_MAINTAINER = no

-GM2_1 = ./gm2 -B./stage1/m2 -g -fm2-g
+GM2_1 = ./gm2 -B./m2/stage1 -g -fm2-g

 GM2_FOR_TARGET = $(STAGE_CC_WRAPPER) ./gm2 -B./ -B$(build_tooldir)/bin/ -L$(objdir)/../ld $(TFLAGS)

@@ -71,7 +71,6 @@ m2.srcextra: m2/SYSTEM-pim.texi m2/SYSTEM-iso.texi 
m2/gm2-libs.texi m2/gm2-ebnf.
-cp -p m2/SYSTEM-iso.texi $(srcdir)/m2
-cp -p m2/gm2-libs.texi $(srcdir)/m2
-cp -p m2/gm2-ebnf.texi $(srcdir)/m2
-   find . -name '*.texi' -print
 else
 m2.srcextra:
 endif
@@ -167,7 +166,7 @@ doc/m2.info: $(TEXISRC)
else true; fi

 

Re: Missing dependencies in m2/ ?

2023-01-09 Thread Jeff Law via Gcc-patches




On 1/8/23 21:18, Gaius Mulley wrote:

Jeff Law via Gcc-patches  writes:


I've been getting sporadic errors like this since the introduction of
the modula-2 front-end:


In file included from ../../..//gcc/gcc/m2/mc-boot/GSFIO.c:29:
../../..//gcc/gcc/system.h:556:20: error: conflicting declaration of C function 'const char* strsignal(int)'
   556 | extern const char *strsignal (int);
   |^
In file included from /usr/include/c++/12/cstring:42,
  from ../../..//gcc/gcc/system.h:241:
/usr/include/string.h:478:14: note: previous declaration 'char* strsignal(int)'
   478 | extern char *strsignal (int __sig) __THROW;
   |  ^
In file included from ../../..//gcc/gcc/system.h:707:
../../..//gcc/gcc/../include/libiberty.h:112:14: error: ambiguating new declaration of 'char* basename(const char*)'
   112 | extern char *basename (const char *) ATTRIBUTE_RETURNS_NONNULL ATTRIBUTE_NONNULL(1);
   |  ^~~~
/usr/include/string.h:524:26: note: old declaration 'const char* basename(const char*)'
   524 | extern "C++" const char *basename (const char *__filename)
   |  ^~~~
make[1]: *** [../../..//gcc/gcc/m2/Make-lang.in:1364: m2/mc-boot/GSFIO.o] Error 1



They seem to come and go without rhyme or reason.  For example build
#1885 on lm32-elf failed, while #1884 passed.

Aside from the fact that I configure with --enable-languages=c,c++
and yet modula-2 stuff still gets built (can that be fixed?), it seems
like we're missing dependencies to ensure that the generated config.h
file is made before building the modula-2 stuff.

In a good build you'll see something like this:

config.status: creating auto-host.h
[ ... ]
Build GSFIO.o:
g++ -g -c -I. -I../../..//gcc/gcc/m2/mc-boot-ch
-I../../..//gcc/gcc/m2/mc-boot -I../../..//gcc/gcc/../include
-I../../..//gcc/gcc -I. -Im2/mc-boot -I../../..//gcc/gcc
  -I../../..//gcc/gcc/m2/mc-boot -I../../..//gcc/gcc/../include
-I../../..//gcc/gcc/../libcpp/include -I../../..//gcc/gcc/../libcody
  -I../../..//gcc/gcc/../libdecnumber
-I../../..//gcc/gcc/../libdecnumber/dpd -I../libdecnumber
  -I../../..//gcc/gcc/../libbacktrace
../../..//gcc/gcc/m2/mc-boot/GSFIO.c -o m2/mc-boot/GSFIO.o

Which naturally works just fine.

In a bad build, auto-host.h is _not_ created before trying to build GSFIO.o.

Can you please take care of this.  It's rather annoying to have builds
failing in the continuous testing system like this, particularly when
modula-2 isn't even enabled.

Jeff


Hi Jeff,

many apologies for the breakage - I've now added the Makefile
dependencies.  I've also regenerated the m2 configure scripts

I'm still seeing it as of about 2 hours ago:

http://law-sandy.freeddns.org:8080/job/avr-elf/2125/console

A good run (yesterday):

http://law-sandy.freeddns.org:8080/job/avr-elf/2124/console


However, I did find that my scripts were enabling all languages -- sorry 
I stated otherwise and blamed it on the M2 front-end.  The only issue we 
need to resolve is the dependency problem.


jeff


Re: [RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2023-01-09 Thread Michael Meissner via Gcc-patches
On Fri, Jan 06, 2023 at 07:41:07PM -0500, Michael Meissner wrote:
> On Wed, Dec 21, 2022 at 09:40:24PM +, Joseph Myers wrote:
> > On Wed, 21 Dec 2022, Segher Boessenkool wrote:
> > 
> > > > --- a/gcc/tree.cc
> > > > +++ b/gcc/tree.cc
> > > > @@ -9442,15 +9442,6 @@ build_common_tree_nodes (bool signed_char)
> > > >if (!targetm.floatn_mode (n, extended).exists (&mode))
> > > > continue;
> > > >int precision = GET_MODE_PRECISION (mode);
> > > > -  /* Work around the rs6000 KFmode having precision 113 not
> > > > -128.  */
> > > 
> > > It has precision 126 now fwiw.
> > > 
> > > Joseph: what do you think about this patch?  Is the workaround it
> > > removes still useful in any way, do we need to do that some other way if
> > > we remove this?
> > 
> > I think it's best for the TYPE_PRECISION, for any type with the binary128 
> > format, to be 128 (not 126).
> > 
> > It's necessary that _Float128, _Float64x and long double all have the same 
> > TYPE_PRECISION when they have the same (binary128) format, or at least 
> > that TYPE_PRECISION for _Float128 >= that for long double >= that for 
> > _Float64x, so that the rules in c_common_type apply properly.
> > 
> > How the TYPE_PRECISION compares to that of __ibm128, or of long double 
> > when that's double-double, is less important.
> 
> I spent a few days on working on this.  I have patches to make the 3 128-bit
> types to all have TYPE_PRECISION of 128.  To do this, I added a new mode macro
> (FRACTIONAL_FLOAT_MODE_NO_WIDEN) that takes the same arguments as
> FRACTIONAL_FLOAT_MODE.

...

I had the patches to change the precision to 128, and I just ran them.  C and
C++ do not seem to be bothered by changing the precision to 128 (once I got it
to build, etc.).  But Fortran on the other hand does actually use the precision
to differentiate between IBM extended double and IEEE 128-bit.  In particular,
the following 3 tests fail when long double is IBM extended double:

gfortran.dg/PR100914.f90
gfortran.dg/c-interop/typecodes-array-float128.f90
gfortran.dg/c-interop/typecodes-scalar-float128.f90

I tried adding code to use the old precisions for Fortran, but not for C/C++,
but it didn't seem to work.

So while it might be possible to use a single 128 for the precision, it needs
more work and attention, particularly on the Fortran side.

I'm not sure it is worth it to try and change things.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


[PATCH] xtensa: Make instruction cost estimation for size more accurate

2023-01-09 Thread Takayuki 'January June' Suwa via Gcc-patches
Until now, we applied COSTS_N_INSNS() (multiplying by 4) after dividing
the instruction length by 3, so the insn cost for size could not express
length differences smaller than 3 bytes (e.g. 11 bytes and 12 bytes cost
the same).

This patch fixes that.

;; 2 bytes
addi.n  a2, a2, -1      ; cost 3

;; 3 bytes
addmi   a2, a2, 1024    ; cost 4

;; 4 bytes
movi.n  a3, 80          ; cost 5
bnez.n  a2, a3, .L4

;; 5 bytes
srli    a2, a3, 1       ; cost 7
add.n   a2, a2, a2

;; 6 bytes
ssai    8               ; cost 8
src     a4, a2, a3

;; 3 + 4 bytes
l32r    a2, .L5         ; cost 9

;; 11 bytes             ; cost 15
;; 12 bytes             ; cost 16

gcc/ChangeLog:

* config/xtensa/xtensa.cc (xtensa_insn_cost):
Let insn cost for size be obtained by applying COSTS_N_INSNS()
to instruction length and then dividing by 3.
---
 gcc/config/xtensa/xtensa.cc | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index a1f184950ae..6cf6b35399a 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -4519,13 +4519,15 @@ xtensa_insn_cost (rtx_insn *insn, bool speed)
 {
   if (!(recog_memoized (insn) < 0))
 {
-  int len = get_attr_length (insn), n = (len + 2) / 3;
+  int len = get_attr_length (insn);
 
   if (len == 0)
return COSTS_N_INSNS (0);
 
   if (speed)  /* For speed cost.  */
{
+ int n = (len + 2) / 3;
+
  /* "L32R" may be particular slow (implementation-dependent).  */
  if (xtensa_is_insn_L32R_p (insn))
return COSTS_N_INSNS (1 + xtensa_extra_l32r_costs);
@@ -4572,10 +4574,11 @@ xtensa_insn_cost (rtx_insn *insn, bool speed)
{
  /* "L32R" itself plus constant in litpool.  */
  if (xtensa_is_insn_L32R_p (insn))
-   return COSTS_N_INSNS (2) + 1;
+   len = 3 + 4;
 
- /* Consider ".n" short instructions.  */
- return COSTS_N_INSNS (n) - (n * 3 - len);
+ /* Consider fractional instruction length (for example, ".n"
+short instructions or "L32R" litpool constants).  */
+ return (COSTS_N_INSNS (len) + 1) / 3;
}
}
 }
-- 
2.30.2
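
The size-cost arithmetic in the patch above can be checked outside of GCC
with a small sketch.  The function names below are hypothetical and exist
only for this illustration; the one thing taken from GCC is that
COSTS_N_INSNS (N) multiplies N by 4, as the commit message says.  The
"divide first" variant only illustrates rounding the length up to whole
3-byte units before scaling; it leaves out the correction term that the
removed code also applied.

```cpp
#include <cassert>

// COSTS_N_INSNS (N) multiplies by 4; mirrored here as a plain function.
constexpr int costs_n_insns(int n) { return n * 4; }

// Size cost as in the patched return statement: scale by 4 first, then
// divide by 3.  Adding 1 makes the integer division round to nearest.
constexpr int size_cost(int len) { return (costs_n_insns(len) + 1) / 3; }

// Illustration only (hypothetical helper): round the length up to whole
// 3-byte units before scaling, i.e. the n = (len + 2) / 3 step alone.
constexpr int size_cost_divide_first(int len) {
  return costs_n_insns((len + 2) / 3);
}
```

With these definitions, size_cost gives 3, 4, 5, 7, 8 and 9 for lengths
2, 3, 4, 5, 6 and 3 + 4 bytes, and 15 versus 16 for 11 and 12 bytes, which
matches the costs listed in the commit message.  size_cost_divide_first
returns 16 for both 11 and 12 bytes.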


Re: Missing dependencies in m2/ ?

2023-01-09 Thread Gaius Mulley via Gcc-patches
Jeff Law  writes:

> I'm still seeing it as of about 2 hours ago:
>
> http://law-sandy.freeddns.org:8080/job/avr-elf/2125/console
>
> A good run (yesterday):
>
> http://law-sandy.freeddns.org:8080/job/avr-elf/2124/console
>
>
Hi Jeff,

many thanks for the URLs above - they are useful.  I'll attempt to
reproduce the gcc build.

> However, I did find that my scripts were enabling all languages --
> sorry I stated otherwise and blamed it on the M2 front-end.

No problem at all - it allowed me to find I was using the wrong
version of autoconf :-).

> The only issue we need to resolve is the dependency problems.

Yes indeed, I think I've found some missing dependencies which I'll push
to git when the bootstrap completes.  In the meantime here is the patch:

regards,
Gaius

--- o< --- o< --- o< --- o< --- o< --- o< --- o<
diff --git a/gcc/m2/Make-lang.in b/gcc/m2/Make-lang.in
index 08d0f3b963f..5c173f22540 100644
--- a/gcc/m2/Make-lang.in
+++ b/gcc/m2/Make-lang.in
@@ -1360,7 +1360,7 @@ m2/boot-bin/mc$(exeext): $(BUILD-MC-BOOT-O) $(BUILD-MC-INTERFACE-O) \
  $(BUILD-MC-INTERFACE-O) m2/mc-boot/main.o \
  mcflex.o m2/gm2-libs-boot/RTcodummy.o -lm
 
-m2/mc-boot/$(SRC_PREFIX)%.o: m2/mc-boot/$(SRC_PREFIX)%.c
+m2/mc-boot/$(SRC_PREFIX)%.o: m2/mc-boot/$(SRC_PREFIX)%.c m2/gm2-libs/gm2-libs-host.h
$(CXX) -g -c -I. -I$(srcdir)/m2/mc-boot-ch -I$(srcdir)/m2/mc-boot -I$(srcdir)/../include -I$(srcdir) $(INCLUDES) $< -o $@
 
 m2/mc-boot-ch/$(SRC_PREFIX)%.o: m2/mc-boot-ch/$(SRC_PREFIX)%.c m2/gm2-libs/gm2-libs-host.h
@@ -1373,7 +1373,7 @@ m2/mc-boot/main.o: $(M2LINK) $(srcdir)/m2/init/mcinit
unset CC ; $(M2LINK) -s --langc++ --exit --name m2/mc-boot/main.c $(srcdir)/m2/init/mcinit
$(CXX) -g -c -I. -I$(srcdir)/../include -I$(srcdir) $(INCLUDES) m2/mc-boot/main.c -o $@
 
-mcflex.o: mcflex.c
+mcflex.o: mcflex.c m2/gm2-libs/gm2-libs-host.h
$(CC) -I$(srcdir)/m2/mc -g -c $< -o $@   # remember that mcReserved.h is copied into m2/mc
 
 mcflex.c: $(srcdir)/m2/mc/mc.flex


More znver4 x86-tune flags

2023-01-09 Thread Jan Hubicka via Gcc-patches


Hi,
this patch adds more tunes for zen4:
 - new tunes for avx512 scatter instructions.
   In micro benchmarks these seem to be a consistent loss compared to
   open-coded code
 - disable use of gather for zen4
   While these are a win for micro benchmarks (based on TSVC), enabling gather
   is a loss for parest. So for now it seems safe to keep it off.
 - disable pass to avoid FMA chains for znver4 since fmadd was optimized and
   does not seem to cause regressions.
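
As a rough illustration of how the three scatter tunes are consulted, the
selection by element count can be modelled outside of GCC.  This is only a
sketch: ScatterTune, scatter_allowed and znver4_like are hypothetical names
made up for this example, while the patch itself tests
TARGET_USE_SCATTER_2PARTS, TARGET_USE_SCATTER_4PARTS and TARGET_USE_SCATTER
inside ix86_vectorize_builtin_scatter.

```cpp
#include <cassert>

// Hypothetical stand-in for the three tuning flags added by the patch.
struct ScatterTune {
  bool use_scatter_2parts;  // vectors with 2 elements
  bool use_scatter_4parts;  // vectors with 4 elements
  bool use_scatter;         // every other element count (8 or more)
};

// Mirrors the nested conditional in the patch: pick the flag that matches
// the number of vector elements and report whether scatter may be used.
constexpr bool scatter_allowed(unsigned nelts, const ScatterTune &tune) {
  if (nelts == 2)
    return tune.use_scatter_2parts;
  if (nelts == 4)
    return tune.use_scatter_4parts;
  return tune.use_scatter;
}

// Assumption for the example: a target whose three tunes are all off, as
// the ~(m_ZNVER4 | m_GENERIC) masks in the patch imply for znver4.
constexpr ScatterTune znver4_like{false, false, false};
```

With all three flags off, scatter_allowed returns false for every element
count, which corresponds to the hook returning NULL_TREE so that the
vectorizer falls back to open-coded stores.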

Bootstrapped/regtested x86_64.
Honza

* i386.cc (ix86_vectorize_builtin_scatter): Guard scatter by 
TARGET_USE_SCATTER.
* i386.h (TARGET_USE_SCATTER_2PARTS, TARGET_USE_SCATTER_4PARTS, 
TARGET_USE_SCATTER): New macros.
* x86-tune.def (TARGET_USE_SCATTER_2PARTS, TARGET_USE_SCATTER_4PARTS, 
TARGET_USE_SCATTER): New tunes.
(X86_TUNE_AVOID_256FMA_CHAINS, X86_TUNE_AVOID_512FMA_CHAINS): Disable 
for znver4.
(X86_TUNE_USE_GATHER): Disable for zen4.
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index de978d19063..9fb69f6c174 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -19051,6 +19051,13 @@ ix86_vectorize_builtin_scatter (const_tree vectype,
   if (!TARGET_AVX512F)
 return NULL_TREE;
 
+  if (known_eq (TYPE_VECTOR_SUBPARTS (vectype), 2u)
+  ? !TARGET_USE_SCATTER_2PARTS
+  : (known_eq (TYPE_VECTOR_SUBPARTS (vectype), 4u)
+? !TARGET_USE_SCATTER_4PARTS
+: !TARGET_USE_SCATTER))
+return NULL_TREE;
+
   if ((TREE_CODE (index_type) != INTEGER_TYPE
&& !POINTER_TYPE_P (index_type))
   || (TYPE_MODE (index_type) != SImode
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index e6a603ed31a..cd7ed19e29c 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -397,10 +397,16 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
ix86_tune_features[X86_TUNE_AVOID_4BYTE_PREFIXES]
 #define TARGET_USE_GATHER_2PARTS \
ix86_tune_features[X86_TUNE_USE_GATHER_2PARTS]
+#define TARGET_USE_SCATTER_2PARTS \
+   ix86_tune_features[X86_TUNE_USE_SCATTER_2PARTS]
 #define TARGET_USE_GATHER_4PARTS \
ix86_tune_features[X86_TUNE_USE_GATHER_4PARTS]
+#define TARGET_USE_SCATTER_4PARTS \
+   ix86_tune_features[X86_TUNE_USE_SCATTER_4PARTS]
 #define TARGET_USE_GATHER \
ix86_tune_features[X86_TUNE_USE_GATHER]
+#define TARGET_USE_SCATTER \
+   ix86_tune_features[X86_TUNE_USE_SCATTER]
 #define TARGET_FUSE_CMP_AND_BRANCH_32 \
ix86_tune_features[X86_TUNE_FUSE_CMP_AND_BRANCH_32]
 #define TARGET_FUSE_CMP_AND_BRANCH_64 \
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index fae3b650434..7e9c7244fc0 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -483,28 +483,43 @@ DEF_TUNE (X86_TUNE_AVOID_4BYTE_PREFIXES, "avoid_4byte_prefixes",
 DEF_TUNE (X86_TUNE_USE_GATHER_2PARTS, "use_gather_2parts",
  ~(m_ZNVER1 | m_ZNVER2 | m_ZNVER3 | m_ZNVER4 | m_ALDERLAKE | m_CORE_ATOM | m_GENERIC))
 
+/* X86_TUNE_USE_SCATTER_2PARTS: Use scatter instructions for vectors with 2
+   elements.  */
+DEF_TUNE (X86_TUNE_USE_SCATTER_2PARTS, "use_scatter_2parts",
+ ~(m_ZNVER4 | m_GENERIC))
+
 /* X86_TUNE_USE_GATHER_4PARTS: Use gather instructions for vectors with 4
elements.  */
 DEF_TUNE (X86_TUNE_USE_GATHER_4PARTS, "use_gather_4parts",
  ~(m_ZNVER1 | m_ZNVER2 | m_ZNVER3 | m_ZNVER4 | m_ALDERLAKE | m_CORE_ATOM | m_GENERIC))
 
+/* X86_TUNE_USE_SCATTER_4PARTS: Use scatter instructions for vectors with 4
+   elements.  */
+DEF_TUNE (X86_TUNE_USE_SCATTER_4PARTS, "use_scatter_4parts",
+ ~(m_ZNVER4 | m_GENERIC))
+
 /* X86_TUNE_USE_GATHER: Use gather instructions for vectors with 8 or more
elements.  */
 DEF_TUNE (X86_TUNE_USE_GATHER, "use_gather",
- ~(m_ZNVER1 | m_ZNVER2 | m_ALDERLAKE | m_CORE_ATOM | m_GENERIC))
+ ~(m_ZNVER1 | m_ZNVER2 | m_ZNVER4 | m_ALDERLAKE | m_CORE_ATOM | m_GENERIC))
+
+/* X86_TUNE_USE_SCATTER: Use scatter instructions for vectors with 8 or more
+   elements.  */
+DEF_TUNE (X86_TUNE_USE_SCATTER, "use_scatter",
+ ~(m_ZNVER4 | m_GENERIC))
 
 /* X86_TUNE_AVOID_128FMA_CHAINS: Avoid creating loops with tight 128bit or
smaller FMA chain.  */
-DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER)
+DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER1 | m_ZNVER2 | m_ZNVER3)
 
 /* X86_TUNE_AVOID_256FMA_CHAINS: Avoid creating loops with tight 256bit or
smaller FMA chain.  */
-DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains", m_ZNVER2 | m_ZNVER3 | m_ZNVER4
+DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains", m_ZNVER2 | m_ZNVER3
  | m_ALDERLAKE | m_SAPPHIRERAPIDS | m_CORE_ATOM)
 
 /* X86_TUNE_AVOID_512FMA_CHAINS: Avoid creating loops with tight 512bit or
smaller FMA chain.  */
-DEF_TUNE (X86_TUNE_AVOID_512FMA_CHAINS, "avoid_fma512_chains", m_ZNVER4)
+DEF_TUNE (X86_TUNE_AVOID_512FMA_CHAINS, "avoid_fma512_chains", m_NONE)
 
 /* X86_TUNE_V2DF_RE

Re: More znver4 x86-tune flags

2023-01-09 Thread Hongtao Liu via Gcc-patches
On Tue, Jan 10, 2023 at 12:32 PM Jan Hubicka via Gcc-patches
 wrote:
>
>
> Hi,
> this patch adds more tunes for zen4:
>  - new tunes for avx512 scatter instructions.
>    In micro benchmarks these seem to be a consistent loss compared to
>    open-coded code
>  - disable use of gather for zen4
>    While these are a win for micro benchmarks (based on TSVC), enabling gather
>    is a loss for parest. So for now it seems safe to keep it off.
>  - disable pass to avoid FMA chains for znver4 since fmadd was optimized and
>    does not seem to cause regressions.
>
> Bootstrapped/regtested x86_64.
> Honza
>
> * i386.cc (ix86_vectorize_builtin_scatter): Guard scatter by 
> TARGET_USE_SCATTER.
> * i386.h (TARGET_USE_SCATTER_2PARTS, TARGET_USE_SCATTER_4PARTS, 
> TARGET_USE_SCATTER): New macros.
> * x86-tune.def (TARGET_USE_SCATTER_2PARTS, TARGET_USE_SCATTER_4PARTS, 
> TARGET_USE_SCATTER): New tunes.
> (X86_TUNE_AVOID_256FMA_CHAINS, X86_TUNE_AVOID_512FMA_CHAINS): Disable 
> for znver4.
> (X86_TUNE_USE_GATHER): Disable for zen4.
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index de978d19063..9fb69f6c174 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -19051,6 +19051,13 @@ ix86_vectorize_builtin_scatter (const_tree vectype,
>if (!TARGET_AVX512F)
>  return NULL_TREE;
>
> +  if (known_eq (TYPE_VECTOR_SUBPARTS (vectype), 2u)
> +  ? !TARGET_USE_SCATTER_2PARTS
> +  : (known_eq (TYPE_VECTOR_SUBPARTS (vectype), 4u)
> +? !TARGET_USE_SCATTER_4PARTS
> +: !TARGET_USE_SCATTER))
> +return NULL_TREE;
> +
>if ((TREE_CODE (index_type) != INTEGER_TYPE
> && !POINTER_TYPE_P (index_type))
>|| (TYPE_MODE (index_type) != SImode
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index e6a603ed31a..cd7ed19e29c 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -397,10 +397,16 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
> ix86_tune_features[X86_TUNE_AVOID_4BYTE_PREFIXES]
>  #define TARGET_USE_GATHER_2PARTS \
> ix86_tune_features[X86_TUNE_USE_GATHER_2PARTS]
> +#define TARGET_USE_SCATTER_2PARTS \
> +   ix86_tune_features[X86_TUNE_USE_SCATTER_2PARTS]
>  #define TARGET_USE_GATHER_4PARTS \
> ix86_tune_features[X86_TUNE_USE_GATHER_4PARTS]
> +#define TARGET_USE_SCATTER_4PARTS \
> +   ix86_tune_features[X86_TUNE_USE_SCATTER_4PARTS]
>  #define TARGET_USE_GATHER \
> ix86_tune_features[X86_TUNE_USE_GATHER]
> +#define TARGET_USE_SCATTER \
> +   ix86_tune_features[X86_TUNE_USE_SCATTER]
>  #define TARGET_FUSE_CMP_AND_BRANCH_32 \
> ix86_tune_features[X86_TUNE_FUSE_CMP_AND_BRANCH_32]
>  #define TARGET_FUSE_CMP_AND_BRANCH_64 \
> diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
> index fae3b650434..7e9c7244fc0 100644
> --- a/gcc/config/i386/x86-tune.def
> +++ b/gcc/config/i386/x86-tune.def
> @@ -483,28 +483,43 @@ DEF_TUNE (X86_TUNE_AVOID_4BYTE_PREFIXES, 
> "avoid_4byte_prefixes",
>  DEF_TUNE (X86_TUNE_USE_GATHER_2PARTS, "use_gather_2parts",
>   ~(m_ZNVER1 | m_ZNVER2 | m_ZNVER3 | m_ZNVER4 | m_ALDERLAKE | 
> m_CORE_ATOM | m_GENERIC))
>
> +/* X86_TUNE_USE_SCATTER_2PARTS: Use scater instructions for vectors with 2
> +   elements.  */
> +DEF_TUNE (X86_TUNE_USE_SCATTER_2PARTS, "use_scatter_2parts",
> + ~(m_ZNVER4 | m_GENERIC))
> +
>  /* X86_TUNE_USE_GATHER_4PARTS: Use gather instructions for vectors with 4
> elements.  */
>  DEF_TUNE (X86_TUNE_USE_GATHER_4PARTS, "use_gather_4parts",
>   ~(m_ZNVER1 | m_ZNVER2 | m_ZNVER3 | m_ZNVER4 | m_ALDERLAKE | 
> m_CORE_ATOM | m_GENERIC))
>
> +/* X86_TUNE_USE_SCATTER_4PARTS: Use scater instructions for vectors with 4
> +   elements.  */
> +DEF_TUNE (X86_TUNE_USE_SCATTER_4PARTS, "use_scatter_4parts",
> + ~(m_ZNVER4 | m_GENERIC))
> +
>  /* X86_TUNE_USE_GATHER: Use gather instructions for vectors with 8 or more
> elements.  */
>  DEF_TUNE (X86_TUNE_USE_GATHER, "use_gather",
> - ~(m_ZNVER1 | m_ZNVER2 | m_ALDERLAKE | m_CORE_ATOM | m_GENERIC))
> + ~(m_ZNVER1 | m_ZNVER2 | m_ZNVER4 | m_ALDERLAKE | m_CORE_ATOM | 
> m_GENERIC))
> +
> +/* X86_TUNE_USE_SCATTER: Use scater instructions for vectors with 8 or more
> +   elements.  */
> +DEF_TUNE (X86_TUNE_USE_SCATTER, "use_scatter",
> + ~(m_ZNVER4 | m_GENERIC))
>
>  /* X86_TUNE_AVOID_128FMA_CHAINS: Avoid creating loops with tight 128bit or
> smaller FMA chain.  */
> -DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER)
> +DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER1 | 
> m_ZNVER2 | m_ZNVER3)
According to the comments, it's *256bit or smaller*, so shouldn't
avoid_fma_chains be implied by avoid_fma256_chains?
>
>  /* X86_TUNE_AVOID_256FMA_CHAINS: Avoid creating loops with tight 256bit or
> smaller FMA chain.  */
> -DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains", m_ZNVER2 | 
> m_ZNVER3 | m_ZNVER4
> +DEF_TUNE (X86_TUNE_A

Re: [PATCH][pushed] contrib: add 'contrib' to default dirs in update-copyright.py

2023-01-09 Thread Martin Liška
> However, I noticed when I run ./contrib/update-copyright.py --this-year
> I get much more modifications out of contrib folder:

@Jakub?

Martin