Re: [PATCH] powerpc: Fix ICE with fp conditional move (PR target/93073)

2020-01-21 Thread Segher Boessenkool
Hi!

On Mon, Jan 20, 2020 at 08:31:43PM -0500, Nicholas Krause wrote:
> On 1/20/20 6:51 PM, Segher Boessenkool wrote:
> >We can (and should) use other instructions than just fsel here as well
> >(say, xscmpgedp followed by xxsel).  This can also work for QP float,
> >which also can be TFmode, just to complicate matters more.
> >
> >This function is a bit big, and the logic is all over the place, but
> >let's try to make it better, little by little :-)
>
> If your goal is make it smaller

That of course is not the goal in and of itself.  The function is a
little big and hard to follow, it is just run-on logic.  But splitting
random pieces to some new functions does not help: it just dillutes the
complexity, it does not reduce it.

It can of course easily have the "fsel" handling code split off, or all
the scalar float handling, whichever works out best.  Or maybe both.

A good test for "is this a good factor" is: can you think of a short
name for it, that from just looking at a call to it you understand
what it does, and also understand what its arguments are?  *Without*
looking at the declaration of the function.

> I just looked at it but wasn't sure due 
> to this line:
> enum rtx_code code = GET_CODE (op);
> in rs6000_emit_cmove and how to move it out but without duplicating it.

Don't worry about lines of code.  We have 60k lines or whatever, just in
the C files in config/rs6000.  We need code that is easier to change,
code that is better maintainable, code that is easier to read.

Different levels of abstraction should be in separate functions.


Segher


Re: [PATCH Coroutines] Fix issue with unused corutine function parameters

2020-01-21 Thread Iain Sandoe

JunMa  wrote:


在 2020/1/21 上午9:34, JunMa 写道:

在 2020/1/21 上午12:39, Iain Sandoe 写道:

JunMa  wrote:


在 2020/1/20 下午8:21, Iain Sandoe 写道:

JunMa  wrote:


在 2020/1/20 下午7:55, Iain Sandoe 写道:

Hi JunMa,

JunMa  wrote:


在 2020/1/20 下午6:07, Iain Sandoe 写道:

Hi JunMa,

JunMa  wrote:


Hi
Accroding to N4835: When a coroutine is invoked, a copy is created
for each coroutine parameter. While in current implementation, we
only collect used parameters. This may lost behavior inside  
constructor

and destructor of unused parameters.

This patch change the implementation to collect all parameter.

thanks for the patch.

I am not convinced this is the right way to fix this, we do not  
want to increase
the size of the coroutine frame unnecessarily.  I would like to  
check the lifetime
requirements; such copies might only need to be made local to the  
ramp function.


Iain
For type with trivial destructor, There is no need to copy  
parameter if it is
never referenced after a reachable suspend  
point(return-to-caller/resume) in the
coroutine. Since we are in front end, that should be  
inital_suspend point. so we
can create field for type with non-trivial destructor first, and  
skip unused parameters

in register_param_uses. I'll update the patch
I think that, even if the DTOR is non-trivial, the copy only needs  
to be in the stack

frame of the ramp, not in the coroutine frame.

I do think this.  It's just need more work on front end.
I think we already did the work, and know the answer (about param  
uses in the body), right?

Yes, we may need extra work on copy parameters, I'll do this.
Having discussed this with Nathan (and I’ve also mailed Gor for  
clarification).  I think it might be
a good idea to wait for those responses before revising (it could be  
that your original reading of

the wording is correct, and the clang impl. needs to be updated).

ok, thanks.
The reason why i consider about non-trivial destructors is that if the  
destructors are called in ramp
function, actor function may has different status on something. the  
testcase I attachted is such

example: it changes global variable in destructor.


Yes we all agree on this :-)

The issue is:
  1. should the copy exist for the duration of the ramp only? (i.e. copied to 
the ramp() stack frame)
  2. should the copy exist for the duration of the resume() function? (i.e. 
copied to the coro frame)

At the present, clang appears to do 1. when optimisation is on and 2.  
without optimisation.
Nathan pointed out that the lifetime of an object should not depend on the  
optimisation level.


So - I think we need some clarification on the intent (from the designer)  
and maybe some revision

of the standard wording to make the lifetime clear.

thanks
Iain




Re: [PR 80005] __has_include parsing

2020-01-21 Thread Jakub Jelinek
On Mon, Jan 20, 2020 at 06:01:27PM -0800, Jim Wilson wrote:
> On Mon, Jan 20, 2020 at 5:44 AM Nathan Sidwell  wrote:
> > I've pushed this to master, to address 80005
> >
> > __has_include is funky in that it is macro-like from the POV of #ifdef
> > ...
> 
> With this patch, __has_include__ no longer works.  There is a use of
> this in the RISC-V glibc port.  I see the docs only mention
> __has_include, so maybe this has always been a silent bug in the
> RISC-V glibc port?  If glibc is broken I can submit a patch.  This
> might cause trouble for other people too though.

I only see one spot where it has been added and then
2019-06-06  Florian Weimer  

* sysdeps/unix/sysv/linux/riscv/flush-icache.c: Do not use
internal GCC preprocessor identifier __has_include__.
which corrected it to use __has_include.

Jakub



Re: [PATCH Coroutines] Fix issue with unused corutine function parameters

2020-01-21 Thread JunMa

在 2020/1/21 下午4:07, Iain Sandoe 写道:

JunMa  wrote:


在 2020/1/21 上午9:34, JunMa 写道:

在 2020/1/21 上午12:39, Iain Sandoe 写道:

JunMa  wrote:


在 2020/1/20 下午8:21, Iain Sandoe 写道:

JunMa  wrote:


在 2020/1/20 下午7:55, Iain Sandoe 写道:

Hi JunMa,

JunMa  wrote:


在 2020/1/20 下午6:07, Iain Sandoe 写道:

Hi JunMa,

JunMa  wrote:


Hi
Accroding to N4835: When a coroutine is invoked, a copy is 
created
for each coroutine parameter. While in current 
implementation, we
only collect used parameters. This may lost behavior inside 
constructor

and destructor of unused parameters.

This patch change the implementation to collect all parameter.

thanks for the patch.

I am not convinced this is the right way to fix this, we do 
not want to increase
the size of the coroutine frame unnecessarily. I would like 
to check the lifetime
requirements; such copies might only need to be made local to 
the ramp function.


Iain
For type with trivial destructor, There is no need to copy 
parameter if it is
never referenced after a reachable suspend 
point(return-to-caller/resume) in the
coroutine. Since we are in front end, that should be 
inital_suspend point. so we
can create field for type with non-trivial destructor first, 
and skip unused parameters

in register_param_uses. I'll update the patch
I think that, even if the DTOR is non-trivial, the copy only 
needs to be in the stack

frame of the ramp, not in the coroutine frame.

I do think this.  It's just need more work on front end.
I think we already did the work, and know the answer (about param 
uses in the body), right?

Yes, we may need extra work on copy parameters, I'll do this.
Having discussed this with Nathan (and I’ve also mailed Gor for 
clarification).  I think it might be
a good idea to wait for those responses before revising (it could 
be that your original reading of

the wording is correct, and the clang impl. needs to be updated).

ok, thanks.
The reason why i consider about non-trivial destructors is that if 
the destructors are called in ramp
function, actor function may has different status on something. the 
testcase I attachted is such

example: it changes global variable in destructor.


Yes we all agree on this :-)

The issue is:
  1. should the copy exist for the duration of the ramp only? (i.e. 
copied to the ramp() stack frame)
  2. should the copy exist for the duration of the resume() function? 
(i.e. copied to the coro frame)


At the present, clang appears to do 1. when optimisation is on and 2. 
without optimisation.
Nathan pointed out that the lifetime of an object should not depend on 
the optimisation level.


Although I prefer 2, it seems that maybe clang is hard to control 
this(not sure:), maybe new attribute).


So - I think we need some clarification on the intent (from the 
designer) and maybe some revision

of the standard wording to make the lifetime clear.


Yeah, let's make this more clear. Thanks.

Regards
JunMa

thanks
Iain





[PATCH] riscv: Fix up riscv_rtx_costs for RTL checking (PR target/93333)

2020-01-21 Thread Jakub Jelinek
Hi!

As mentioned in the PR, during combine rtx_costs can be called sometimes
even on RTL that has not been validated yet and so can contain even operands
that aren't valid in any instruction.

The following patch ought to fix this case, ok for trunk?

2020-01-21  Jakub Jelinek  

PR target/9
* config/riscv/riscv.c (riscv_rtx_costs) : Verify
the last two operands are CONST_INT_P before using them as such.

* gcc.c-torture/compile/pr9.c: New test.

--- gcc/config/riscv/riscv.c.jj 2020-01-21 09:14:07.500268371 +0100
+++ gcc/config/riscv/riscv.c2020-01-21 09:27:37.629974828 +0100
@@ -1642,7 +1642,10 @@ riscv_rtx_costs (rtx x, machine_mode mod
 
 case ZERO_EXTRACT:
   /* This is an SImode shift.  */
-  if (outer_code == SET && (INTVAL (XEXP (x, 2)) > 0)
+  if (outer_code == SET
+ && CONST_INT_P (XEXP (x, 1))
+ && CONST_INT_P (XEXP (x, 2))
+ && (INTVAL (XEXP (x, 2)) > 0)
  && (INTVAL (XEXP (x, 1)) + INTVAL (XEXP (x, 2)) == 32))
{
  *total = COSTS_N_INSNS (SINGLE_SHIFT_COST);
--- gcc/testsuite/gcc.c-torture/compile/pr9.c.jj2020-01-21 
09:27:25.710155732 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr9.c   2020-01-21 
09:27:08.234420958 +0100
@@ -0,0 +1,10 @@
+/* PR target/9 */
+
+unsigned
+foo (int b, int c, int d, unsigned long e, int x, int y, int g, int h,
+ unsigned i)
+{
+  e >>= b;
+  i >>= e & 31;
+  return i & 1;
+}

Jakub



Re: [PATCH] PR target/93319: x32: Add x32 support to -mtls-dialect=gnu2

2020-01-21 Thread Uros Bizjak
On Mon, Jan 20, 2020 at 10:46 PM H.J. Lu  wrote:

> > > OK. Let's go with this version, but please investigate if we need to
> > > calculate TLS address in ptr_mode instead of Pmode. Due to quite some
> > > zero-extension from ptr_mode to Pmode hacks in this area, it looks to
> > > me that the whole calculation should be performed in ptr_mode (SImode
> > > in case of x32), and the result zero-extended to Pmode in case when
> > > Pmode = DImode.
> > >
> >
> > I checked it in.  I will investigate if we can use ptr_mode for TLS.
>
> Here is a patch to perform GNU2 TLS address computation in ptr_mode
> and zero-extend result to Pmode.

 case TLS_MODEL_GLOBAL_DYNAMIC:
-  dest = gen_reg_rtx (Pmode);
+  dest = gen_reg_rtx (TARGET_GNU2_TLS ? ptr_mode : Pmode);

Please put these in their respective arms of "if (TARGET_GNU2_TLS).

 case TLS_MODEL_LOCAL_DYNAMIC:
-  base = gen_reg_rtx (Pmode);
+  base = gen_reg_rtx (TARGET_GNU2_TLS ? ptr_mode : Pmode);

Also here.

A question: Do we need to emit the following part in Pmode?

  off = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, x), UNSPEC_DTPOFF);
  off = gen_rtx_CONST (Pmode, off);

  dest = force_reg (Pmode, gen_rtx_PLUS (Pmode, base, off));

If this can operate in ptr_mode then ...

-emit_insn (gen_tls_dynamic_gnu2_64 (Pmode, base, tmp));
+{
+  emit_insn (gen_tls_dynamic_gnu2_64 (ptr_mode, base, tmp));
+  if (GET_MODE (base) != Pmode)
+base = gen_rtx_ZERO_EXTEND (Pmode, base);
+}

... please keep base in ptr_mode, no need to zero_extend it.

-  tp = get_thread_pointer (Pmode, true);
-  set_unique_reg_note (get_last_insn (), REG_EQUAL,
-   gen_rtx_MINUS (Pmode, tmp, tp));

Just change the above to ptr_mode. There is no zero_extend needed here, because:

--cut here--
  off = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, x), UNSPEC_DTPOFF);
  off = gen_rtx_CONST (Pmode, off);

  dest = force_reg (Pmode, gen_rtx_PLUS (Pmode, base, off));

  if (TARGET_GNU2_TLS)
{
  if (GET_MODE (tp) != Pmode)
{
  dest = lowpart_subreg (ptr_mode, dest, Pmode);
  dest = gen_rtx_PLUS (ptr_mode, tp, dest);
  dest = gen_rtx_ZERO_EXTEND (Pmode, dest);
}
  else
dest = gen_rtx_PLUS (ptr_mode, tp, dest);
  dest = force_reg (Pmode, dest);
--cut here--

all the above part should operate in ptr_mode (for TARGET_GNU2_TLS at
least), followed by the same part of

--cut here--
  dest = gen_rtx_PLUS (ptr_mode, tp, dest);
  if (GET_MODE (dest) != Pmode)
 dest = gen_rtx_ZERO_EXTEND (Pmode, dest);
  dest = force_reg (Pmode, dest);

  if (GET_MODE (x) != Pmode)
x = gen_rtx_ZERO_EXTEND (Pmode, x);

  set_unique_reg_note (get_last_insn (), REG_EQUAL, x);
--cut here--

as is the case with TLS_MODEL_GLOBAL_DYNAMIC.

Uros.


[PATCHv2] Change recursive prepare_block_for_update to use a worklist

2020-01-21 Thread apinski
From: Andrew Pinski 

Reported as PR 93321, prepare_block_for_update with some huge
recusive inlining can go past the stack limit. Transforming this
recursive into worklist improves the stack usage here and we no
longer seg fault for the testcase.  Note the order we

OK?  Bootstrapped and tested on x86_64-linux-gnu with no regressions.

ChangeLog:
* tree-into-ssa.c (prepare_block_for_update_1): Split out from ...
(prepare_block_for_update): This.  Use a worklist instead of recursiving
into the function.  Remove bb argument.
(update_ssa): Update call to prepare_block_for_update.
---
 gcc/tree-into-ssa.c | 61 +++--
 1 file changed, 48 insertions(+), 13 deletions(-)

diff --git a/gcc/tree-into-ssa.c b/gcc/tree-into-ssa.c
index c27bf2ce121..9f1e8ece737 100644
--- a/gcc/tree-into-ssa.c
+++ b/gcc/tree-into-ssa.c
@@ -2593,11 +2593,9 @@ mark_use_interesting (tree var, gimple *stmt, 
basic_block bb,
 }
 }
 
-
-/* Do a dominator walk starting at BB processing statements that
-   reference symbols in SSA operands.  This is very similar to
-   mark_def_sites, but the scan handles statements whose operands may
-   already be SSA names.
+/* Processing statements in BB that reference symbols in SSA operands.
+   This is very similar to mark_def_sites, but the scan handles
+   statements whose operands may already be SSA names.
 
If INSERT_PHI_P is true, mark those uses as live in the
corresponding block.  This is later used by the PHI placement
@@ -2610,9 +2608,8 @@ mark_use_interesting (tree var, gimple *stmt, basic_block 
bb,
   that.  */
 
 static void
-prepare_block_for_update (basic_block bb, bool insert_phi_p)
+prepare_block_for_update_1 (basic_block bb, bool insert_phi_p)
 {
-  basic_block son;
   edge e;
   edge_iterator ei;
 
@@ -2694,13 +2691,51 @@ prepare_block_for_update (basic_block bb, bool 
insert_phi_p)
}
 }
 
-  /* Now visit all the blocks dominated by BB.  */
-  for (son = first_dom_son (CDI_DOMINATORS, bb);
-   son;
-   son = next_dom_son (CDI_DOMINATORS, son))
-prepare_block_for_update (son, insert_phi_p);
 }
 
+/* Do a dominator walk starting at entry block processing statements that
+   reference symbols in SSA operands.  This is very similar to
+   mark_def_sites, but the scan handles statements whose operands may
+   already be SSA names.
+
+   If INSERT_PHI_P is true, mark those uses as live in the
+   corresponding block.  This is later used by the PHI placement
+   algorithm to make PHI pruning decisions.
+
+   FIXME.  Most of this would be unnecessary if we could associate a
+  symbol to all the SSA names that reference it.  But that
+  sounds like it would be expensive to maintain.  Still, it
+  would be interesting to see if it makes better sense to do
+  that.  */
+static void
+prepare_block_for_update (bool insert_phi_p)
+{
+  size_t sp = 0;
+  basic_block *worklist;
+
+  /* Allocate the worklist.  */
+  worklist = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
+  /* Add the entry BB to the worklist.  */
+  worklist[sp++] = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+
+  while (sp)
+{
+  basic_block bb;
+  basic_block son;
+
+  /* Pick a block from the worklist.  */
+  bb = worklist[--sp];
+
+  prepare_block_for_update_1 (bb, insert_phi_p);
+
+  /* Now add all the blocks dominated by BB to the worklist.  */
+  for (son = first_dom_son (CDI_DOMINATORS, bb);
+  son;
+  son = next_dom_son (CDI_DOMINATORS, son))
+   worklist[sp++] = son;
+}
+  free (worklist);
+}
 
 /* Helper for prepare_names_to_update.  Mark all the use sites for
NAME as interesting.  BLOCKS and INSERT_PHI_P are as in
@@ -3392,7 +3427,7 @@ update_ssa (unsigned update_flags)
 symbols in SSA operands.  Mark interesting blocks and
 statements and set local live-in information for the PHI
 placement heuristics.  */
-  prepare_block_for_update (start_bb, insert_phi_p);
+  prepare_block_for_update (insert_phi_p);
 
   tree name;
 
-- 
2.17.1



Re: [PATCH Coroutines]Fix ICE when co_awaiting on void type

2020-01-21 Thread Iain Sandoe
Hi Nathan,

This corrects my mixup between the error returns for complete_type and
complete_type_or_else, spotted by Jakub,

I have included Bin’s change that makes the coro-specific error message
fire if a void type is provided for this.

Additional comments below.

Iain Sandoe  wrote:

> Bin.Cheng  wrote:
> 
>> On Mon, Jan 20, 2020 at 6:31 PM Iain Sandoe  wrote:

>>> Bin.Cheng  wrote:
>>> 

>>>   tree o_type = TREE_TYPE (o);
>>> -  if (!VOID_TYPE_P (o_type))
>>> -o_type = complete_type_or_else (TREE_TYPE (o), o);
>>> +  if (o_type && !VOID_TYPE_P (o_type) && !COMPLETE_TYPE_P (o_type))
>> IIUC, Jakub doesn't like checking void type specially here?
> 
> that is necessary if we want the coro-specific diagnostic (which is more 
> meaningful
> than the vanilla void one) however, if there’s opposition to that, I don’t 
> have such a
> strong opinion.

Note, we cannot eliminate the following code for the diagnostic check; we still 
have to
verify that complete types are suitable.  Bin’s addition makes that warning 
fire for void
types too.

OK for trunk?
thanks
Iain


[coro] Fix co_await of void type.

gcc/cp
2020-01-21  Iain Sandoe  
Bin Cheng  

* coroutines.cc (coro_promise_type_found_p): Check for NULL return
from complete_type_or_else.
(register_param_uses): Likewise.
(build_co_await): Do not try to use complete_type_or_else for void
types, otherwise for incomplete types, check for NULL return from
complete_type_or_else.

gcc/testsuite
2020-01-21  Bin Cheng  

* g++.dg/coroutines/co-await-void_type.C: New test.

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 8a8c1b9829b..aea341604b9 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -428,8 +428,9 @@ coro_promise_type_found_p (tree fndecl, location_t loc)
 
   /* Complete this, we're going to use it.  */
   coro_info->handle_type = complete_type_or_else (handle_type, fndecl);
+
   /* Diagnostic would be emitted by complete_type_or_else.  */
-  if (coro_info->handle_type == error_mark_node)
+  if (!coro_info->handle_type)
return false;
 
   /* Build a proxy for a handle to "self" as the param to
@@ -633,7 +634,13 @@ build_co_await (location_t loc, tree a, suspend_point_kind 
suspend_kind)
   else
 o = a; /* This is most likely about to fail anyway.  */
 
-  tree o_type = complete_type_or_else (TREE_TYPE (o), o);
+  tree o_type = TREE_TYPE (o);
+  if (o_type && !VOID_TYPE_P (o_type) && !COMPLETE_TYPE_P (o_type))
+o_type = complete_type_or_else (o_type, o);
+
+  if (!o_type)
+return error_mark_node;
+
   if (TREE_CODE (o_type) != RECORD_TYPE)
 {
   error_at (loc, "awaitable type %qT is not a structure",
@@ -2730,6 +2737,10 @@ register_param_uses (tree *stmt, int *do_subtree 
ATTRIBUTE_UNUSED, void *d)
   if (!COMPLETE_TYPE_P (actual_type))
actual_type = complete_type_or_else (actual_type, *stmt);
 
+  if (actual_type == NULL_TREE)
+   /* Diagnostic emitted by complete_type_or_else.  */
+   actual_type = error_mark_node;
+
   if (TREE_CODE (actual_type) == REFERENCE_TYPE)
actual_type = build_pointer_type (TREE_TYPE (actual_type));
 
diff --git a/gcc/testsuite/g++.dg/coroutines/co-await-void_type.C 
b/gcc/testsuite/g++.dg/coroutines/co-await-void_type.C
new file mode 100644
index 000..0bb8818133e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/co-await-void_type.C
@@ -0,0 +1,44 @@
+//  { dg-additional-options "-std=c++17 -fsyntax-only -w" }
+
+#include 
+
+class resumable {
+public:
+  struct promise_type;
+  using coro_handle = std::coroutine_handle;
+  resumable(coro_handle handle) : handle_(handle) {}
+  resumable(resumable&) = delete;
+  resumable(resumable&&) = delete;
+  bool resume() {
+if (not handle_.done())
+  handle_.resume();
+return not handle_.done();
+  }
+  int recent_val();
+  ~resumable() { handle_.destroy(); }
+private:
+  coro_handle handle_;
+};
+
+struct resumable::promise_type {
+  int value_;
+
+  using coro_handle = std::coroutine_handle;
+  auto get_return_object() {
+return coro_handle::from_promise(*this);
+  }
+  auto initial_suspend() { return std::suspend_always(); }
+  auto final_suspend() { return std::suspend_always(); }
+  void yield_value(int v) { value_ = v; }
+  void unhandled_exception() {}
+};
+
+int resumable::recent_val(){return handle_.promise().value_;}
+
+resumable foo(int n){
+  int x = 1;
+  co_await std::suspend_always();
+  int y = 2;
+  co_yield n + x + y;  // { dg-error "awaitable type 'void' is not a 
structure" }
+}
+





Re: [PATCH Coroutines]Fix ICE when co_awaiting on void type

2020-01-21 Thread Jakub Jelinek
On Tue, Jan 21, 2020 at 10:12:21AM +, Iain Sandoe wrote:
> @@ -633,7 +634,13 @@ build_co_await (location_t loc, tree a, 
> suspend_point_kind suspend_kind)
>else
>  o = a; /* This is most likely about to fail anyway.  */
>  
> -  tree o_type = complete_type_or_else (TREE_TYPE (o), o);
> +  tree o_type = TREE_TYPE (o);
> +  if (o_type && !VOID_TYPE_P (o_type) && !COMPLETE_TYPE_P (o_type))

What is the !COMPLETE_TYPE_P for?  Isn't the point of complete_type_or_else
checking that itself, if it is complete, just return the passed argument, if
it is not, try to complete it and return the passed in argument if
successful and NULL otherwise?

> +o_type = complete_type_or_else (o_type, o);
> +
> +  if (!o_type)
> +return error_mark_node;
> +

Jakub



Re: [PATCH Coroutines]Access promise via actor function's frame pointer argument

2020-01-21 Thread Iain Sandoe
Hi Nathan, Bin,

bin.cheng  wrote:

> On Mon, Jan 20, 2020 at 10:59 PM Iain Sandoe  wrote:
>> Hi Bin,
>> 
>> bin.cheng  wrote:
>> 
>>> By standard, coroutine body should be encapsulated in try-catch block
>>> as following:
>>>  try {
>>>// coroutine body
>>>  } catch(...) {
>>>promise.unhandled_exception();
>>>  }
>>> Given above try-catch block is implemented in the coroutine actor
>>> function called by coroutine ramp function, so the promise should
>>> be accessed via actor function's coroutine frame pointer argument,
>>> rather than the ramp function's coroutine frame variable.
>> 
>> thanks, good catch!
>> it has not triggered for me (including on some more complex test-cases that 
>> are not useable
>> in the testsuite).
>> 
>>> This patch also refactored code to make the fix easy, unfortunately,
>> 
>> see below,
>> 
>>> I failed to reduce a test case from cpproro.
>> 
>> it would be good if we could try to find a reproducer.  I’m happy to try and 
>> throw
>> creduce at a preprocessed file, if you have one.
>> 
>>> gcc/cp
>>> 2020-01-20  Bin Cheng  
>>> 
>>>* coroutines.cc (act_des_fn, wrap_coroutine_body): New.
>>>(morph_fn_to_coro): Call act_des_fn to build actor/destroy decls, as
>>>well as access promise via actor function's frame pointer argument.
>>>Refactor code into above functions.
>>>(build_actor_fn, build_destroy_fn): Use frame pointer argument
>> +  /* We still try to look for the promise method and warn if it's not
>> +present.  */
>> +  tree ueh_meth
>> +   = lookup_promise_method (orig, coro_unhandled_exception_identifier,
>> +loc, /*musthave=*/ false);
>> +  if (!ueh_meth || ueh_meth == error_mark_node)
>> +   warning_at (loc, 0, "no member named %qE in %qT",
>> +   coro_unhandled_exception_identifier,
>> +   get_coroutine_promise_type (orig));
>> +}
>> +  /* Else we don't check and don't care if the method is missing.  */
>> +
>> +  return fnbody;
>> +}
>> 
>> IMO this ^^^ obfuscates the fix, and I don’t think it should be done at the 
>> same time.
> Sure, given we are in this stage, I split the patch and leave refactoring to 
> future.

Thank you, this is easier to see.

>  IMHO, the function is too long so self-contained operations worth outlined 
> function even it's called once.

There are two changes “in the pipeline” that might make an alternate split 
better:
1) There is a change to wrap the await_resume() for initial suspend in a 
try-catch, this probably means
   that it would be better to partition the code that looks up 
unhandled_exception (), since that would then
  probably be used twice.

2) There is on-going discussion about how to guarantee tail-calling in 
symmetric transfers.  At present,
   ISTM that this will need a second wrapper around the whole resume() function.

I would prefer to leave any refactoring until these two things are clarified, 
does that make sense?



Nathan, is this OK for trunk as-is?
thanks
Iain

> Patch updated as attached.
> 
> Thanks,
> bin
> 
> gcc/cp
> 2020-01-20  Bin Cheng  
> * coroutines.cc (act_des_fn): New.
> (morph_fn_to_coro): Call act_des_fn to build actor/destroy decls.
> Access promise via actor function's frame pointer argument.
> (build_actor_fn, build_destroy_fn): Use frame pointer argument.


0001-Use-promise-in-coroutine-frame-in-actor-function.patch
Description: Binary data




Re: [v2] contrib: New remotes structure for vendor and personal refs

2020-01-21 Thread Richard Earnshaw (lists)

On 21/01/2020 01:47, Hans-Peter Nilsson wrote:

From: "Richard Earnshaw (lists)" 
Date: Fri, 17 Jan 2020 12:21:07 +0100



As far as possible, I've made the script automatically restructure any
existing fetch or push lines that earlier versions of the scripts may
have created - the gcc-git-customization.sh script will convert all
vendor refs that it can find, so it is not necessary to re-add any
vendors you've already added.


I fail, using these instructions, trying to create a
vendor-branch named axis/cris-decc0, using git-2.11.0 from
Debian 9.


You might, however, want to run
git remote prune 
after running to clean up any stale upstream-refs that might still be in
your local repo, and then
git fetch vendors/
or
git fetch 
to re-populate the remotes/ structures.


(I did not use gcc-git-customization.sh or git-fetch-vendor.sh before
XX, so there's presumably nothing to clean up.)

I've done
$ ./contrib/gcc-git-customization.sh
and
$ ./contrib/git-fetch-vendor.sh --enable-push axis


Also, for any branch you already have that tracks a personal or vendor
branch upstream, you might need to run
git config branch..remote 

so that merges and pushes go to the right place (I haven't attempted to
automate this last part).

For vendors, the new structure means that

git checkout -b / remotes/vendors//

will correctly set up a remote tracking branch.


On master, doing

$ git checkout -b axis/cris-decc0 remotes/vendors/axis/cris-decc0
fatal: Cannot update paths and switch to branch 'axis/cris-decc0' at the same 
time.
Did you intend to checkout 'remotes/vendors/axis/cris-decc0' which can not be 
resolved as commit?



It's nothing to do with setting up your vendors space, but a consequence 
that you can't track a remote branch that hasn't been created yet; 
you'll see the same thing even for personal branches or for other new 
vendor branches.  I'll need to document that.


Initially, you'll need to create the upstream branch, something like 
(but untested).


# Set up axis vendor area
contrib/git-fetch-vendor.sh --enable-push axis
# Using master, or some other branch you want to base this on.
git checkout -b axis/cris-decc0 master
# create upstream
git push vendors/axis axis/cris-decc0:refs/vendors/axis/heads/cris-decc0

(In theory, you should be able to add '-u' to the above, but 
unfortunately, that doesn't work.  So, after the above)


# re-fetch the new branch
git fetch vendors/asis
git branch --set-upstream remotes/vendors/axis/cris-decc0

Now things should push and pull properly.


My .git/config looks like this after the gcc-descr and
gcc-undescr lines:

[diff "md"]
xfuncname = ^\\(define.*$
[gcc-config]
upstream = origin
user = hp
userpfx = me
[remote "me"]
url = git+ssh://gcc.gnu.org/git/gcc.git
fetch = +refs/users/hp/heads/*:refs/remotes/me/*
fetch = +refs/users/hp/tags/*:refs/tags/me/*
push = refs/heads/me/*:refs/users/hp/heads/*
[remote "vendors/axis"]
url = git+ssh://gcc.gnu.org/git/gcc.git
fetch = +refs/vendors/axis/heads/*:refs/remotes/vendors/axis/*
fetch = +refs/vendors/axis/tags/*:refs/tags/vendors/axis/*
push = refs/heads/axis/*:refs/vendors/axis/heads/*

Bug in script (undiscovered because e.g. everybody else uses an
existing vendor or branch) or PEBKAC?

I'm past git 101, maybe even intermediate, for some definition
thereof, but this refs-configury is way beyond my
error-correction capabilities; I can't tell typos.

I'm about to create a devel/ branch instead, as that seems way
simpler than playing hide-and-seek like this, but that will make
everyone else fetch an additional blob that may be several
kilobytes (compressed).  Probably much larger than this email. :)

brgds, H-P



Hope that helps,

R.


Re: [PATCH] PR target/93319: x32: Add x32 support to -mtls-dialect=gnu2

2020-01-21 Thread Uros Bizjak
On Tue, Jan 21, 2020 at 9:47 AM Uros Bizjak  wrote:
>
> On Mon, Jan 20, 2020 at 10:46 PM H.J. Lu  wrote:
>
> > > > OK. Let's go with this version, but please investigate if we need to
> > > > calculate TLS address in ptr_mode instead of Pmode. Due to quite some
> > > > zero-extension from ptr_mode to Pmode hacks in this area, it looks to
> > > > me that the whole calculation should be performed in ptr_mode (SImode
> > > > in case of x32), and the result zero-extended to Pmode in case when
> > > > Pmode = DImode.
> > > >
> > >
> > > I checked it in.  I will investigate if we can use ptr_mode for TLS.
> >
> > Here is a patch to perform GNU2 TLS address computation in ptr_mode
> > and zero-extend result to Pmode.
>
>  case TLS_MODEL_GLOBAL_DYNAMIC:
> -  dest = gen_reg_rtx (Pmode);
> +  dest = gen_reg_rtx (TARGET_GNU2_TLS ? ptr_mode : Pmode);
>
> Please put these in their respective arms of "if (TARGET_GNU2_TLS).
>
>  case TLS_MODEL_LOCAL_DYNAMIC:
> -  base = gen_reg_rtx (Pmode);
> +  base = gen_reg_rtx (TARGET_GNU2_TLS ? ptr_mode : Pmode);
>
> Also here.
>
> A question: Do we need to emit the following part in Pmode?

To answer my own question: Yes. Linker doesn't like SImode relocs fox
x86_64 and for

addl$foo@dtpoff, %eax

errors out with:

pr93319-1a.s: Assembler messages:
pr93319-1a.s:20: Error: relocated field and relocation type differ in signedness

So, the part below is OK, except:

-  tp = get_thread_pointer (Pmode, true);
-  set_unique_reg_note (get_last_insn (), REG_EQUAL,
-   gen_rtx_MINUS (Pmode, tmp, tp));
+  tp = get_thread_pointer (ptr_mode, true);
+  tmp = gen_rtx_MINUS (ptr_mode, tmp, tp);
+  if (GET_MODE (tmp) != Pmode)
+tmp = gen_rtx_ZERO_EXTEND (Pmode, tmp);
+  set_unique_reg_note (get_last_insn (), REG_EQUAL, tmp);

I don't think we should attach this note to the thread pointer
initialization. I have removed this part from the patch, but please
review the decision.

and

-dest = gen_rtx_PLUS (Pmode, tp, dest);
+dest = gen_rtx_PLUS (ptr_mode, tp, dest);

Please leave Pmode here. ptr_mode == Pmode at this point, but Pmode
better documents the mode selection logic.

Also, the tests fail for me with:

/usr/include/gnu/stubs.h:13:11: fatal error: gnu/stubs-x32.h: No such
file or directory

so either use __builtin_printf or some other approach that doesn't
need to #include stdio.h.

A patch that implements above changes is attached to the message.

Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 0b8a4b9ee4f0..ffe60baa72ad 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -10717,7 +10717,7 @@ ix86_tls_module_base (void)
   if (!ix86_tls_module_base_symbol)
 {
   ix86_tls_module_base_symbol
-   = gen_rtx_SYMBOL_REF (Pmode, "_TLS_MODULE_BASE_");
+   = gen_rtx_SYMBOL_REF (ptr_mode, "_TLS_MODULE_BASE_");
 
   SYMBOL_REF_FLAGS (ix86_tls_module_base_symbol)
|= TLS_MODEL_GLOBAL_DYNAMIC << SYMBOL_FLAG_TLS_SHIFT;
@@ -10748,8 +10748,6 @@ legitimize_tls_address (rtx x, enum tls_model model, 
bool for_mov)
   switch (model)
 {
 case TLS_MODEL_GLOBAL_DYNAMIC:
-  dest = gen_reg_rtx (Pmode);
-
   if (!TARGET_64BIT)
{
  if (flag_pic && !TARGET_PECOFF)
@@ -10763,24 +10761,16 @@ legitimize_tls_address (rtx x, enum tls_model model, 
bool for_mov)
 
   if (TARGET_GNU2_TLS)
{
+ dest = gen_reg_rtx (ptr_mode);
  if (TARGET_64BIT)
-   emit_insn (gen_tls_dynamic_gnu2_64 (Pmode, dest, x));
+   emit_insn (gen_tls_dynamic_gnu2_64 (ptr_mode, dest, x));
  else
emit_insn (gen_tls_dynamic_gnu2_32 (dest, x, pic));
 
- tp = get_thread_pointer (Pmode, true);
-
- /* NB: Since DEST set by tls_dynamic_gnu2_64 is in ptr_mode,
-make sure that PLUS is done in ptr_mode.  */
- if (Pmode != ptr_mode)
-   {
- tp = lowpart_subreg (ptr_mode, tp, Pmode);
- dest = lowpart_subreg (ptr_mode, dest, Pmode);
- dest = gen_rtx_PLUS (ptr_mode, tp, dest);
- dest = gen_rtx_ZERO_EXTEND (Pmode, dest);
-   }
- else
-   dest = gen_rtx_PLUS (Pmode, tp, dest);
+ tp = get_thread_pointer (ptr_mode, true);
+ dest = gen_rtx_PLUS (ptr_mode, tp, dest);
+ if (GET_MODE (dest) != Pmode)
+dest = gen_rtx_ZERO_EXTEND (Pmode, dest);
  dest = force_reg (Pmode, dest);
 
  if (GET_MODE (x) != Pmode)
@@ -10792,6 +10782,7 @@ legitimize_tls_address (rtx x, enum tls_model model, 
bool for_mov)
{
  rtx caddr = ix86_tls_get_addr ();
 
+ dest = gen_reg_rtx (Pmode);
  if (TARGET_64BIT)
{
  rtx rax = gen_rtx_REG (Pmode, AX_REG);
@@ -10815,8 +10806,6 @@ legitimize_tls_address (rtx x, enum tls_model model, 
bool for_mov)
   break;
 
 case TLS_MODEL_LOCAL_DYNAMIC:
-  base = gen_reg_rtx (Pmode);
-
   

Re: [v2] contrib: New remotes structure for vendor and personal refs

2020-01-21 Thread Richard Earnshaw (lists)

On 21/01/2020 10:26, Richard Earnshaw (lists) wrote:

On 21/01/2020 01:47, Hans-Peter Nilsson wrote:

From: "Richard Earnshaw (lists)" 
Date: Fri, 17 Jan 2020 12:21:07 +0100



As far as possible, I've made the script automatically restructure any
existing fetch or push lines that earlier versions of the scripts may
have created - the gcc-git-customization.sh script will convert all
vendor refs that it can find, so it is not necessary to re-add any
vendors you've already added.


I fail, using these instructions, trying to create a
vendor-branch named axis/cris-decc0, using git-2.11.0 from
Debian 9.


You might, however, want to run
    git remote prune 
after running to clean up any stale upstream-refs that might still be in
your local repo, and then
    git fetch vendors/
or
    git fetch 
to re-populate the remotes/ structures.


(I did not use gcc-git-customization.sh or git-fetch-vendor.sh before
XX, so there's presumably nothing to clean up.)

I've done
$ ./contrib/gcc-git-customization.sh
and
$ ./contrib/git-fetch-vendor.sh --enable-push axis


Also, for any branch you already have that tracks a personal or vendor
branch upstream, you might need to run
    git config branch..remote 

so that merges and pushes go to the right place (I haven't attempted to
automate this last part).

For vendors, the new structure means that

    git checkout -b / remotes/vendors//

will correctly set up a remote tracking branch.


On master, doing

$ git checkout -b axis/cris-decc0 remotes/vendors/axis/cris-decc0
fatal: Cannot update paths and switch to branch 'axis/cris-decc0' at 
the same time.
Did you intend to checkout 'remotes/vendors/axis/cris-decc0' which can 
not be resolved as commit?




It's nothing to do with setting up your vendors space, but a consequence 
that you can't track a remote branch that hasn't been created yet; 
you'll see the same thing even for personal branches or for other new 
vendor branches.  I'll need to document that.


Initially, you'll need to create the upstream branch, something like 
(but untested).


# Set up axis vendor area
contrib/git-fetch-vendor.sh --enable-push axis
# Using master, or some other branch you want to base this on.
git checkout -b axis/cris-decc0 master
# create upstream
git push vendors/axis axis/cris-decc0:refs/vendors/axis/heads/cris-decc0

(In theory, you should be able to add '-u' to the above, but 
unfortunately, that doesn't work.  So, after the above)


# re-fetch the new branch
git fetch vendors/asis
git branch --set-upstream remotes/vendors/axis/cris-decc0


--set-upstream-to of course (I said it was untested:)

R.



Now things should push and pull properly.


My .git/config looks like this after the gcc-descr and
gcc-undescr lines:

[diff "md"]
xfuncname = ^\\(define.*$
[gcc-config]
upstream = origin
user = hp
userpfx = me
[remote "me"]
url = git+ssh://gcc.gnu.org/git/gcc.git
fetch = +refs/users/hp/heads/*:refs/remotes/me/*
fetch = +refs/users/hp/tags/*:refs/tags/me/*
push = refs/heads/me/*:refs/users/hp/heads/*
[remote "vendors/axis"]
url = git+ssh://gcc.gnu.org/git/gcc.git
fetch = +refs/vendors/axis/heads/*:refs/remotes/vendors/axis/*
fetch = +refs/vendors/axis/tags/*:refs/tags/vendors/axis/*
push = refs/heads/axis/*:refs/vendors/axis/heads/*

Bug in script (undiscovered because e.g. everybody else uses an
existing vendor or branch) or PEBKAC?

I'm past git 101, maybe even intermediate, for some definition
thereof, but this refs-configury is way beyond my
error-correction capabilities; I can't tell typos.

I'm about to create a devel/ branch instead, as that seems way
simpler than playing hide-and-seek like this, but that will make
everyone else fetch an additional blob that may be several
kilobytes (compressed).  Probably much larger than this email. :)

brgds, H-P



Hope that helps,

R.




Re: [PATCHv2] Change recursive prepare_block_for_update to use a worklist

2020-01-21 Thread Richard Biener
On Tue, Jan 21, 2020 at 10:27 AM  wrote:
>
> From: Andrew Pinski 
>
> Reported as PR 93321, prepare_block_for_update with some huge
> recusive inlining can go past the stack limit. Transforming this
> recursive into worklist improves the stack usage here and we no
> longer seg fault for the testcase.  Note the order we
>
> OK?  Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Can you keep the start_bb argument please?  Without it the callers setting
of start_bb and the comment doesn't make much sense.

OK with that change.
Thanks,
Richard.

> ChangeLog:
> * tree-into-ssa.c (prepare_block_for_update_1): Split out from ...
> (prepare_block_for_update): This.  Use a worklist instead of 
> recursiving
> into the function.  Remove bb argument.
> (update_ssa): Update call to prepare_block_for_update.
> ---
>  gcc/tree-into-ssa.c | 61 +++--
>  1 file changed, 48 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/tree-into-ssa.c b/gcc/tree-into-ssa.c
> index c27bf2ce121..9f1e8ece737 100644
> --- a/gcc/tree-into-ssa.c
> +++ b/gcc/tree-into-ssa.c
> @@ -2593,11 +2593,9 @@ mark_use_interesting (tree var, gimple *stmt, 
> basic_block bb,
>  }
>  }
>
> -
> -/* Do a dominator walk starting at BB processing statements that
> -   reference symbols in SSA operands.  This is very similar to
> -   mark_def_sites, but the scan handles statements whose operands may
> -   already be SSA names.
> +/* Processing statements in BB that reference symbols in SSA operands.
> +   This is very similar to mark_def_sites, but the scan handles
> +   statements whose operands may already be SSA names.
>
> If INSERT_PHI_P is true, mark those uses as live in the
> corresponding block.  This is later used by the PHI placement
> @@ -2610,9 +2608,8 @@ mark_use_interesting (tree var, gimple *stmt, 
> basic_block bb,
>that.  */
>
>  static void
> -prepare_block_for_update (basic_block bb, bool insert_phi_p)
> +prepare_block_for_update_1 (basic_block bb, bool insert_phi_p)
>  {
> -  basic_block son;
>edge e;
>edge_iterator ei;
>
> @@ -2694,13 +2691,51 @@ prepare_block_for_update (basic_block bb, bool 
> insert_phi_p)
> }
>  }
>
> -  /* Now visit all the blocks dominated by BB.  */
> -  for (son = first_dom_son (CDI_DOMINATORS, bb);
> -   son;
> -   son = next_dom_son (CDI_DOMINATORS, son))
> -prepare_block_for_update (son, insert_phi_p);
>  }
>
> +/* Do a dominator walk starting at entry block processing statements that
> +   reference symbols in SSA operands.  This is very similar to
> +   mark_def_sites, but the scan handles statements whose operands may
> +   already be SSA names.
> +
> +   If INSERT_PHI_P is true, mark those uses as live in the
> +   corresponding block.  This is later used by the PHI placement
> +   algorithm to make PHI pruning decisions.
> +
> +   FIXME.  Most of this would be unnecessary if we could associate a
> +  symbol to all the SSA names that reference it.  But that
> +  sounds like it would be expensive to maintain.  Still, it
> +  would be interesting to see if it makes better sense to do
> +  that.  */
> +static void
> +prepare_block_for_update (bool insert_phi_p)
> +{
> +  size_t sp = 0;
> +  basic_block *worklist;
> +
> +  /* Allocate the worklist.  */
> +  worklist = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
> +  /* Add the entry BB to the worklist.  */
> +  worklist[sp++] = ENTRY_BLOCK_PTR_FOR_FN (cfun);
> +
> +  while (sp)
> +{
> +  basic_block bb;
> +  basic_block son;
> +
> +  /* Pick a block from the worklist.  */
> +  bb = worklist[--sp];
> +
> +  prepare_block_for_update_1 (bb, insert_phi_p);
> +
> +  /* Now add all the blocks dominated by BB to the worklist.  */
> +  for (son = first_dom_son (CDI_DOMINATORS, bb);
> +  son;
> +  son = next_dom_son (CDI_DOMINATORS, son))
> +   worklist[sp++] = son;
> +}
> +  free (worklist);
> +}
>
>  /* Helper for prepare_names_to_update.  Mark all the use sites for
> NAME as interesting.  BLOCKS and INSERT_PHI_P are as in
> @@ -3392,7 +3427,7 @@ update_ssa (unsigned update_flags)
>  symbols in SSA operands.  Mark interesting blocks and
>  statements and set local live-in information for the PHI
>  placement heuristics.  */
> -  prepare_block_for_update (start_bb, insert_phi_p);
> +  prepare_block_for_update (insert_phi_p);
>
>tree name;
>
> --
> 2.17.1
>


[AArch64] effective_target for aarch64 f64mm asm

2020-01-21 Thread Matthew Malcomson
Commit 9ceec73 introduced intrinsics for the AArch64 FP64 matrix
multiply instructions.  These require binutils support for the same
instructions.
( See https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01234.html for the
testsuite failures this introduced. )

This patch adds a DejaGNU test to ensure this binutils support is there
and uses it in the files that need this test.

NOTE:
I tried to find some way to run the assembly tests if the given version
of binutils is available, but run the compile tests if not.  This is
pretty awkward -- It seems I either have to duplicate all the DejaGNU
comments between two files, or write filename exceptions into the
list of files that aarch64-sve-acle-asm.exp runs tests for.

I decided to not do either, since I figure not running the tests on
older binutils isn't too bad compared to having a bunch more DejaGNU
stuff making the tests harder to read.

Testing Done:
Checked on a cross-compiler that:
Tests running for binutils commit e264b5b7a are listed as UNSUPPORTED.
Tests running for binutils commit 26916852e all pass.

gcc/testsuite/ChangeLog:

2020-01-21  Matthew Malcomson  

* gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c: Use require
directive.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c: Likewise.
* lib/target-supports.exp: Add assembly requirement directive.



### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
index 
7badc75a43ab2009e9406afc04c980fc01834716..6eb94f1ca5fda961bb23f5c4cf66bc5694d26f36
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
@@ -1,5 +1,6 @@
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 /* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok }  */
 
 #include "test_sve_acle.h"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c
index 
dd8a1c53cd0fb7b7acd0b92394f3977382ac26e0..0a77c37ddd5978db765b4bad23f24da82a0aed11
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c
@@ -1,5 +1,6 @@
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 /* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok }  */
 
 #include "test_sve_acle.h"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c
index 
30563698310f65060d34be4bef4c57a74ef9d734..65c6d9b02b804a1480cd5014c8f0ca5f534bacbe
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c
@@ -1,5 +1,6 @@
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 /* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok }  */
 
 #include "test_sve_acle.h"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c
index 
d4702fa6cc15e9f93751d8579cfecfd37759306e..e3dc9bd51cf93c2f97e4277181e686d0bf53a1ea
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c
@@ -1,5 +1,6 @@
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 /* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok }  */
 
 #include "test_sve_acle.h"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c
index 
4604b0b5fbfb716ae814bf88f7acfe8bf0eaa9f5..f3af8e5cc25791a56930618abf5c092d51e94e9b
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c
@@ -1,5 +1,6 @@
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 /* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok }  */
 
 #include "test_sve_acl

[PATCH/commited] Change recursive prepare_block_for_update to use a worklist

2020-01-21 Thread apinski
From: Andrew Pinski 


This is what I committed.

Reported as PR 93321, prepare_block_for_update with some huge
recusive inlining can go past the stack limit. Transforming this
recursive into worklist improves the stack usage here and we no
longer seg fault for the testcase.  Note the order we walk the siblings
change.

ChangeLog:
PR tree-opt/93321
* tree-into-ssa.c (prepare_block_for_update_1): Split out from ...
(prepare_block_for_update): This.  Use a worklist instead of recursing.
---
 gcc/ChangeLog   |  8 ++
 gcc/tree-into-ssa.c | 59 -
 2 files changed, 55 insertions(+), 12 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8c17e5992d2..262f0d6506f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2020-01-21  Andrew Pinski  
+
+   PR tree-opt/93321
+   * tree-into-ssa.c (prepare_block_for_update_1): Split out
+   from ...
+   (prepare_block_for_update): This.  Use a worklist instead of
+   recursing.
+
 2020-01-21  Mihail-Calin Ionescu  
 
* gcc/config/arm/arm.c (clear_operation_p):
diff --git a/gcc/tree-into-ssa.c b/gcc/tree-into-ssa.c
index c27bf2ce121..6528acac31a 100644
--- a/gcc/tree-into-ssa.c
+++ b/gcc/tree-into-ssa.c
@@ -2593,11 +2593,9 @@ mark_use_interesting (tree var, gimple *stmt, 
basic_block bb,
 }
 }
 
-
-/* Do a dominator walk starting at BB processing statements that
-   reference symbols in SSA operands.  This is very similar to
-   mark_def_sites, but the scan handles statements whose operands may
-   already be SSA names.
+/* Processing statements in BB that reference symbols in SSA operands.
+   This is very similar to mark_def_sites, but the scan handles
+   statements whose operands may already be SSA names.
 
If INSERT_PHI_P is true, mark those uses as live in the
corresponding block.  This is later used by the PHI placement
@@ -2610,9 +2608,8 @@ mark_use_interesting (tree var, gimple *stmt, basic_block 
bb,
   that.  */
 
 static void
-prepare_block_for_update (basic_block bb, bool insert_phi_p)
+prepare_block_for_update_1 (basic_block bb, bool insert_phi_p)
 {
-  basic_block son;
   edge e;
   edge_iterator ei;
 
@@ -2694,13 +2691,51 @@ prepare_block_for_update (basic_block bb, bool 
insert_phi_p)
}
 }
 
-  /* Now visit all the blocks dominated by BB.  */
-  for (son = first_dom_son (CDI_DOMINATORS, bb);
-   son;
-   son = next_dom_son (CDI_DOMINATORS, son))
-prepare_block_for_update (son, insert_phi_p);
 }
 
+/* Do a dominator walk starting at BB processing statements that
+   reference symbols in SSA operands.  This is very similar to
+   mark_def_sites, but the scan handles statements whose operands may
+   already be SSA names.
+
+   If INSERT_PHI_P is true, mark those uses as live in the
+   corresponding block.  This is later used by the PHI placement
+   algorithm to make PHI pruning decisions.
+
+   FIXME.  Most of this would be unnecessary if we could associate a
+  symbol to all the SSA names that reference it.  But that
+  sounds like it would be expensive to maintain.  Still, it
+  would be interesting to see if it makes better sense to do
+  that.  */
+static void
+prepare_block_for_update (basic_block bb, bool insert_phi_p)
+{
+  size_t sp = 0;
+  basic_block *worklist;
+
+  /* Allocate the worklist.  */
+  worklist = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
+  /* Add the BB to the worklist.  */
+  worklist[sp++] = bb;
+
+  while (sp)
+{
+  basic_block bb;
+  basic_block son;
+
+  /* Pick a block from the worklist.  */
+  bb = worklist[--sp];
+
+  prepare_block_for_update_1 (bb, insert_phi_p);
+
+  /* Now add all the blocks dominated by BB to the worklist.  */
+  for (son = first_dom_son (CDI_DOMINATORS, bb);
+  son;
+  son = next_dom_son (CDI_DOMINATORS, son))
+   worklist[sp++] = son;
+}
+  free (worklist);
+}
 
 /* Helper for prepare_names_to_update.  Mark all the use sites for
NAME as interesting.  BLOCKS and INSERT_PHI_P are as in
-- 
2.17.1



[PATCH Coroutines] Handle type deduction of auto and decltype(auto) with indirect_ref expression

2020-01-21 Thread JunMa

Hi
When test gcc with cppcoro, I find case like:

  int& awaitable::await_resume()
  auto x1 = co_await awaitable;
  decltype(auto) x2 = co_await awaitable;

Based on standard, typeof(x1) should be int, typeof(x2) is int&.
However, we get both x1 and x2 as int, this because we donot
consider indirect_ref which wrap await_resume call expression
(convert_from_reference), and it is invoked into type deduction
of auto and decltype(auto).

This patch wrap co_await expression with indirect_ref which should be
same with await_resume expression, and it sink await_resume expression
to real call_expr when replace co_await expression. it fix type deduction
of auto and decltype(auto) in coroutine.

Bootstrap and test on X86_64, is it OK?

Regards
JunMa

gcc/cp
2020-01-21  Jun Ma 

    * coroutines.cc (build_co_await): Wrap co_await with
    indirect_ref when needed.
    (co_await_expander):  Sink to call_expr if await_resume
    is wrapped by indirect_ref.

gcc/testsuite
2020-01-21  Jun Ma 

    * g++.dg/coroutines/co-await-14-return-ref-to-auto.C: Add label.
---
 gcc/cp/coroutines.cc  | 16 +--
 .../torture/co-await-14-return-ref-to-auto.C  | 45 +++
 2 files changed, 58 insertions(+), 3 deletions(-)
 create mode 100644 
gcc/testsuite/g++.dg/coroutines/torture/co-await-14-return-ref-to-auto.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index d25c1887c1e..0a8943d8348 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -743,9 +743,16 @@ build_co_await (location_t loc, tree a, suspend_point_kind 
suspend_kind)
   TREE_VEC_ELT (awaiter_calls, 1) = awsp_call; /* await_suspend().  */
   TREE_VEC_ELT (awaiter_calls, 2) = awrs_call; /* await_resume().  */
 
-  return build5_loc (loc, CO_AWAIT_EXPR, TREE_TYPE (awrs_call), a,
-e_proxy, o, awaiter_calls,
-build_int_cst (integer_type_node, (int) suspend_kind));
+  tree await_expr = build5_loc (loc, CO_AWAIT_EXPR,
+   TREE_TYPE (TREE_TYPE (awrs_func)),
+   a, e_proxy, o, awaiter_calls,
+   build_int_cst (integer_type_node,
+  (int) suspend_kind));
+  /* Wrap co_await_expr.  */
+  if (TREE_CODE (awrs_call) == INDIRECT_REF)
+await_expr = build1_loc (loc, INDIRECT_REF, TREE_TYPE (awrs_call),
+await_expr);
+  return await_expr;
 }
 
 tree
@@ -1503,6 +1510,9 @@ co_await_expander (tree *stmt, int * /*do_subtree*/, void 
*d)
   /* This will produce the value (if one is provided) from the co_await
  expression.  */
   tree resume_call = TREE_VEC_ELT (awaiter_calls, 2); /* await_resume().  */
+  if (TREE_CODE (resume_call) == INDIRECT_REF)
+/* Sink to await_resume call_expr.  */
+resume_call = TREE_OPERAND (resume_call, 0);
   switch (stmt_code)
 {
 default: /* not likely to work .. but... */
diff --git 
a/gcc/testsuite/g++.dg/coroutines/torture/co-await-14-return-ref-to-auto.C 
b/gcc/testsuite/g++.dg/coroutines/torture/co-await-14-return-ref-to-auto.C
new file mode 100644
index 000..0a7c035ef2a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/torture/co-await-14-return-ref-to-auto.C
@@ -0,0 +1,45 @@
+//  { dg-do run }
+
+/* The simplest valued co_await we can do.  */
+
+#include "../coro.h"
+
+// boiler-plate for tests of codegen
+#include "../coro1-ret-int-yield-int.h"
+
+
+coro1
+f ()
+{
+  int t1 = 5;
+  int t2 = 5;
+  auto gX = co_await coro1::suspend_always_intrefprt{t1};
+  if (gX != t1)
+ abort();
+  decltype(auto) gX1 = co_await coro1::suspend_always_intrefprt{t2};
+  if (&gX1 != &t2)
+ abort();
+  co_return t1 + 10;
+}
+
+int main ()
+{
+  PRINT ("main: create coro1");
+  struct coro1 f_coro = f ();
+  if (f_coro.handle.done())
+{
+  PRINT ("main: we should not be 'done' [1]");
+  abort ();
+}
+  PRINT ("main: resuming [1] initial suspend");
+  while (!f_coro.handle.done())
+f_coro.handle.resume();
+  /* we should now have returned with the co_return (15) */
+  if (!f_coro.handle.done())
+{
+  PRINT ("main: we should be 'done' ");
+  abort ();
+}
+  puts ("main: done");
+  return 0;
+}
-- 
2.19.1.3.ge56e4f7



Re: [PATCH][AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI

2020-01-21 Thread Mark Rutland
Hi Szabolcs,

Answers from a linux dev perspective below.

On Mon, Jan 20, 2020 at 10:53:33AM +, Szabolcs Nagy wrote:
> On 19/01/2020 08:53, Fāng-ruì Sòng via gcc-patches wrote:
> > It'd be great to have some tests, e.g.
> > 
> > 1. -fpatchable-function-entry=0 -mbranch-protection=bti
> > 2. -fpatchable-function-entry=2 -mbranch-protection=bti
> > 
> > I have updated clang to emit   `.Lfunc_begin0: bti c; nop; nop` for case 2.
> > The __patchable_function_entries entry points to .Lfunc_begin0 (bti c).
> > 
> > (The change is not included in the llvm 10.0 branch.)
> 
> i have to ask some linux developers which way they prefer:
> 
> e.g. -fpatchable-function-entry=3,1 is
> 
>  .section __patchable_function_entries
>  .8byte .Lpatch
>  .text
> .Lpatch:
>   nop
> func:
>   nop
>   nop
>   ...
> 
> with bti the code will be emitted as:
> 
> .Lpatch:
>   nop
> func:
>   bti c
>   nop
>   nop
>   ...

That looks good to me.

> but e.g. -fpatchable-function-entry=2,0 has two reasonable
> approaches with bti:
> 
> (a)
> 
> func:
> .Lpatch:
>   bti c
>   nop
>   nop
>   ...
> 
> (b)
> 
> func:
>   bti c
> .Lpatch:
>   nop
>   nop
>   ...

I had assumed (b); that means that .Lpatch consistently points to the
first NOP. To my mental model, that seems more consistent than (a).

However, I can work with either so long as it's consistent.

> i think (a) is more consistent across fancy N,M settings
> (bti is always included into the patch area, user needs
> to know to skip it), but (b) is more compatible with
> existing usage (M=0 is i believe the common setting and
> with that or with M=N the patching code does not need to
> know about bti, existing patching code works unmodified).
> 
> current llvm fix does (a), proposed gcc fix does (b),
> i guess we have to pick one.
>
> (solution (a) is a bit messier in gcc, because currently
> there is no target hook between the emission of .Lpatch
> and the nops, i avoided refactoring that code to get a
> backend only fix that is easy to backport, but (a) is
> possible to do with a bit more changes.)

As above, my weak preference is (b), but I can work with either. I just
need the behaviour to be consistent.

Was there a rationale for LLVM choosing (a) rather than (b), e.g. was
that also ease of implementation? If there isn't a rationale otherwise,
and if LLVM could also do (b), that would be nice from my PoV.

How big is "a bit more changes" for GCC?

Thanks,
Mark.


Re: [Patch,Fortran] PR93309 – permit repeated 'implicit none(external)' in sub-scoping unit

2020-01-21 Thread Bernhard Reutner-Fischer
On 20 January 2020 15:55:23 CET, Tobias Burnus  wrote:
>Using "implicit none" multiple times in a scoping unit is not permitted
>
>– and checked for.
>
>However, using one in the parent name space and re-confirming it in the
>
>current name space is permitted – but was before rejected.


+  if (sym->attr.proc == PROC_UNKNOWN)
+   for (gfc_namespace *ns = sym->ns; ns; ns = ns->parent)
+  if (ns->has_implicit_none_export)
+has_implicit_none_export = true;

s/;/, break;/

If we don't do this for >= -O1 already.

+  if (has_implicit_none_export)
>
>OK for the trunk?

LGTM otherwise,
thanks


[PATCH] tree-optimization/92328 fix value-number with bogus type

2020-01-21 Thread Richard Biener


We were actually value-numbering two entities with different type
the same rather than just having the same representation in the
hashtable.  The following fixes that by wrapping the value in a
to be inserted VIEW_CONVERT_EXPR.

Bootstrapped & tested on x86_64-unknown-linux-gnu, pushed.

Richard.


2020-01-21  Richard Biener  

PR tree-optimization/92328
* tree-ssa-sccvn.c (vn_reference_lookup_3): Preserve
type when value-numbering same-sized store by inserting a
VIEW_CONVERT_EXPR.
(eliminate_dom_walker::eliminate_stmt): When eliminating
a redundant store handle bit-reinterpretation of the same value.

* gcc.dg/torture/pr92328.c: New testcase.

diff --git a/gcc/testsuite/gcc.dg/torture/pr92328.c 
b/gcc/testsuite/gcc.dg/torture/pr92328.c
new file mode 100644
index 000..7898b9efe80
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr92328.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ftree-pre -Wno-div-by-zero" } */
+
+int nt;
+
+void
+ja (int os)
+{
+  int *ku = &os, *id = &os;
+  unsigned int qr = 0;
+
+  for (;;)
+{
+  if (os == *ku)
+{
+  *id = 0;
+  qr += os != *ku;
+  id = &qr;
+}
+
+  *id &= qr;
+
+  if (os != 0)
+{
+  nt /= 0;
+  ku = &qr;
+}
+}
+}
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 4d1301593d7..0b8ee586139 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -2712,26 +2712,30 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
*data_,
  || known_eq (ref->size, TYPE_PRECISION (vr->type)))
  && multiple_p (ref->size, BITS_PER_UNIT))
{
- if (known_eq (ref->size, size2))
-   return vn_reference_lookup_or_insert_for_pieces
-   (vuse, get_alias_set (lhs), vr->type, vr->operands,
-SSA_VAL (def_rhs));
- else if (! INTEGRAL_TYPE_P (TREE_TYPE (def_rhs))
-  || type_has_mode_precision_p (TREE_TYPE (def_rhs)))
+ tree val = NULL_TREE;
+ if (! INTEGRAL_TYPE_P (TREE_TYPE (def_rhs))
+ || type_has_mode_precision_p (TREE_TYPE (def_rhs)))
{
  gimple_match_op op (gimple_match_cond::UNCOND,
  BIT_FIELD_REF, vr->type,
  SSA_VAL (def_rhs),
  bitsize_int (ref->size),
  bitsize_int (offset - offset2));
- tree val = vn_nary_build_or_lookup (&op);
- if (val
- && (TREE_CODE (val) != SSA_NAME
- || ! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (val)))
-   return vn_reference_lookup_or_insert_for_pieces
-   (vuse, get_alias_set (lhs), vr->type,
-vr->operands, val);
+ val = vn_nary_build_or_lookup (&op);
}
+ else if (known_eq (ref->size, size2))
+   {
+ gimple_match_op op (gimple_match_cond::UNCOND,
+ VIEW_CONVERT_EXPR, vr->type,
+ SSA_VAL (def_rhs));
+ val = vn_nary_build_or_lookup (&op);
+   }
+ if (val
+ && (TREE_CODE (val) != SSA_NAME
+ || ! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (val)))
+   return vn_reference_lookup_or_insert_for_pieces
+   (vuse, get_alias_set (lhs), vr->type,
+vr->operands, val);
}
  else if (maxsize.is_constant (&maxsizei)
   && maxsizei % BITS_PER_UNIT == 0
@@ -5599,7 +5603,6 @@ eliminate_dom_walker::eliminate_stmt (basic_block b, 
gimple_stmt_iterator *gsi)
   && (TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
  || is_gimple_min_invariant (gimple_assign_rhs1 (stmt
 {
-  tree val;
   tree rhs = gimple_assign_rhs1 (stmt);
   vn_reference_t vnresult;
   /* ???  gcc.dg/torture/pr91445.c shows that we lookup a boolean
@@ -5640,14 +5643,22 @@ eliminate_dom_walker::eliminate_stmt (basic_block b, 
gimple_stmt_iterator *gsi)
  else
lookup_lhs = NULL_TREE;
}
-  val = NULL_TREE;
+  tree val = NULL_TREE;
   if (lookup_lhs)
val = vn_reference_lookup (lookup_lhs, gimple_vuse (stmt), VN_WALK,
   &vnresult, false);
   if (TREE_CODE (rhs) == SSA_NAME)
rhs = VN_INFO (rhs)->valnum;
   if (val
- && operand_equal_p (val, rhs, 0))
+ && (operand_equal_p (val, rhs, 0)
+ /* Due to the bitfield lookups above we can get bit
+interpretations of the same RHS as values here.  Those
+are redundant as well.  */
+ || (TREE_CODE (val) == SSA

Re: [AArch64] effective_target for aarch64 f64mm asm

2020-01-21 Thread Richard Sandiford
Matthew Malcomson  writes:
> Commit 9ceec73 introduced intrinsics for the AArch64 FP64 matrix
> multiply instructions.  These require binutils support for the same
> instructions.
> ( See https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01234.html for the
> testsuite failures this introduced. )
>
> This patch adds a DejaGNU test to ensure this binutils support is there
> and uses it in the files that need this test.
>
> NOTE:
> I tried to find some way to run the assembly tests if the given version
> of binutils is available, but run the compile tests if not.  This is
> pretty awkward -- It seems I either have to duplicate all the DejaGNU
> comments between two files, or write filename exceptions into the
> list of files that aarch64-sve-acle-asm.exp runs tests for.
>
> I decided to not do either, since I figure not running the tests on
> older binutils isn't too bad compared to having a bunch more DejaGNU
> stuff making the tests harder to read.
>
> Testing Done:
> Checked on a cross-compiler that:
> Tests running for binutils commit e264b5b7a are listed as UNSUPPORTED.
> Tests running for binutils commit 26916852e all pass.
>
> gcc/testsuite/ChangeLog:
>
> 2020-01-21  Matthew Malcomson  
>
>   * gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c: Use require
>   directive.
>   * gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c: Likewise.
>   * gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c: Likewise.
>   * gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c: Likewise.
>   * gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c: Likewise.
>   * gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c: Likewise.
>   * gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c: Likewise.
>   * gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c: Likewise.
>   * gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c: Likewise.
>   * gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c: Likewise.
>   * gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c: Likewise.
>   * lib/target-supports.exp: Add assembly requirement directive.
> [...]
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index 
> cdee31e24132b591625168ab588e61a3943919b0..c9184728f26918dbaea4bb09ab99df890571069b
>  100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -8987,7 +8987,7 @@ proc check_effective_target_aarch64_tiny { } {
>  # Create functions to check that the AArch64 assembler supports the
>  # various architecture extensions via the .arch_extension pseudo-op.
>  
> -foreach { aarch64_ext } { "fp" "simd" "crypto" "crc" "lse" "dotprod" "sve"} {
> +foreach { aarch64_ext } { "fp" "simd" "crypto" "crc" "lse" "dotprod" "sve" 
> "f64mm"} {

It would be good to keep this under the 80 char limit, e.g.:

foreach aarch64_ext {"fp" "simd" "crypto" "crc" "lse" "dotprod" "sve"
 "f64mm"} {

(Not that the file is very consistent about doing that.)

OK with that change, thanks.

Richard


Re: [PR 80005] __has_include parsing

2020-01-21 Thread Nathan Sidwell

On 1/20/20 9:01 PM, Jim Wilson wrote:

On Mon, Jan 20, 2020 at 5:44 AM Nathan Sidwell  wrote:

I've pushed this to master, to address 80005

__has_include is funky in that it is macro-like from the POV of #ifdef
...


With this patch, __has_include__ no longer works.  There is a use of
this in the RISC-V glibc port.  I see the docs only mention
__has_include, so maybe this has always been a silent bug in the
RISC-V glibc port?  If glibc is broken I can submit a patch.  This
might cause trouble for other people too though.


tl;dr; glibc is broken.

AFAICT __has_include__ was an undocumented piece of internal 
implementation goop, that didn't really work anyway.  I'm sure it's 
presence has confused users.


I guess I have to add compatibility ...


and with this patch I get
gamma05:2206$ build-install/bin/riscv64-unknown-linux-gnu-gcc -O -S tmp4.c
tmp4.c:1:21: error: missing binary operator before token "("
 1 | #if __has_include__ ()


yup, 'cos as I'm sure you're aware __has_include__ will now be a regular 
identifier, and treated as a literal '0' in this context.


nathan

--
Nathan Sidwell


Re: [PATCH Coroutines 1/2] Add error messages for missing methods of awaitable class

2020-01-21 Thread Nathan Sidwell

On 1/20/20 10:38 PM, JunMa wrote:

在 2020/1/21 上午9:31, JunMa 写道:

在 2020/1/20 下午11:49, Nathan Sidwell 写道:

On 1/20/20 12:18 AM, JunMa wrote:

Hi
This patch adds lookup_awaitable_member, it outputs error messages 
when any of
the await_ready/suspend/resume functions are missing in awaitable 
class.


This patch also add some error check on return value of 
build_co_await since we

may failed to build co_await_expr.

Bootstrap and test on X86_64, is it OK?

Regards
JunMa

gcc/cp
2020-01-20  Jun Ma 

 * coroutines.cc (lookup_awaitable_member): Lookup an 
awaitable member.
 (build_co_await): Use lookup_awaitable_member instead of 
lookup_member.
 (finish_co_yield_expr, finish_co_await_expr): Add error 
check on return

 value of build_co_await.

gcc/testsuite
2020-01-20  Jun Ma 

 * g++.dg/coroutines/coro1-missing-await-method.C: New test.



1) you're mixing two distinct changes, the one about the error 
message and the one about changing error_mark_node.



   tree op = build_co_await (kw, a, CO_AWAIT_SUSPEND_POINT);
-  TREE_SIDE_EFFECTS (op) = 1;
-  SET_EXPR_LOCATION (op, kw);
-
+  if (op != error_mark_node)
+    {
+  TREE_SIDE_EFFECTS (op) = 1;
+  SET_EXPR_LOCATION (op, kw);
+    }
   return op;
 }


these two fragments are ok, as a separate patch.



ok, I'll split it.
+/* Lookup an Awaitable member, which should be await_ready, 
await_suspend

+   or await_resume  */
+
+static tree
+lookup_awaitable_member (tree await_type, tree member_id, 
location_t loc)

+{
+  tree aw_memb
+    = lookup_member (await_type, member_id,
+ /*protect*/ 1, /*want_type*/ 0, tf_warning_or_error);
fwiw '/*foo=*/expr' is the canonical form -- i know Iain wasn't 
consistent, and it looks like I may have missed a few places in review.



ok.

+  if (aw_memb == NULL_TREE || aw_memb == error_mark_node)
+    {
+  error_at (loc, "no member named %qE in %qT", member_id, 
await_type);

+  return error_mark_node;
+    }
+  return aw_memb;
+}


This looks wrong.  lookup_member is being passed tf_warning_or_error, 
so it should already be emitting a diagnostic, for the cases where 
something is found, but is ambiguous or whatever.  So, I think you 
only want a diagnostic here when you get NULL_TREE back.


you are right, I follow with same code of lookup_promise_member, I'll 
update both.


Regards
JunMa

nathan


Hi nathan,
attachment is the updated patch which add error messages for lookup 
awaitable member.


thanks this is ok.  Could you remove the space in '/*protect=*/ 1' and 
similar?  This appear to be one place where our coding style has fewer 
spaces than expected!


nathan

--
Nathan Sidwell


Re: [PATCH Coroutines 2/2] Add error check on return value of build_co_await

2020-01-21 Thread Nathan Sidwell

On 1/20/20 10:44 PM, JunMa wrote:

在 2020/1/21 上午9:31, JunMa 写道:



Regards
JunMa

nathan


Hi nathan,
attachment is the updated patch which check error_mark_node of 
build_co_coawait.


Regards
JunMa

gcc/cp
2020-01-21  Jun Ma 

  * coroutines.cc (finish_co_await_expr): Add error check on return
  value of build_co_await.
  (finish_co_yield_expr,): Ditto.


ok, thanks

nathan

--
Nathan Sidwell


Re: [PATCH Coroutines]Fix ICE when co_awaiting on void type

2020-01-21 Thread Nathan Sidwell

On 1/21/20 5:12 AM, Iain Sandoe wrote:

Hi Nathan,

This corrects my mixup between the error returns for complete_type and
complete_type_or_else, spotted by Jakub,

I have included Bin’s change that makes the coro-specific error message
fire if a void type is provided for this.



ok thanks.

nathan

--
Nathan Sidwell


[PATCH] RX update functions with string constraint

2020-01-21 Thread Darius Galis

Hello,

The following patch updates the setmemsi and rx_setmem functions.
The patch adds the rx_allow_string_insns constraint to these functions
in order to be used only when the string support is enabled.

Regression test is OK, without additional fails, and was tested with the
following command:

make -k check-gcc RUNTESTFLAGS=--target_board=rx-sim

Please let me know if this is OK, Thank you!
Darius

Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 280157)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2020-01-16  Darius Galis  
+
+   * config/rx/rx.md (setmemsi): Added rx_allow_string_insns constraint
+   (rx_setmem): Likewise.
+
 2020-01-10  Thomas Schwinge  
 
 	* tree.h (OMP_CLAUSE_USE_DEVICE_PTR_IF_PRESENT): New definition.

Index: gcc/config/rx/rx.md
===
--- gcc/config/rx/rx.md (revision 280157)
+++ gcc/config/rx/rx.md (working copy)
@@ -2511,7 +2511,7 @@
(use (match_operand:SI  1 "nonmemory_operand")) ;; Length
(match_operand  3 "immediate_operand")  ;; Align
(unspec_volatile:BLK [(reg:SI 1) (reg:SI 2) (reg:SI 3)] UNSPEC_SETMEM)]
-  ""
+  "rx_allow_string_insns"
   {
 rtx addr = gen_rtx_REG (SImode, 1);
 rtx val  = gen_rtx_REG (QImode, 2);
@@ -2530,7 +2530,7 @@
(unspec_volatile:BLK [(reg:SI 1) (reg:SI 2) (reg:SI 3)] UNSPEC_SETMEM))
(clobber (reg:SI 1))
(clobber (reg:SI 3))]
-  ""
+  "rx_allow_string_insns"
   "sstr.b"
   [(set_attr "length" "2")
(set_attr "timings" "")] ;; The timing is a guesstimate.

/Software Engineer - CyberTHOR Studios Ltd./



Re: [Patch,Fortran] PR93309 – permit repeated 'implicit none(external)' in sub-scoping unit

2020-01-21 Thread Tobias Burnus

Hi Bernhard,

On 1/21/20 12:43 PM, Bernhard Reutner-Fischer wrote:

+  if (sym->attr.proc == PROC_UNKNOWN)
+   for (gfc_namespace *ns = sym->ns; ns; ns = ns->parent)
+  if (ns->has_implicit_none_export)
+has_implicit_none_export = true;

s/;/, break;/

If we don't do this for >= -O1 already.


We don't. All three variants generate for -O0 to -O3 a different 
optimized tree. (I didn't look at RTL or assembler.) But it is also not 
completely clear to me which approach is the best. I think it also 
depends how it is converted into assembler and what the hardware 
actually does.


I was thinking about all three variants and had settled for the shortest 
when sending the patch. The third variant would be:


for (gfc_namespace *ns = sym->ns;
 ns && !has_implicit_none_export;
 ns = ns->parent)
   if (ns->has_implicit_none_export)
 has_implicit_none_export = true;

which may or may not be faster in execution – the conditional assignment 
might be faster, but one has to do the "&&" operation in each step (can 
be AND_EXPR; it does not need to be THEN_IF_EXPR). — However, I find it 
less readable.


At the end, it does not really matter as the loop body and expressions 
are small and the typical nest-depth is probably two (module + module 
procedures) or three (module procedures + BLOCK or host-associated 
procedure) such that an early abort is not really worthwhile. Even with 
more nesting levels, the chance that one has many sym->attr.proc == 
PROC_UNKNOWN in a deeply nested scoping unit should be low. (And if it 
is not, it will be still not the hot loop for that code.)


However, I have now used the version with "break", which is also fine.
r10-6108-gb31f80231df9ce6d9b50c62d28b8d2a4654ef564

Cheers,

Tobias



Re: [PATCH] Make target_clones resolver fn static.

2020-01-21 Thread Martin Liška

On 1/20/20 3:52 PM, Richard Biener wrote:

On Mon, Jan 20, 2020 at 3:46 PM H.J. Lu  wrote:


On Mon, Jan 20, 2020 at 6:41 AM Alexander Monakov  wrote:




On Mon, 20 Jan 2020, H.J. Lu wrote:


Bare IFUNC's don't seem to have this restriction. Why do we want to
constrain target clones this way?



foo's resolver acts as foo.  It should have the same visibility as foo.


What do you mean by that? From the implementation standpoint, there's
two symbols of different type with the same value. There's no problem
allowing one of them have local binding and the other have global binding.

Is there something special about target clones that doesn't come into
play with ifuncs?



I stand corrected.   Resolver should be static and it shouldn't be weak.


Reading the patch again and looking up more context it seems that the resolver
is already static we just mangle it extra when the original function
is public (?)
If so the patch looks quite obvious to me if we use some character not valid
in indetifiers in C but valid in assembly for the resolver decl.


Hello.

I'm sending updated version of the patch where I'm adding a run-time test
for 2 static symbols (with the same name) and it works fine. Moreover
I'm newly setting TREE_PUBLIC (ifunc_alias_decl) = 0 which seems to
work properly.

I tested both x86_64-linux-gnu and ppc64le-linux-gnu.

Martin



Richard.



--
H.J.


>From a3faaced989869867671ceadd89b56fabde225ff Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 16 Jan 2020 10:38:41 +0100
Subject: [PATCH] Make target_clones resolver fn static.

gcc/ChangeLog:

2020-01-17  Martin Liska  

	PR target/93274
	* config/i386/i386-features.c (make_resolver_func):
	Align the code with ppc64 target implementaion.
	We do not need to have gnu_indirect_function
	as a global function.  Drop TREE_PUBLIC on
	ifunc symbol.
	* config/rs6000/rs6000.c (make_resolver_func): Drop
	TREE_PUBLIC on ifunc symbol.

gcc/testsuite/ChangeLog:

2020-01-17  Martin Liska  

	PR target/93274
	* gcc.target/i386/pr81213.c: Adjust to not expect
	a global unique name.
	* gcc.target/i386/pr81213-2.c: New test.
---
 gcc/config/i386/i386-features.c   | 23 ---
 gcc/config/rs6000/rs6000.c| 12 
 gcc/testsuite/gcc.target/i386/pr81213-2.c | 11 +++
 gcc/testsuite/gcc.target/i386/pr81213.c   | 11 +++
 4 files changed, 38 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81213-2.c

diff --git a/gcc/config/i386/i386-features.c b/gcc/config/i386/i386-features.c
index e580b26b995..07eca286544 100644
--- a/gcc/config/i386/i386-features.c
+++ b/gcc/config/i386/i386-features.c
@@ -2738,26 +2738,17 @@ make_resolver_func (const tree default_decl,
 		const tree ifunc_alias_decl,
 		basic_block *empty_bb)
 {
-  char *resolver_name;
-  tree decl, type, decl_name, t;
+  tree decl, type, t;
 
-  /* IFUNC's have to be globally visible.  So, if the default_decl is
- not, then the name of the IFUNC should be made unique.  */
-  if (TREE_PUBLIC (default_decl) == 0)
-{
-  char *ifunc_name = make_unique_name (default_decl, "ifunc", true);
-  symtab->change_decl_assembler_name (ifunc_alias_decl,
-	  get_identifier (ifunc_name));
-  XDELETEVEC (ifunc_name);
-}
-
-  resolver_name = make_unique_name (default_decl, "resolver", false);
+  /* Make the resolver function static.  The resolver function returns
+ void *.  */
+  tree decl_name = clone_function_name (default_decl, "resolver");
+  const char *resolver_name = IDENTIFIER_POINTER (decl_name);
 
   /* The resolver function should return a (void *). */
   type = build_function_type_list (ptr_type_node, NULL_TREE);
 
   decl = build_fn_decl (resolver_name, type);
-  decl_name = get_identifier (resolver_name);
   SET_DECL_ASSEMBLER_NAME (decl, decl_name);
 
   DECL_NAME (decl) = decl_name;
@@ -2784,6 +2775,9 @@ make_resolver_func (const tree default_decl,
   DECL_COMDAT (decl) = 1;
   make_decl_one_only (decl, DECL_ASSEMBLER_NAME (decl));
 }
+  else
+TREE_PUBLIC (ifunc_alias_decl) = 0;
+
   /* Build result decl and add to function_decl. */
   t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, ptr_type_node);
   DECL_CONTEXT (t) = decl;
@@ -2809,7 +2803,6 @@ make_resolver_func (const tree default_decl,
 
   /* Create the alias for dispatch to resolver here.  */
   cgraph_node::create_same_body_alias (ifunc_alias_decl, decl);
-  XDELETEVEC (resolver_name);
   return decl;
 }
 
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 127927d9583..97b8ebe4d80 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -23858,6 +23858,18 @@ make_resolver_func (const tree default_decl,
   DECL_INITIAL (decl) = make_node (BLOCK);
   DECL_STATIC_CONSTRUCTOR (decl) = 0;
 
+  if (DECL_COMDAT_GROUP (default_decl)
+  || TREE_PUBLIC (default_decl))
+{
+  /* In this case, each translation unit with a call to this
+	 versioned fun

Re: [PATCH][AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI

2020-01-21 Thread Szabolcs Nagy
On 21/01/2020 11:34, Mark Rutland wrote:
> Hi Szabolcs,
> 
> Answers from a linux dev perspective below.

thanks.

> On Mon, Jan 20, 2020 at 10:53:33AM +, Szabolcs Nagy wrote:
>> i have to ask some linux developers which way they prefer:
>>
>> e.g. -fpatchable-function-entry=3,1 is
>>
>>  .section __patchable_function_entries
>>  .8byte .Lpatch
>>  .text
>> .Lpatch:
>>   nop
>> func:
>>   nop
>>   nop
>>   ...
>>
>> with bti the code will be emitted as:
>>
>> .Lpatch:
>>   nop
>> func:
>>   bti c
>>   nop
>>   nop
>>   ...
> 
> That looks good to me.
> 
>> but e.g. -fpatchable-function-entry=2,0 has two reasonable
>> approaches with bti:
>>
>> (a)
>>
>> func:
>> .Lpatch:
>>   bti c
>>   nop
>>   nop
>>   ...
>>
>> (b)
>>
>> func:
>>   bti c
>> .Lpatch:
>>   nop
>>   nop
>>   ...
> 
> I had assumed (b); that means that .Lpatch consistently points to the
> first NOP. To my mental model, that seems more consistent than (a).
> 
> However, I can work with either so long as it's consistent.
...
> As above, my weak preference is (b), but I can work with either. I just
> need the behaviour to be consistent.
> 
> Was there a rationale for LLVM choosing (a) rather than (b), e.g. was
> that also ease of implementation? If there isn't a rationale otherwise,
> and if LLVM could also do (b), that would be nice from my PoV.
> 
> How big is "a bit more changes" for GCC?

".Lpatch: nop; nop" is generated by generic code now which
a backend can override, but then i have to copy that logic
(~30 lines) to add something after the .Lpatch, or i have
to change generic code to have a new hook after the .Lpatch:.

to provide more argument for (b) and expand on consistency:

right now bti is not emitted if a function is not called
indirectly (compiler knows this for some local functions)
or bti can be turned off by a target attribute per function.

so a mix of bti and non-bti functions are possible, so
the user patching code has to deal with this (e.g. read
the instructions before patching etc)

in the M>0, M!=N case bti is emitted in the middle of a
patch area (e.g. 3,1 case above), but some functions will
not have bti. i can make it so that if patchable function
entry is requested then bti is always added, irrespective
of any other option or attribute, but that seems to be a
big hammer, without such big hammer solution users will
have a difficult time to deal with the additional bti.

if the patch area is entirely before the function label,
then no changes are needed (M=N), if patch area is after
the function label (M=0) then using fix (b) would allow
simple user code. in other cases user code will be
complicated no matter what. so if linux uses M=0 then i
think (b) is preferable.

hope this helps.


Re: [PATCH] doc: clarify the situation with pointer arithmetic

2020-01-21 Thread Alexander Monakov
On Mon, 20 Jan 2020, Joseph Myers wrote:

> On Mon, 20 Jan 2020, Alexander Monakov wrote:
> 
> > Hi,
> > 
> > we have this paragraph in the documentation that attempts to prohibit 
> > something that is allowed by the language.  Instead, I think we should 
> > say that this generally should work and explain that a problem in GCC 
> > implementation breaks this.
> 
> Is this based on the proposals to adopt a PNVI model (as described in 
> N2362 / N2363 / N2364) for C2x?  Do you have a more detailed analysis of 
> exactly which issues would need changes in GCC to follow the proposed 
> PNVI-ae-udi semantics?  Setting up a meta-bug in Bugzilla with 
> dependencies on such issues might be useful, for example - I know there 
> are existing bugs you've filed or commented on, but it would help to have 
> a list in a single place of issues for implementing PNVI-ae-udi.
> 
> It's not obvious that documentation relating to things that are 
> implementation-defined in existing C standard versions, where detailed 
> memory model questions were not defined, is an appropriate place to 
> discuss questions of how some optimizations might not follow a more 
> precise definition proposed for C2x.

My intent was more basic. I observed that the paragraph can be interpreted as
saying that if you have a cast 'I1 = (intptr_t) P1', then perform some
computations on I1 that do not in any way depend on values of other pointers,
then casting the result back can not point to a different object than P1.
But that is not very helpful, because this is the case where the user might have
used normal pointer arithmetic in the first place. The important cases
involve computations that depend on multiple pointers to unrelated objects.

I think that the case discussed in the proposed patch is already not ambiguous
and does not need further clarifications to the standard.

I also think we should provide the users with rationale when imposing such
restrictions, and, as Sandra noted, offer a way to work around the problem.

However I don't mind dropping the patch if it's not appropriate.  Richard,
can you share your opinion here?

Thanks.
Alexander


[patch] contrib: script to create a new vendor branch

2020-01-21 Thread Richard Earnshaw (lists)
This script is intended to create a new vendor branch.  Doing so is not 
completely obvious if
you are not familiar with the upstream structure, so this takes the pain 
out of getting it right.


It doesn't check out the branch locally, but does set everything up so 
that, if you have push enabled for your vendor branches, then


  git push vendors/ 

will work as expected.

Run the script as

contrib/git-add-vendor-branch.sh / 

the  space must have previously been set up in the way 
git-fetch-vendor.sh expects.


* git-add-vendor-bransh.sh: New file.


diff --git a/contrib/git-add-vendor-branch.sh b/contrib/git-add-vendor-branch.sh
new file mode 100755
index 000..04d39ed6b63
--- /dev/null
+++ b/contrib/git-add-vendor-branch.sh
@@ -0,0 +1,43 @@
+#! /bin/sh -xve
+
+# Create a new upstream vendor branch.
+
+# Usage:
+#  contrib/git-add-vendor-branch.sh / 
+
+usage ()
+{
+echo "Usage:"
+echo "  $0 / "
+echo
+echo " must have already been set up using contrib/git-fetch-vendor.sh"
+exit 1
+}
+
+if [ $# != 2 ]
+then
+usage
+fi
+
+vendor=$(echo "$1" | sed -r "s:([^/]*)/.*$:\1:")
+branch=$(echo "$1" | sed -r "s:[^/]*/(.*)$:\1:")
+start=$2
+
+if [ -z "$vendor" -o -z "$branch" ]
+then
+usage
+fi
+
+# Check that we know about the vendor
+url=$(git config --get "remote.vendors/${vendor}.url"||true)
+if [ -z "$url" ]
+then
+echo "Cannot locate remote data for vendor ${vendor}.  Have you set it up?"
+exit 1
+fi
+
+git branch --no-track ${vendor}/${branch} ${start}
+git push vendors/${vendor} ${vendor}/${branch}:refs/vendors/${vendor}/heads/${branch}
+git fetch -q vendors/${vendor}
+git branch --set-upstream-to=remotes/vendors/${vendor}/${branch} ${vendor}/$branch
+echo "Now ready to check out ${vendor}/${branch}"


Re: [patch] contrib: script to create a new vendor branch

2020-01-21 Thread Richard Earnshaw (lists)

On 21/01/2020 13:29, Richard Earnshaw (lists) wrote:
This script is intended to create a new vendor branch.  Doing so is not 
completely obvious if
you are not familiar with the upstream structure, so this takes the pain 
out of getting it right.


It doesn't check out the branch locally, but does set everything up so 
that, if you have push enabled for your vendor branches, then


   git push vendors/ 



Correction, the branch should be named /, so the push 
should be


git push vendors/ /

For example, for the ARM vendor, the push would be

git push vendors/ARM ARM/

R.


will work as expected.

Run the script as

contrib/git-add-vendor-branch.sh / 

the  space must have previously been set up in the way 
git-fetch-vendor.sh expects.


 * git-add-vendor-bransh.sh: New file.






[PATCH] ipa-profile.c: reset call_sums state within ipa-profile.c (v2; PR 93315)

2020-01-21 Thread David Malcolm
On Sat, 2020-01-18 at 18:42 +0100, Jan Hubicka wrote:
> > > > On Mon, 2020-01-13 at 11:23 +0800, luoxhu wrote:
> > > > On 2020/1/10 19:08, Jan Hubicka wrote:
> > > > > OK. You will need to do the obvious updates for Martin's
> > > > > patch
> > > > > which turned some member functions into static functions.
> > > > > 
> > > > > Honza
> > > > 
> > > > Thanks a lot!  Rebased & updated, will commit below patch
> > > > shortly
> > > > when git push is ready.
> > > > 
> > > > 
> > > > v8:
> > > >  1. Rebase to master with Martin's static function (r280043)
> > > > comments
> > > > merge.
> > > > Boostrap/testsuite/SPEC2017 tested pass on Power8-LE.
> > > >  2. TODO:
> > > > 2.1. C++ devirt for multiple speculative call targets.
> > > > 2.2. ipa-icf ipa_merge_profiles refine with COMDAT inline
> > > > testcase.
> > > 
> > > [...]
> > > 
> > > > diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
> > > 
> > > [...]
> > > 
> > > > +static ipa_profile_call_summaries *call_sums = NULL;
> > > 
> > > [...]
> > >  
> > > >  static void
> > > >  ipa_profile_generate_summary (void)
> > > > @@ -169,7 +261,10 @@ ipa_profile_generate_summary (void)
> > > >basic_block bb;
> > > >  
> > > >hash_table hashtable (10);
> > > > -  
> > > > +
> > > > +  gcc_checking_assert (!call_sums);
> > > > +  call_sums = new ipa_profile_call_summaries (symtab);
> > > > +
> > > 
> > > [...]
> > > 
> > > Unfortunately, this assertion is failing for most of the
> > > testcases in
> > > jit.dg, reducing the number of PASS results in jit.sum from 10473
> > > down
> > > to 3254 in my builds.
> > > 
> > > 
> > > Counter-intuitively, "jit" is not in --enable-languages=all (as
> > > it also
> > > needs --enable-host-shared at configure time, which slows down
> > > the
> > > built compiler code).
> > > 
> > > 
> > > The jit code expects to be able to invoke the compiler code more
> > > than
> > > once within the same process, purging all state.
> > > 
> > > It looks like this "call_sums" state needs deleting and resetting
> > > to
> > > NULL after the compiler has run (or else we'll likely get an ICE
> > > due to
> > > using old symtab/call summaries in subsequent in-process runs of
> > > the
> > > compiler).
> > > 
> > > Is there a natural place to do that within the IPA hooks?  
> > > 
> > > 
> > > Alternatively the following patch seems to fix things (not yet
> > > fully
> > > tested though); hopefully it seems sane.
> > 
> > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu;
> > fixes
> > the issue with jit.sum.
> > 
> > OK for master?
> > 
> > gcc/ChangeLog:
> > PR ipa/93315
> > * ipa-profile.c (ipa_profile_c_finalize): New function.
> > * toplev.c (toplev::finalize): Call it.
> > * toplev.h (ipa_profile_c_finalize): New decl.
> 
> Other similar summaries are freed at the end of execute method.  I
> think
> that we probably want to do the same for consistency as well.
> Advantage is that this releases memory prior late
> compilation/streaming.

Thanks; here's an updated, much simpler patch, which does it at the
end of ipa_profile (which is effectively the execute method).

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu;
fixes the issue with jit.sum:

Changes to jit.sum
--

  FAIL: 69->0 (-69)
  PASS: 3254->10471 (+7217)
  UNRESOLVED: 1->0 (-1)

OK for master?

> I think for all these summaries we have leak for -flto compilation
> where
> we do not call execute methods and thus we do not free the summaries.
> Is this problem for jit?
> 
> Honza

If we do, then, if I understand correctly, this would only affect
someone who tried to use libgccjit to generate .o files with -flto,
repeatedly, within a single process.  I don't know of anyone doing
that, and if that's broken, that would be a separate, pre-existing,
bug, I think.

Dave

gcc/ChangeLog:
PR ipa/93315
* ipa-profile.c (ipa_profile): Delete call_sums and set it to
NULL on exit.
---
 gcc/ipa-profile.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
index 03272f20987..a69ba0c373a 100644
--- a/gcc/ipa-profile.c
+++ b/gcc/ipa-profile.c
@@ -1023,6 +1023,9 @@ ipa_profile (void)
   if (dump_file && (dump_flags & TDF_DETAILS))
 symtab->dump (dump_file);
 
+  delete call_sums;
+  call_sums = NULL;
+
   return 0;
 }
 
-- 
2.21.0



Re: [PATCH][ARM] Correctly set SLOW_BYTE_ACCESS

2020-01-21 Thread Wilco Dijkstra
ping (updated comment to use the same wording as the AArch64 version on trunk)

Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
bitfields by their declared type, which results in better codegeneration
on practically any target.  So set it correctly to 1 on Arm.

As a result we generate much better code for bitfields:

typedef struct
{
  int x : 2, y : 8, z : 2;
} X;

int bitfield (X *p)
{
  return p->x + p->y + p->z;
}


Before:
ldrbr3, [r0]@ zero_extendqisi2
ldrhr2, [r0]
ldrbr0, [r0, #1]@ zero_extendqisi2
sbfxr3, r3, #0, #2
sbfxr2, r2, #2, #8
sbfxr0, r0, #2, #2
sxtab   r3, r2, r3
sxtab   r0, r3, r0
bx  lr

After:
ldr r0, [r0]
sbfxr3, r0, #0, #2
sbfxr2, r0, #2, #8
sbfxr0, r0, #10, #2
sxtab   r3, r2, r3
add r0, r0, r3
bx  lr

Bootstrap OK, OK for commit?

ChangeLog:
2019-09-11  Wilco Dijkstra  

* config/arm/arm.h (SLOW_BYTE_ACCESS): Set to 1.

--
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 
e07cf03538c5bb23e3285859b9e44a627b6e9ced..998139ce759d5829b7f868367d4263df9d0e12d9
 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1956,8 +1956,8 @@ enum arm_auto_incmodes
((arm_arch4 || (MODE) == QImode) ? ZERO_EXTEND  \
 : ((BYTES_BIG_ENDIAN && (MODE) == HImode) ? SIGN_EXTEND : UNKNOWN)))
 
-/* Nonzero if access to memory by bytes is slow and undesirable.  */
-#define SLOW_BYTE_ACCESS 0
+/* Enable wide bitfield accesses for more efficient bitfield code.  */
+#define SLOW_BYTE_ACCESS 1
 
 /* Immediate shift counts are truncated by the output routines (or was it
the assembler?).  Shift counts in a register are truncated by ARM.  Note



Re: [PATCH v2][ARM] Disable code hoisting with -O3 (PR80155)

2020-01-21 Thread Wilco Dijkstra
ping

Hi,

While code hoisting generally improves codesize, it can affect performance
negatively. Benchmarking shows it doesn't help SPEC and negatively affects
embedded benchmarks. Since the impact is relatively small with -O2 and mainly
affects -O3, the simplest option is to disable code hoisting for -O3 and higher.

OK for commit?

ChangeLog:
2019-11-26  Wilco Dijkstra  

PR tree-optimization/80155
* common/config/arm/arm-common.c (arm_option_optimization_table):
Disable -fcode-hoisting with -O3.
--

diff --git a/gcc/common/config/arm/arm-common.c 
b/gcc/common/config/arm/arm-common.c
index 
b761d3abd670a144a593c4b410b1e7fbdcb52f56..3e11f21b7dd76cc071b645c32a6fdb4a92511279
 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -39,6 +39,8 @@ static const struct default_options 
arm_option_optimization_table[] =
 /* Enable section anchors by default at -O1 or higher.  */
 { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
 { OPT_LEVELS_FAST, OPT_fsched_pressure, NULL, 1 },
+/* Disable code hoisting with -O3 or higher.  */
+{ OPT_LEVELS_3_PLUS, OPT_fcode_hoisting, NULL, 0 },
 { OPT_LEVELS_NONE, 0, NULL, 0 }
   };


Re: [PATCH Coroutines 1/2] Add error messages for missing methods of awaitable class

2020-01-21 Thread JunMa

在 2020/1/21 下午8:06, Nathan Sidwell 写道:

On 1/20/20 10:38 PM, JunMa wrote:

在 2020/1/21 上午9:31, JunMa 写道:

在 2020/1/20 下午11:49, Nathan Sidwell 写道:

On 1/20/20 12:18 AM, JunMa wrote:

Hi
This patch adds lookup_awaitable_member, it outputs error messages 
when any of
the await_ready/suspend/resume functions are missing in awaitable 
class.


This patch also add some error check on return value of 
build_co_await since we

may failed to build co_await_expr.

Bootstrap and test on X86_64, is it OK?

Regards
JunMa

gcc/cp
2020-01-20  Jun Ma 

 * coroutines.cc (lookup_awaitable_member): Lookup an 
awaitable member.
 (build_co_await): Use lookup_awaitable_member instead of 
lookup_member.
 (finish_co_yield_expr, finish_co_await_expr): Add error 
check on return

 value of build_co_await.

gcc/testsuite
2020-01-20  Jun Ma 

 * g++.dg/coroutines/coro1-missing-await-method.C: New test.



1) you're mixing two distinct changes, the one about the error 
message and the one about changing error_mark_node.



   tree op = build_co_await (kw, a, CO_AWAIT_SUSPEND_POINT);
-  TREE_SIDE_EFFECTS (op) = 1;
-  SET_EXPR_LOCATION (op, kw);
-
+  if (op != error_mark_node)
+    {
+  TREE_SIDE_EFFECTS (op) = 1;
+  SET_EXPR_LOCATION (op, kw);
+    }
   return op;
 }


these two fragments are ok, as a separate patch.



ok, I'll split it.
+/* Lookup an Awaitable member, which should be await_ready, 
await_suspend

+   or await_resume  */
+
+static tree
+lookup_awaitable_member (tree await_type, tree member_id, 
location_t loc)

+{
+  tree aw_memb
+    = lookup_member (await_type, member_id,
+ /*protect*/ 1, /*want_type*/ 0, tf_warning_or_error);
fwiw '/*foo=*/expr' is the canonical form -- i know Iain wasn't 
consistent, and it looks like I may have missed a few places in 
review.



ok.

+  if (aw_memb == NULL_TREE || aw_memb == error_mark_node)
+    {
+  error_at (loc, "no member named %qE in %qT", member_id, 
await_type);

+  return error_mark_node;
+    }
+  return aw_memb;
+}


This looks wrong.  lookup_member is being passed 
tf_warning_or_error, so it should already be emitting a diagnostic, 
for the cases where something is found, but is ambiguous or 
whatever.  So, I think you only want a diagnostic here when you get 
NULL_TREE back.


you are right, I follow with same code of lookup_promise_member, 
I'll update both.


Regards
JunMa

nathan


Hi nathan,
attachment is the updated patch which add error messages for lookup 
awaitable member.


thanks this is ok.  Could you remove the space in '/*protect=*/ 1' and 
similar?  This appear to be one place where our coding style has fewer 
spaces than expected!



ok, I'll update it.

Regards
JunMa

nathan





Re: [PATCH] doc: clarify the situation with pointer arithmetic

2020-01-21 Thread Richard Biener
On Tue, 21 Jan 2020, Alexander Monakov wrote:

> On Mon, 20 Jan 2020, Joseph Myers wrote:
> 
> > On Mon, 20 Jan 2020, Alexander Monakov wrote:
> > 
> > > Hi,
> > > 
> > > we have this paragraph in the documentation that attempts to prohibit 
> > > something that is allowed by the language.  Instead, I think we should 
> > > say that this generally should work and explain that a problem in GCC 
> > > implementation breaks this.
> > 
> > Is this based on the proposals to adopt a PNVI model (as described in 
> > N2362 / N2363 / N2364) for C2x?  Do you have a more detailed analysis of 
> > exactly which issues would need changes in GCC to follow the proposed 
> > PNVI-ae-udi semantics?  Setting up a meta-bug in Bugzilla with 
> > dependencies on such issues might be useful, for example - I know there 
> > are existing bugs you've filed or commented on, but it would help to have 
> > a list in a single place of issues for implementing PNVI-ae-udi.
> > 
> > It's not obvious that documentation relating to things that are 
> > implementation-defined in existing C standard versions, where detailed 
> > memory model questions were not defined, is an appropriate place to 
> > discuss questions of how some optimizations might not follow a more 
> > precise definition proposed for C2x.
> 
> My intent was more basic. I observed that the paragraph can be interpreted as
> saying that if you have a cast 'I1 = (intptr_t) P1', then perform some
> computations on I1 that do not in any way depend on values of other pointers,
> then casting the result back can not point to a different object than P1.
> But that is not very helpful, because this is the case where the user might 
> have
> used normal pointer arithmetic in the first place. The important cases
> involve computations that depend on multiple pointers to unrelated objects.
> 
> I think that the case discussed in the proposed patch is already not ambiguous
> and does not need further clarifications to the standard.
> 
> I also think we should provide the users with rationale when imposing such
> restrictions, and, as Sandra noted, offer a way to work around the problem.
> 
> However I don't mind dropping the patch if it's not appropriate.  Richard,
> can you share your opinion here?

Let me say a few things here.

First I always thought the only thing C guarantees is that you can
convert a pointer to an integer and back (the same value!) then you
get the same pointer if the integer type has sufficient size.  The
C standard nowhere specifies semantics of the case where you modify
the integer "representation" and then convert that to a pointer.
But that may have changed.

Second.  Fact is RTL does not distinguish between pointers and
integers and thus any attempt to make something valid when you
use integers and invalid when you use pointers is not going to work.

Third.  GCC tracks pointed to things through integers.  That means it
 a) assumes the integer representation cannot "leave" an object as
the current documentation states
 b) it handles "mixing" pointers to different objects in their
integer representation correctly as to that it assumes the
result converted to a pointer might point to either object
the tracking implementation details also track pieces and funneling
pointers through FP values.

Fourth.  That PNVI (I assume it's the whole pointer-provenance stuff)
wants to get the "best" of both which can never be done since a compiler
needs to have a way to be conservative - in this area it's conflicting
conservative treatment which is impossible.

Fifth.  GCC has issues involving equivalences and when (implicitely)
turning control into data dependences.  The C standard seems to
have conflicting (to GCC) wording with regard to equivalences
of pointers vs. integers (well, not actually sure).  For all these
issues I usually point at the second issue above because trying
to plug holes without addressing that very thing isn't going to work
(unless you want to plug the "hole" for integer code as well).

And yeah, a meta-bug or even better a single WIKI page listing
issues (and maybe referencing the various partially redundant and
partially mixing different issues bugzillas) would be helpful here.
To be the entry-point in bugzilla is PR49330, see comment#7 for
the fundamental RTL representation issue.

Thanks,
Richard.


[PATCH, v2] wwwdocs: e-mail subject lines for contributions

2020-01-21 Thread Richard Earnshaw (lists)

[updated, following some comments from Gerald, main differences are
 slight tweaks to the html markup and changing "email" to "e-mail"]

This patch proposes some new (additional) rules for email subject lines
when contributing to GCC.  The goal is to make sure that, as far as
possible, the subject for a patch will form a good summary when the
message is committed to the repository if applied with 'git am'.  Where
possible, I've tried to align these rules with those already in
use for glibc, so that the differences are minimal and only where
necessary.

Some things that differ from existing practice (at least by some people)
are:

- Use ':' rather than '[]'
  - This is more git friendly and works with 'git am'.
- Put bug numbers at the end of the line rather than the beginning.
  - The bug number is useful, but not as useful as the brief summary.
Also, use the shortened form, as the topic part is more usefully
conveyed in the proper topic field (see above).


diff --git a/htdocs/contribute.html b/htdocs/contribute.html
index 042ff069..861f7e5e 100644
--- a/htdocs/contribute.html
+++ b/htdocs/contribute.html
@@ -249,13 +249,98 @@ that ChangeLog entries may be included as part of the patch and diffs
 representing new files may be omitted, especially if large, since they
 can be accessed directly from the repository.) 
 
+E-mail subject lines
+
+Your contribution e-mail subject line will become the first line of
+the commit message for your patch.
+
+A high-quality e-mail subject line for a contribution contains the
+following elements:
+
+
+  A classifier
+  Component tags
+  An optional series identifier
+  A brief summary
+  An optional bug number
+
+
+Classifier
+
+The classifier identifies the type of contribution, for example a
+patch, an RFC (request for comments) or a committed patch (where
+approval is not necessary.  The classifier should be written in upper
+case and surrounded with square brackets.  This is the only component
+of the e-mail subject line that will not appear in the commit itself.
+The classifier may optionally contain a version number (vN) and
+a series marker (N/M).  Examples are:
+
+
+  [PATCH] - a single patch
+  [PATCH v2] - the second version of a single patch
+  [PATCH 3/7] - the third patch in a series of seven
+patches
+  [RFC] - a point of discussion, may contain a patch
+  [COMMITTED] - a patch that has already been committed.
+
+
+Component tags
+
+A component tag is a short identifier that identifies the part of
+the compiler being modified.  This highlights to the relevant
+maintainers that the patch may need their attention.  Multiple
+components may be listed if necessary.  Each component tag should be
+followed by a colon.  For example,
+
+
+  libstdc++:
+  combine:
+  vax: testsuite:
+
+
+Series identifier
+
+The series identifier is optional and is only relevant if a number of
+patches are needed in order to effect an overall change.  It should be
+a short string that identifies the series (it is common to all
+patches) and should be followed by a single dash surrounded by white
+space.
+
+Brief summary
+
+The brief summary encapsulates in a few words the intent of the
+change.  For example: cleanup check_field_decls.
+
+Bug number
+
+If your patch fixes a bug in the compiler for which there is an
+existing PR number the bug number should be stated.  Use the
+short-form variant (PRn) without the bugzilla component
+identifier.
+
+Other messages
+
+Some large patch sets benefit from an introductory e-mail that
+provides more context for the patch series and describes how the
+patches have been broken up to provide for review.  The convention is
+that such messages should follow the same format as described above,
+but the patch number should be set to zero, for example: [PATCH
+0/7].  Remember that the introductory message will not be
+committed with the patches themselves, so it should not contain any
+important information that is not also covered in the individual
+patches.  If you send a summary e-mail with a series it is a good idea
+to send the patches as follow-ups (essentially replies) to your
+initial message so that mail software can group the messages
+together.
+
 Pinging patches, Getting patches applied
 
 If you do not receive a response to a patch that you have submitted
 within two weeks or so, it may be a good idea to chase it by sending a
-follow-up email to the same list(s).  Patches can occasionally fall through
-the cracks.  Please be sure to include a brief summary of the patch and the
-URL of the entry in the mailing list archive of the original submission.
+follow-up e-mail to the same list(s).  Patches can occasionally fall
+through the cracks.  Please be sure to include a brief summary of the
+patch and the URL of the entry in the mailing list archive of the
+original submission.
 
 If you do not have write access and a patch of yours has been approved,
 but not committed, please advise the approver of that fact.  You may w

[PATCH] Remove dead variable.

2020-01-21 Thread Martin Liška

Hi.

The patch is obvious and approved by Segher.

Martin

gcc/ChangeLog:

2020-01-21  Martin Liska  

* config/rs6000/rs6000.c (common_mode_defined): Remove
unused variable.
---
 gcc/config/rs6000/rs6000.c | 3 ---
 1 file changed, 3 deletions(-)


diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 9405816fa5d..fc36bb6714b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -101,9 +101,6 @@
 /* Support targetm.vectorize.builtin_mask_for_load.  */
 GTY(()) tree altivec_builtin_mask_for_load;
 
-/* Set to nonzero once AIX common-mode calls have been defined.  */
-static GTY(()) int common_mode_defined;
-
 #ifdef USING_ELFOS_H
 /* Counter for labels which are to be placed in .fixup.  */
 int fixuplabelno = 0;



[PATCH] testsuite: Add target/xfail argument to check-function-bodies

2020-01-21 Thread Richard Sandiford
check-function-bodies allows individual function tests to be
annotated with target/xfail selectors, but sometimes it's
useful to have the same selector for all functions.

Tested on aarch64-linux-gnu and x86_64-linux-gnu, and with a following
ILP32 patch that needs it.  OK to install?

Richard


2020-01-21  Richard Sandiford  

gcc/
* doc/sourcebuild.texi (check-function-bodies): Add an
optional target/xfail selector.

gcc/testsuite/
* lib/scanasm.exp (check-function-bodies): Add an optional
target/xfail selector.
---
 gcc/doc/sourcebuild.texi  |  2 +-
 gcc/testsuite/lib/scanasm.exp | 18 ++
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index c18a630343e..1025802206b 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2723,7 +2723,7 @@ assembly output.
 Passes if @var{symbol} is not defined as a hidden symbol in the test's
 assembly output.
 
-@item check-function-bodies @var{prefix} @var{terminator} [@var{option}]
+@item check-function-bodies @var{prefix} @var{terminator} [@var{option} [@{ 
target/xfail @var{selector} @}]]
 Looks through the source file for comments that give the expected assembly
 output for selected functions.  Each line of expected output starts with the
 prefix string @var{prefix} and the expected output for a function as a whole
diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index ec91788edef..5ca58d40420 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -604,7 +604,7 @@ proc check_function_body { functions name body_regexp } {
 
 # Check the implementations of functions against expected output.  Used as:
 #
-# { dg-do { check-function-bodies PREFIX TERMINATOR[ OPTION] } }
+# { dg-do { check-function-bodies PREFIX TERMINATOR[ OPTION[ SELECTOR]] } }
 #
 # See sourcebuild.texi for details.
 
@@ -612,11 +612,11 @@ proc check-function-bodies { args } {
 if { [llength $args] < 2 } {
error "too few arguments to check-function-bodies"
 }
-if { [llength $args] > 3 } {
+if { [llength $args] > 4 } {
error "too many arguments to check-function-bodies"
 }
 
-if { [llength $args] == 3 } {
+if { [llength $args] >= 3 } {
set required_flag [lindex $args 2]
 
upvar 2 dg-extra-tool-flags extra_tool_flags
@@ -631,6 +631,16 @@ proc check-function-bodies { args } {
}
 }
 
+set xfail_all 0
+if { [llength $args] >= 4 } {
+   switch [dg-process-target [lindex $args 3]] {
+   "S" { }
+   "N" { return }
+   "F" { set xfail_all 1 }
+   "P" { }
+   }
+}
+
 set testcase [testname-for-summary]
 # The name might include a list of options; extract the file name.
 set filename [lindex $testcase 0]
@@ -694,7 +704,7 @@ proc check-function-bodies { args } {
}
} elseif { [string equal -length $terminator_len $line $terminator] } {
if { ![string equal $selector "N"] } {
-   if { [string equal $selector "F"] } {
+   if { $xfail_all || [string equal $selector "F"] } {
setup_xfail "*-*-*"
}
set testname "$testcase check-function-bodies $function_name"


[PATCH v2][AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI

2020-01-21 Thread Szabolcs Nagy
v2:
- emit bti based on feedback from Richard Sandiford
  (dont copy varasm logic).
- add testcases.
- kept bti outside the patch area if possible, i.e. option (b)
  in earlier discussion.

This fix does not update the documentation of the generic
option, I think some text would be useful there about patch
area layout with indirect branch hardening (but i wanted
to keep this fix target specific).

gcc/ChangeLog:

2020-01-21  Szabolcs Nagy  

* config/aarch64/aarch64.c (aarch64_declare_function_name): Set
cfun->machine->label_is_assembled.
(aarch64_print_patchable_function_entry): New.
(TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): Define.
* config/aarch64/aarch64.h (struct machine_function): New field,
label_is_assembled.

gcc/testsuite/ChangeLog:

2020-01-21  Szabolcs Nagy  

* gcc.target/aarch64/pr92424-1.c: New test.
* gcc.target/aarch64/pr92424-2.c: New test.
* gcc.target/aarch64/pr92424-3.c: New test.
From 65a60c0dc4318a33e0a0e0a6573084d84bd18a88 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy 
Date: Wed, 15 Jan 2020 12:23:40 +
Subject: [PATCH] [AArch64] PR92424: Fix -fpatchable-function-entry=N,M with
 BTI

This is a workaround that emits a BTI after the function label if that
is followed by a patch area. We try to remove the BTI that follows the
patch area (this may fail e.g. if the first instruction is a PACIASP).

So before this commit -fpatchable-function-entry=3,1 with bti generates

.section __patchable_function_entries
.8byte .LPFE
.text
  .LPFE:
nop
  foo:
nop
nop
bti c // or paciasp
...

and after this commit

.section __patchable_function_entries
.8byte .LPFE
.text
  .LPFE:
nop
  foo:
bti c
nop
nop
// may be paciasp
...

and with -fpatchable-function-entry=1 (M=0) the code now is

  foo:
bti c
.section __patchable_function_entries
.8byte .LPFE
.text
  .LPFE:
nop
// may be paciasp
...

There is a new bti insn in the middle of the patchable area users need
to be aware of unless M=0 (patch area is after the new bti) or M=N
(patch area is before the label, no new bti). Note: bti is not added to
all functions consistently (it can be turned off per function using a
target attribute or the compiler may detect that the function is never
called indirectly), so if bti is inserted in the middle of a patch area
then user code needs to deal with detecting it.

Tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

	* config/aarch64/aarch64.c (aarch64_declare_function_name): Set
	cfun->machine->label_is_assembled.
	(aarch64_print_patchable_function_entry): New.
	(TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): Define.
	* config/aarch64/aarch64.h (struct machine_function): New field,
	label_is_assembled.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/pr92424-1.c: New test.
	* gcc.target/aarch64/pr92424-2.c: New test.
	* gcc.target/aarch64/pr92424-3.c: New test.
---
 gcc/config/aarch64/aarch64.c |  31 +
 gcc/config/aarch64/aarch64.h |   1 +
 gcc/testsuite/gcc.target/aarch64/pr92424-1.c | 122 +++
 gcc/testsuite/gcc.target/aarch64/pr92424-2.c |  12 ++
 gcc/testsuite/gcc.target/aarch64/pr92424-3.c |  12 ++
 5 files changed, 178 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr92424-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr92424-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr92424-3.c

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e40750380cc..ef037e226a7 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -18123,6 +18123,34 @@ aarch64_declare_function_name (FILE *stream, const char* name,
   /* Don't forget the type directive for ELF.  */
   ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function");
   ASM_OUTPUT_LABEL (stream, name);
+
+  cfun->machine->label_is_assembled = true;
+}
+
+/* Implement PRINT_PATCHABLE_FUNCTION_ENTRY.  Check if the patch area is after
+   the function label and emit a BTI if necessary.  */
+
+void
+aarch64_print_patchable_function_entry (FILE *file,
+	unsigned HOST_WIDE_INT patch_area_size,
+	bool record_p)
+{
+  if (cfun->machine->label_is_assembled
+  && aarch64_bti_enabled ()
+  && !cgraph_node::get (cfun->decl)->only_called_directly_p ())
+{
+  /* Remove the BTI that follows the patch area and insert a new BTI
+	 before the patch area right after the function label.  */
+  rtx_insn *insn = next_real_nondebug_insn (get_insns ());
+  if (insn
+	  && INSN_P (insn)
+	  && GET_CODE (PATTERN (insn)) == UNSPEC_VOLATILE
+	  && XINT (PATTERN (insn), 1) == UNSPECV_BTI_C)
+	delete_insn (insn);
+  asm_fprintf (file, "\thint\t34 // bti c\n");
+}
+
+  default_print_patchable_function_entry (file, patch_area_size, record_p);
 }
 
 /* Implement ASM_OUTPUT_DEF_FROM_DECLS.  Output .variant_pcs for aliases.  */
@@ -21970,6 +21998,9 @@ aarch64

Re: [PATCH] doc: clarify the situation with pointer arithmetic

2020-01-21 Thread Alexander Monakov
On Tue, 21 Jan 2020, Richard Biener wrote:

> Fourth.  That PNVI (I assume it's the whole pointer-provenance stuff)
> wants to get the "best" of both which can never be done since a compiler
> needs to have a way to be conservative - in this area it's conflicting
> conservative treatment which is impossible.

This paragraph is unclear, I don't immediately see what the conflicting goals
are. The rest is clear enough given the previous discussions I saw.

Did you mean the restriction that you cannot do arithmetic involving two
integers based on pointers, get a value corresponding to one of them,
cast it back and get a pointer suitable for accessing either of two
originally pointed-to objects? I don't see that as a conflict because
it places a restriction on users, not the compiler.

Alexander


Re: [PATCH, v2] wwwdocs: e-mail subject lines for contributions

2020-01-21 Thread Jakub Jelinek
On Tue, Jan 21, 2020 at 02:52:00PM +, Richard Earnshaw (lists) wrote:
> [updated, following some comments from Gerald, main differences are
>  slight tweaks to the html markup and changing "email" to "e-mail"]
> 
> This patch proposes some new (additional) rules for email subject lines
> when contributing to GCC.  The goal is to make sure that, as far as
> possible, the subject for a patch will form a good summary when the
> message is committed to the repository if applied with 'git am'.  Where
> possible, I've tried to align these rules with those already in
> use for glibc, so that the differences are minimal and only where
> necessary.
> 
> Some things that differ from existing practice (at least by some people)
> are:
> 
> - Use ':' rather than '[]'
>   - This is more git friendly and works with 'git am'.
> - Put bug numbers at the end of the line rather than the beginning.
>   - The bug number is useful, but not as useful as the brief summary.
> Also, use the shortened form, as the topic part is more usefully
> conveyed in the proper topic field (see above).

Some examples would be useful I'd say, e.g. it is unclear in what way you
want the PR number to be appended, shall it be
something: whatever words describe it PR12345
or
something: whatever words describe it (PR12345)
or
something: whatever words describe it: PR12345
or
something: whatever words describe it [PR12345]
or something else?

Also, it would be nice to stress that the PR long form should be in the
ChangeLog and somewhere on the later lines of the commit message, I don't
think we pick up the shorter form from the first line when it short form (I
could be wrong, but e.g.
https://gcc.gnu.org/g:865257c447cc50f5819e9b53da83145f3c36c305
commit didn't make it into bugzilla).

Jakub



[PATCH] Prune invalid filename due to makefile syntax.

2020-01-21 Thread Martin Liška

Hi.

The patch strips '#' in filenames used for Makefile.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2020-01-21  Martin Liska  

PR driver/93057
* lto-wrapper.c (prune_filename_for_make): Prune
characters like '#'.
---
 gcc/lto-wrapper.c | 11 +++
 1 file changed, 11 insertions(+)


diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index fe8f292f877..f2504cc5b4f 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -1241,6 +1241,16 @@ jobserver_active_p (void)
 	  && is_valid_fd (wfd));
 }
 
+/* Prune invalid characters that can't be in name due to Makefile syntax.  */
+
+static void
+prune_filename_for_make (char *filename)
+{
+  for (unsigned i = 0; i < strlen (filename); i++)
+if (filename[i] == '#')
+  filename[i] = '_';
+}
+
 /* Execute gcc. ARGC is the number of arguments. ARGV contains the arguments. */
 
 static void
@@ -1780,6 +1790,7 @@ cont:
 	  obstack_grow (&env_obstack, input_name, strlen (input_name) - 2);
 	  obstack_grow (&env_obstack, ".ltrans.o", sizeof (".ltrans.o"));
 	  output_name = XOBFINISH (&env_obstack, char *);
+	  prune_filename_for_make (output_name);
 
 	  /* Adjust the dumpbase if the linker output file was seen.  */
 	  if (linker_output)



Re: [PATCH] Prune invalid filename due to makefile syntax.

2020-01-21 Thread Jan Hubicka
> Hi.
> 
> The patch strips '#' in filenames used for Makefile.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2020-01-21  Martin Liska  
> 
>   PR driver/93057
>   * lto-wrapper.c (prune_filename_for_make): Prune
>   characters like '#'.

I think this is not enough - you need to take into consideration all
special characters used by make and bash, such as $ and others...

Honza
> ---
>  gcc/lto-wrapper.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> 

> diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
> index fe8f292f877..f2504cc5b4f 100644
> --- a/gcc/lto-wrapper.c
> +++ b/gcc/lto-wrapper.c
> @@ -1241,6 +1241,16 @@ jobserver_active_p (void)
> && is_valid_fd (wfd));
>  }
>  
> +/* Prune invalid characters that can't be in name due to Makefile syntax.  */
> +
> +static void
> +prune_filename_for_make (char *filename)
> +{
> +  for (unsigned i = 0; i < strlen (filename); i++)
> +if (filename[i] == '#')
> +  filename[i] = '_';
> +}
> +
>  /* Execute gcc. ARGC is the number of arguments. ARGV contains the 
> arguments. */
>  
>  static void
> @@ -1780,6 +1790,7 @@ cont:
> obstack_grow (&env_obstack, input_name, strlen (input_name) - 2);
> obstack_grow (&env_obstack, ".ltrans.o", sizeof (".ltrans.o"));
> output_name = XOBFINISH (&env_obstack, char *);
> +   prune_filename_for_make (output_name);
>  
> /* Adjust the dumpbase if the linker output file was seen.  */
> if (linker_output)
> 



Re: [PATCH] ipa-profile.c: reset call_sums state within ipa-profile.c (v2; PR 93315)

2020-01-21 Thread Jan Hubicka
> 
> If we do, then, if I understand correctly, this would only affect
> someone who tried to use libgccjit to generate .o files with -flto,
> repeatedly, within a single process.  I don't know of anyone doing
> that, and if that's broken, that would be a separate, pre-existing,
> bug, I think.

Yes, i think we can play with that incrementally especially if someone
tries to use -flto with JIT setup (which by itself looks like bit of
overkill but perhaps things like offloading or so could make this
meaningful).

Honza
> 
> Dave
> 
> gcc/ChangeLog:
>   PR ipa/93315
>   * ipa-profile.c (ipa_profile): Delete call_sums and set it to
>   NULL on exit.
> ---
>  gcc/ipa-profile.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
> index 03272f20987..a69ba0c373a 100644
> --- a/gcc/ipa-profile.c
> +++ b/gcc/ipa-profile.c
> @@ -1023,6 +1023,9 @@ ipa_profile (void)
>if (dump_file && (dump_flags & TDF_DETAILS))
>  symtab->dump (dump_file);
>  
> +  delete call_sums;
> +  call_sums = NULL;
> +
>return 0;
>  }
>  
> -- 
> 2.21.0
> 


Re: [PATCH] Prune invalid filename due to makefile syntax.

2020-01-21 Thread Andreas Schwab
On Jan 21 2020, Martin Liška wrote:

> diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
> index fe8f292f877..f2504cc5b4f 100644
> --- a/gcc/lto-wrapper.c
> +++ b/gcc/lto-wrapper.c
> @@ -1241,6 +1241,16 @@ jobserver_active_p (void)
> && is_valid_fd (wfd));
>  }
>  
> +/* Prune invalid characters that can't be in name due to Makefile syntax.  */
> +
> +static void
> +prune_filename_for_make (char *filename)
> +{
> +  for (unsigned i = 0; i < strlen (filename); i++)

That's quadratic runtime.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] Prune invalid filename due to makefile syntax.

2020-01-21 Thread Martin Liška

On 1/21/20 4:15 PM, Andreas Schwab wrote:

On Jan 21 2020, Martin Liška wrote:


diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index fe8f292f877..f2504cc5b4f 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -1241,6 +1241,16 @@ jobserver_active_p (void)
  && is_valid_fd (wfd));
  }
  
+/* Prune invalid characters that can't be in name due to Makefile syntax.  */

+
+static void
+prune_filename_for_make (char *filename)
+{
+  for (unsigned i = 0; i < strlen (filename); i++)


That's quadratic runtime.

Andreas.



Thanks for heads up.

Martin
>From 88b3817dc377ecf030300078514eb5749c361406 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 21 Jan 2020 16:03:36 +0100
Subject: [PATCH] Prune invalid filename due to makefile syntax.

gcc/ChangeLog:

2020-01-21  Martin Liska  

	PR driver/93057
	* lto-wrapper.c (prune_filename_for_make): Prune
	characters like '#'.
---
 gcc/lto-wrapper.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index fe8f292f877..81d0fdb2410 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -1241,6 +1241,17 @@ jobserver_active_p (void)
 	  && is_valid_fd (wfd));
 }
 
+/* Prune invalid characters that can't be in name due to Makefile syntax.  */
+
+static void
+prune_filename_for_make (char *filename)
+{
+  unsigned length = strlen (filename);
+  for (unsigned i = 0; i < length; i++)
+if (filename[i] == '#')
+  filename[i] = '_';
+}
+
 /* Execute gcc. ARGC is the number of arguments. ARGV contains the arguments. */
 
 static void
@@ -1780,6 +1791,7 @@ cont:
 	  obstack_grow (&env_obstack, input_name, strlen (input_name) - 2);
 	  obstack_grow (&env_obstack, ".ltrans.o", sizeof (".ltrans.o"));
 	  output_name = XOBFINISH (&env_obstack, char *);
+	  prune_filename_for_make (output_name);
 
 	  /* Adjust the dumpbase if the linker output file was seen.  */
 	  if (linker_output)
-- 
2.24.1



[PATCH] PR c++/91476 - anon-namespace reference temp clash between TUs.

2020-01-21 Thread Jason Merrill
The problem in the PR was that make_temporary_var_for_ref_to_temp ran before
determine_visibility, so when we copied the linkage of the reference
variable it had not yet been restricted by its anonymous namespace context,
so the temporary wrongly ended up with TREE_PUBLIC set.  The natural
solution is to run determine_visibility earlier.  But that needs to happen
after maybe_commonize_var increases the linkage of some local variables, and
on targets without weak symbol support, that function does different things
based on the results of check_initializer, which is what calls
make_temporary_var_for_ref_to_temp.  To break this circular dependency I'm
calling maybe_commonize_var early, and then again later if the target
doesn't support weak symbols.

It also occurred to me that make_temporary_var_for_ref_to_temp wasn't
handling DECL_VISIBILITY at all, and verified that we were doing the wrong
thing.  So I've combined the linkage-copying code from there and two other
places.

Tested x86_64-pc-linux-gnu, applying to trunk.

* decl2.c (copy_linkage): Factor out of get_guard.
* call.c (make_temporary_var_for_ref_to_temp): Use it.
* decl.c (cp_finish_decomp): Use it.
(cp_finish_decl): determine_visibility sooner.
---
 gcc/cp/cp-tree.h  |  1 +
 gcc/cp/call.c |  9 +
 gcc/cp/decl.c | 33 +++--
 gcc/cp/decl2.c| 36 ---
 .../g++.dg/ext/visibility/ref-temp1.C | 17 +
 5 files changed, 55 insertions(+), 41 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/visibility/ref-temp1.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 890d5a27350..77bcf046608 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6636,6 +6636,7 @@ extern bool mark_used (tree);
 extern bool mark_used  (tree, tsubst_flags_t);
 extern void finish_static_data_member_decl (tree, tree, bool, tree, int);
 extern tree cp_build_parm_decl (tree, tree, tree);
+extern void copy_linkage   (tree, tree);
 extern tree get_guard  (tree);
 extern tree get_guard_cond (tree, bool);
 extern tree set_guard  (tree);
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index d47747117b9..aacd961faa1 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -11973,14 +11973,7 @@ make_temporary_var_for_ref_to_temp (tree decl, tree 
type)
 GR and suffixed with a sequence number mangled using the usual rules
 for a seq-id. Temporaries are numbered with a pre-order, depth-first,
 left-to-right walk of the complete initializer.  */
-
-  TREE_STATIC (var) = TREE_STATIC (decl);
-  TREE_PUBLIC (var) = TREE_PUBLIC (decl);
-  if (vague_linkage_p (decl))
-   comdat_linkage (var);
-
-  CP_DECL_THREAD_LOCAL_P (var) = CP_DECL_THREAD_LOCAL_P (decl);
-  set_decl_tls_model (var, DECL_TLS_MODEL (decl));
+  copy_linkage (var, decl);
 
   tree name = mangle_ref_init_variable (decl);
   DECL_NAME (var) = name;
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 28a79029d92..47ff7eea88f 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -7657,6 +7657,14 @@ cp_finish_decl (tree decl, tree init, bool 
init_const_expr_p,
   TREE_READONLY (decl) = 0;
 }
 
+  /* This needs to happen before extend_ref_init_temps.  */
+  if (VAR_OR_FUNCTION_DECL_P (decl))
+{
+  if (VAR_P (decl))
+   maybe_commonize_var (decl);
+  determine_visibility (decl);
+}
+
   if (VAR_P (decl))
 {
   duration_kind dk = decl_storage_duration (decl);
@@ -7786,12 +7794,11 @@ cp_finish_decl (tree decl, tree init, bool 
init_const_expr_p,
   if (VAR_P (decl))
{
  layout_var_decl (decl);
- maybe_commonize_var (decl);
+ if (!flag_weak)
+   /* Check again now that we have an initializer.  */
+   maybe_commonize_var (decl);
}
 
-  /* This needs to happen after the linkage is set. */
-  determine_visibility (decl);
-
   if (var_definition_p && TREE_STATIC (decl))
{
  /* If a TREE_READONLY variable needs initialization
@@ -8328,23 +8335,7 @@ cp_finish_decomp (tree decl, tree first, unsigned int 
count)
}
  if (!processing_template_decl)
{
- TREE_PUBLIC (v[i]) = TREE_PUBLIC (decl);
- TREE_STATIC (v[i]) = TREE_STATIC (decl);
- DECL_COMMON (v[i]) = DECL_COMMON (decl);
- DECL_COMDAT (v[i]) = DECL_COMDAT (decl);
- if (TREE_STATIC (v[i]))
-   {
- CP_DECL_THREAD_LOCAL_P (v[i])
-   = CP_DECL_THREAD_LOCAL_P (decl);
- set_decl_tls_model (v[i], DECL_TLS_MODEL (decl));
- if (DECL_ONE_ONLY (decl))
-   make_decl_one_only (v[i], cxx_comdat_gr

Re: [PATCH] PR c++/91476 - anon-namespace reference temp clash between TUs.

2020-01-21 Thread Jason Merrill

On 1/21/20 10:17 AM, Jason Merrill wrote:

The problem in the PR was that make_temporary_var_for_ref_to_temp ran before
determine_visibility, so when we copied the linkage of the reference
variable it had not yet been restricted by its anonymous namespace context,
so the temporary wrongly ended up with TREE_PUBLIC set.  The natural
solution is to run determine_visibility earlier.  But that needs to happen
after maybe_commonize_var increases the linkage of some local variables, and
on targets without weak symbol support, that function does different things
based on the results of check_initializer, which is what calls
make_temporary_var_for_ref_to_temp.  To break this circular dependency I'm
calling maybe_commonize_var early, and then again later if the target
doesn't support weak symbols.

It also occurred to me that make_temporary_var_for_ref_to_temp wasn't
handling DECL_VISIBILITY at all, and verified that we were doing the wrong
thing.  So I've combined the linkage-copying code from there and two other
places.

Tested x86_64-pc-linux-gnu, applying to trunk.

* decl2.c (copy_linkage): Factor out of get_guard.
* call.c (make_temporary_var_for_ref_to_temp): Use it.
* decl.c (cp_finish_decomp): Use it.
(cp_finish_decl): determine_visibility sooner.


And a much simpler patch for GCC 9, that doesn't address visibility:
commit 1669a68c844ec91dff4e9054327f2f5234e1614e
Author: Jason Merrill 
Date:   Mon Jan 20 14:09:03 2020 -0500

PR c++/91476 - anon-namespace reference temp clash between TUs.

* call.c (make_temporary_var_for_ref_to_temp): Clear TREE_PUBLIC
if DECL is in the anonymous namespace.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index bc182e2f6da..8e14e89d5d4 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -11424,6 +11424,8 @@ make_temporary_var_for_ref_to_temp (tree decl, tree type)
 
   TREE_STATIC (var) = TREE_STATIC (decl);
   TREE_PUBLIC (var) = TREE_PUBLIC (decl);
+  if (decl_anon_ns_mem_p (decl))
+	TREE_PUBLIC (var) = 0;
   if (vague_linkage_p (decl))
 	comdat_linkage (var);
 
diff --git a/gcc/testsuite/g++.dg/ext/visibility/ref-temp1.C b/gcc/testsuite/g++.dg/ext/visibility/ref-temp1.C
new file mode 100644
index 000..b56bb52a73c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/visibility/ref-temp1.C
@@ -0,0 +1,11 @@
+// PR c++/91476
+// Test that hidden and internal visibility propagates to reference temps.
+
+// { dg-final { scan-assembler-not "(weak|globl)\[^\n\]*_ZGRN12_GLOBAL__N_13fooE_" } }
+namespace { const int &foo = 1; }
+
+const void *volatile p;
+int main()
+{
+  p = &foo;
+}


Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-21 Thread Jan Hubicka
> Ok, after personal discussion with Honza that I had,
> we should be more conservative and prune only
> a run-time TOP N counters before we merge them.
> The patch does that. Can you please Honza test me the patch of Firefox?
> 
> Thanks,
> Martin

For a record, we ran tests of Firefox. It does not fix the problem with
the particular counter I referred to in the PR, but it does improve the
stats Martin collected:

> 
> $ ./stats.py before
> == Stats for before ==
> stats for indirect_call:
>   total: 160441
>   invalid: 930
>   tracked values:
> 0 values:   134841 times (84.04%)
> 1 values:20085 times (12.52%)
> 2 values: 3098 times (1.93%)
> 3 values:  956 times (0.60%)
> 4 values:  531 times (0.33%)
> 
> stats for topn:
>   total: 167061
>   invalid: 999
>   tracked values:
> 0 values:   141082 times (84.45%)
> 1 values:20280 times (12.14%)
> 2 values: 3174 times (1.90%)
> 3 values:  974 times (0.58%)
> 4 values:  552 times (0.33%)
> 
> $ ./stats.py after
> == Stats for after ==
> stats for indirect_call:
>   total: 160441
>   invalid: 309
>   tracked values:
> 0 values:   134945 times (84.11%)
> 1 values:21129 times (13.17%)
> 2 values: 2950 times (1.84%)
> 3 values:  837 times (0.52%)
> 4 values:  271 times (0.17%)
> 
> stats for topn:
>   total: 167061
>   invalid: 326
>   tracked values:
> 0 values:   141207 times (84.52%)
> 1 values:21356 times (12.78%)
> 2 values: 3033 times (1.82%)
> 3 values:  857 times (0.51%)
> 4 values:  282 times (0.17%)
> 
> $ == Stats for hack ==
> stats for indirect_call:
>   total: 160441
>   invalid: 0
>   tracked values:
> 0 values:   134852 times (84.05%)
> 1 values:20089 times (12.52%)
> 2 values: 3090 times (1.93%)
> 3 values:  954 times (0.59%)
> 4 values: 1456 times (0.91%)
> 
> stats for topn:
>   total: 167061
>   invalid: 0
>   tracked values:
> 0 values:   141093 times (84.46%)
> 1 values:20285 times (12.14%)
> 2 values: 3164 times (1.89%)
> 3 values:  972 times (0.58%)
> 4 values: 1547 times (0.93%)
> 

Here 
- before is current mainline,
- after is with patch applied
- hack is w/o reproducible profile merging (i.e. what GCC 9 and ealrier
  does)

So the patch reduces number of discarded values to 1/3 which is clear
improvement.  Incrementally I think we could consider command line
option specifying degree of reproducibility we expect from profile (i.e.
no reproducibility, reproducibility for single threaded train run,
reproducibility for threaded train run).

If distro build takes seriously FDO and reproducibility it will
eventually run into the limitation for single-threaded programs and we
will need to do reproducible runs for multithread apps which means
putting indirect call profiles into thread local storage I think, and
giving up on time profiling.

I think non-reproducible merging could be implemneted easier atop of
current infrastructure by simply adding
HIST_TYPE_INDIR_CALL_REPRODUCIBLE, HIST_TYPE_INDR_CALL_NONRERPODUCIBGL
and HIST_TYPE_TOPN_VALUE_REPRODUCIBLE,
HIST_TYPE_TOPN_VALUE_NONREPRODUCIBLE and use appropriate merging
function (i.e. one which invalidates full counter versus one whic does
not)

I wonder if we want to teach one of our autotesters to verify that
profiledbootstrap is reproducible? I guess we could-re-use our exisitng
bootstrap comparsion, we just need to produce two sets of train data and
then do the comparsion.
> Martin

> From 7fe1e6a59139ae00cefd1f5edf082d428952203e Mon Sep 17 00:00:00 2001
> From: Martin Liska 
> Date: Wed, 15 Jan 2020 15:39:55 +0100
> Subject: [PATCH] Smart relaxation of TOP N counter.
> 
> ---
>  gcc/profile.c |  9 +-
>  libgcc/libgcov-driver.c   | 58 ++-
>  libgcc/libgcov-profiler.c | 27 --
>  3 files changed, 77 insertions(+), 17 deletions(-)
> 
> diff --git a/gcc/profile.c b/gcc/profile.c
> index 6a2de21c3bd..1a6caf75a61 100644
> --- a/gcc/profile.c
> +++ b/gcc/profile.c
> @@ -863,7 +863,14 @@ compute_value_histograms (histogram_values values, 
> unsigned cfg_checksum,
>  
>if (hist->type == HIST_TYPE_TOPN_VALUES
> || hist->type == HIST_TYPE_INDIR_CALL)
> - sort_hist_values (hist);
> + {
> +   /* Each count value is multiplied by GCOV_TOPN_VALUES.  */
> +   if (hist->hvalue.counters[2] != -1)
> + for (unsigned i = 0; i < GCOV_TOPN_VALUES; i++)
> +   hist->hvalue.counters[2 * i + 2] /= GCOV_TOPN_VALUES;

I would use RDIV for consistency with rest of profile code scaling.

> +
> +   sort_hist_values (hist);
> + }
>  
>/* Time profiler counter is not related to any statement,
>   so that we have to read the counter and set the value to
> diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c
> index 1c07761b7f1..b71b0e3edac 100644
> --- a/libgcc/libg

Re: [PATCH] ipa-profile.c: reset call_sums state within ipa-profile.c (v2; PR 93315)

2020-01-21 Thread David Malcolm
On Tue, 2020-01-21 at 16:10 +0100, Jan Hubicka wrote:
> > If we do, then, if I understand correctly, this would only affect
> > someone who tried to use libgccjit to generate .o files with -flto,
> > repeatedly, within a single process.  I don't know of anyone doing
> > that, and if that's broken, that would be a separate, pre-existing,
> > bug, I think.
> 
> Yes, i think we can play with that incrementally especially if
> someone
> tries to use -flto with JIT setup (which by itself looks like bit of
> overkill but perhaps things like offloading or so could make this
> meaningful).
> 
> Honza
> > Dave

Thanks.  Is the patch OK?

> > gcc/ChangeLog:
> > PR ipa/93315
> > * ipa-profile.c (ipa_profile): Delete call_sums and set it to
> > NULL on exit.
> > ---
> >  gcc/ipa-profile.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
> > index 03272f20987..a69ba0c373a 100644
> > --- a/gcc/ipa-profile.c
> > +++ b/gcc/ipa-profile.c
> > @@ -1023,6 +1023,9 @@ ipa_profile (void)
> >if (dump_file && (dump_flags & TDF_DETAILS))
> >  symtab->dump (dump_file);
> >  
> > +  delete call_sums;
> > +  call_sums = NULL;
> > +
> >return 0;
> >  }
> >  
> > -- 
> > 2.21.0
> > 



Re: [PATCH, v2] wwwdocs: e-mail subject lines for contributions

2020-01-21 Thread Richard Earnshaw (lists)

On 21/01/2020 15:04, Jakub Jelinek wrote:

On Tue, Jan 21, 2020 at 02:52:00PM +, Richard Earnshaw (lists) wrote:

[updated, following some comments from Gerald, main differences are
  slight tweaks to the html markup and changing "email" to "e-mail"]

This patch proposes some new (additional) rules for email subject lines
when contributing to GCC.  The goal is to make sure that, as far as
possible, the subject for a patch will form a good summary when the
message is committed to the repository if applied with 'git am'.  Where
possible, I've tried to align these rules with those already in
use for glibc, so that the differences are minimal and only where
necessary.

Some things that differ from existing practice (at least by some people)
are:

- Use ':' rather than '[]'
   - This is more git friendly and works with 'git am'.
- Put bug numbers at the end of the line rather than the beginning.
   - The bug number is useful, but not as useful as the brief summary.
 Also, use the shortened form, as the topic part is more usefully
 conveyed in the proper topic field (see above).


Some examples would be useful I'd say, e.g. it is unclear in what way you
want the PR number to be appended, shall it be
something: whatever words describe it PR12345
or
something: whatever words describe it (PR12345)
or
something: whatever words describe it: PR12345
or
something: whatever words describe it [PR12345]
or something else?


Glibc use "[BZ #]" - obviously BZ becomes PR, but after that, I'm 
not too worried.  I'd be happy with [PR #], but if folk want 
something else, please say so quickly...


I'll prepare an edit for that...



Also, it would be nice to stress that the PR long form should be in the
ChangeLog and somewhere on the later lines of the commit message, I don't
think we pick up the shorter form from the first line when it short form (I
could be wrong, but e.g.
https://gcc.gnu.org/g:865257c447cc50f5819e9b53da83145f3c36c305
commit didn't make it into bugzilla).



Yes, good point.

Thanks.

R.


Re: [PATCH] Prune invalid filename due to makefile syntax.

2020-01-21 Thread Martin Liška

On 1/21/20 4:08 PM, Jan Hubicka wrote:

I think this is not enough - you need to take into consideration all
special characters used by make and bash, such as $ and others...


Hm, you are right. Do you have a reasonable list which we should support?
Or should we leave this known limitation?

Martin


Re: [PATCH] ipa-profile.c: reset call_sums state within ipa-profile.c (v2; PR 93315)

2020-01-21 Thread Jan Hubicka
> On Tue, 2020-01-21 at 16:10 +0100, Jan Hubicka wrote:
> > > If we do, then, if I understand correctly, this would only affect
> > > someone who tried to use libgccjit to generate .o files with -flto,
> > > repeatedly, within a single process.  I don't know of anyone doing
> > > that, and if that's broken, that would be a separate, pre-existing,
> > > bug, I think.
> > 
> > Yes, i think we can play with that incrementally especially if
> > someone
> > tries to use -flto with JIT setup (which by itself looks like bit of
> > overkill but perhaps things like offloading or so could make this
> > meaningful).
> > 
> > Honza
> > > Dave
> 
> Thanks.  Is the patch OK?
Yes, i meant to apporve it :)
Thanks for looking into this.

Honza
> 
> > > gcc/ChangeLog:
> > >   PR ipa/93315
> > >   * ipa-profile.c (ipa_profile): Delete call_sums and set it to
> > >   NULL on exit.
> > > ---
> > >  gcc/ipa-profile.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
> > > index 03272f20987..a69ba0c373a 100644
> > > --- a/gcc/ipa-profile.c
> > > +++ b/gcc/ipa-profile.c
> > > @@ -1023,6 +1023,9 @@ ipa_profile (void)
> > >if (dump_file && (dump_flags & TDF_DETAILS))
> > >  symtab->dump (dump_file);
> > >  
> > > +  delete call_sums;
> > > +  call_sums = NULL;
> > > +
> > >return 0;
> > >  }
> > >  
> > > -- 
> > > 2.21.0
> > > 
> 


Re: [PATCH v2][AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI

2020-01-21 Thread Richard Sandiford
Szabolcs Nagy  writes:
> v2:
> - emit bti based on feedback from Richard Sandiford
>   (dont copy varasm logic).
> - add testcases.
> - kept bti outside the patch area if possible, i.e. option (b)
>   in earlier discussion.
>
> This fix does not update the documentation of the generic
> option, I think some text would be useful there about patch
> area layout with indirect branch hardening (but i wanted
> to keep this fix target specific).

Thanks for the update.  Looks great to me, and given Mark's response,
I agree we should go ahead with this as-is rather than try to change
the position of the BTI.

> gcc/ChangeLog:
>
> 2020-01-21  Szabolcs Nagy  
>
>   * config/aarch64/aarch64.c (aarch64_declare_function_name): Set
>   cfun->machine->label_is_assembled.
>   (aarch64_print_patchable_function_entry): New.
>   (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): Define.
>   * config/aarch64/aarch64.h (struct machine_function): New field,
>   label_is_assembled.
>
> gcc/testsuite/ChangeLog:
>
> 2020-01-21  Szabolcs Nagy  
>
>   * gcc.target/aarch64/pr92424-1.c: New test.
>   * gcc.target/aarch64/pr92424-2.c: New test.
>   * gcc.target/aarch64/pr92424-3.c: New test.

OK.  Same in principle for the backport too, but check-function-bodies is
only available on master.

Richard


Re: [PATCH] Prune invalid filename due to makefile syntax.

2020-01-21 Thread Jan Hubicka
> On 1/21/20 4:08 PM, Jan Hubicka wrote:
> > I think this is not enough - you need to take into consideration all
> > special characters used by make and bash, such as $ and others...
> 
> Hm, you are right. Do you have a reasonable list which we should support?
No - I guess one needs to dig into manual, try them out or just assume
that all letters in filenams except for a-z, 0-9, _, -, ., ',' and
perhaps more common caracters needs to be taken out?  It is used only to
make more meaningful temp filenames, right?
> Or should we leave this known limitation?
I would say we want to fix that... Even GCC gets idea to put # info
filenames and I am sure except for HHVM there are others.

Honza
> 
> Martin


Fix updating of call_stmt_site_hash

2020-01-21 Thread Jan Hubicka
Hi,
this patch fixes ICE causes by call stmt site hash going out of sync.  For
speculative edges it is assumed to contain a direct call so if we are
removing it hashtable needs to be updated.  I realize that the code is ugly
but I will leave cleanup for next stage1.

Bootstrapped/regtested x86_64-linux. This patch makes it possible to build
Firefox again.

PR lto/93318
* cgraph.c (cgraph_edge::resolve_speculation,
cgraph_edge::redirect_call_stmt_to_callee): Fix update of
call_stmt_site_hash.
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 187f6ed30ba..f7ebcc917d1 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1248,7 +1248,22 @@ cgraph_edge::resolve_speculation (cgraph_edge *edge, 
tree callee_decl)
   else
 e2->callee->remove_symbol_and_inline_clones ();
   if (edge->caller->call_site_hash)
-cgraph_update_edge_in_call_site_hash (edge);
+{
+  /* We always maintain direct edge in the call site hash, if one
+exists.  */
+  if (!edge->num_speculative_call_targets_p ())
+   cgraph_update_edge_in_call_site_hash (edge);
+  else
+   {
+ cgraph_edge *e;
+ for (e = edge->caller->callees;
+  e->call_stmt != edge->call_stmt
+  || e->lto_stmt_uid != edge->lto_stmt_uid;
+  e = e->next_callee)
+   ;
+ cgraph_update_edge_in_call_site_hash (e);
+   }
+}
   return edge;
 }
 
@@ -1414,7 +1429,20 @@ cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge 
*e)
  /* Indirect edges are not both in the call site hash.
 get it updated.  */
  if (e->caller->call_site_hash)
-   cgraph_update_edge_in_call_site_hash (e2);
+   {
+ if (!e2->num_speculative_call_targets_p ())
+   cgraph_update_edge_in_call_site_hash (e2);
+ else
+   {
+ cgraph_edge *e;
+ for (e = e2->caller->callees;
+  e->call_stmt != e2->call_stmt
+  || e->lto_stmt_uid != e2->lto_stmt_uid;
+  e = e->next_callee)
+   ;
+ cgraph_update_edge_in_call_site_hash (e);
+   }
+   }
  pop_cfun ();
  /* Continue redirecting E to proper target.  */
}


Re: [PATCH, v2] wwwdocs: e-mail subject lines for contributions

2020-01-21 Thread Jakub Jelinek
On Tue, Jan 21, 2020 at 03:33:22PM +, Richard Earnshaw (lists) wrote:
> > Some examples would be useful I'd say, e.g. it is unclear in what way you
> > want the PR number to be appended, shall it be
> > something: whatever words describe it PR12345
> > or
> > something: whatever words describe it (PR12345)
> > or
> > something: whatever words describe it: PR12345
> > or
> > something: whatever words describe it [PR12345]
> > or something else?
> 
> Glibc use "[BZ #]" - obviously BZ becomes PR, but after that, I'm not
> too worried.  I'd be happy with [PR #], but if folk want something else,
> please say so quickly...

[PR 12345] or [PR #12345] is bad, because the bugzilla won't underline it,
it needs to be either PR12345 word, or PR component/12345 .

Jakub



Re: [PATCH, v2] wwwdocs: e-mail subject lines for contributions

2020-01-21 Thread Richard Earnshaw (lists)

On 21/01/2020 15:39, Jakub Jelinek wrote:

On Tue, Jan 21, 2020 at 03:33:22PM +, Richard Earnshaw (lists) wrote:

Some examples would be useful I'd say, e.g. it is unclear in what way you
want the PR number to be appended, shall it be
something: whatever words describe it PR12345
or
something: whatever words describe it (PR12345)
or
something: whatever words describe it: PR12345
or
something: whatever words describe it [PR12345]
or something else?


Glibc use "[BZ #]" - obviously BZ becomes PR, but after that, I'm not
too worried.  I'd be happy with [PR #], but if folk want something else,
please say so quickly...


[PR 12345] or [PR #12345] is bad, because the bugzilla won't underline it,
it needs to be either PR12345 word, or PR component/12345 .

Jakub



ok, lets go with [PR] then.

R.


[PATCH] remove bogus asserts in expr.c

2020-01-21 Thread stefan
If running "make check-gcc" with -mcpu=m68040 you get an internal compiler
error during the combine pass in expr.c since there are checks at lines
4107/4108 which seem to be not reasonable.

It combines 

(insn 8 5 9 2 (parallel [
(set (reg:SI 41)
(udiv:SI (reg/v:SI 38 [ x ])
(reg/v:SI 39 [ y ])))
(set (reg:SI 42)
(umod:SI (reg/v:SI 38 [ x ])
(reg/v:SI 39 [ y ])))
]) "../gcc/gcc/testsuite/gcc.c-torture/execute/20021120-3.c":14:3
493 {*divu}
 (expr_list:REG_UNUSED (reg:SI 42)
(nil)))
(insn 9 8 10 2 (set (mem:SI (pre_dec:SI (reg/f:SI 15 %sp)) [2  S4 A16])
(reg:SI 41))
"../gcc/gcc/testsuite/gcc.c-torture/execute/20021120-3.c":14:3 106
{*movsi_m68k2}
 (expr_list:REG_DEAD (reg:SI 41)
(expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
(nil


The combine pass creates a parallel insn with auto inc, which is resolved
during reload to valid insns again.

This patch removes these checks, since they aren't helpful.


Stefan


--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -1,5 +1,5 @@
 /* Convert tree expression to rtl instructions, for GNU compiler.
-   Copyright (C) 1988-2019 Free Software Foundation, Inc.
+   Copyright (C) 1988-2020 Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -4103,10 +4103,6 @@ find_args_size_adjust (rtx_insn *insn)
  if (dest == stack_pointer_rtx)
break;
 
- /* We do not expect an auto-inc of the sp in the parallel.  */
- gcc_checking_assert (mem_autoinc_base (dest) !=
stack_pointer_rtx);
- gcc_checking_assert (mem_autoinc_base (SET_SRC (set))
-  != stack_pointer_rtx);
}
   if (i < 0)
return 0;



[committed] aarch64: Fix SVE ACLE handling of SImode pointers

2020-01-21 Thread Richard Sandiford
This long-overdue patch promotes SImode pointers to DImode addresses,
avoiding various ICEs in the existing tests.

Tested on aarch64-linux-gnu and aarch64_be-elf, applied.

There are still other ILP32-related ACLE failures to go...

Richard


2020-01-21  Richard Sandiford  

gcc/
* config/aarch64/aarch64-sve-builtins.h
(function_expander::convert_to_pmode): Declare.
* config/aarch64/aarch64-sve-builtins.cc
(function_expander::convert_to_pmode): New function.
(function_expander::get_contiguous_base): Use it.
(function_expander::prepare_gather_address_operands): Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.cc
(svwhilerw_svwhilewr_impl::expand): Likewise.
---
 gcc/config/aarch64/aarch64-sve-builtins-sve2.cc |  2 ++
 gcc/config/aarch64/aarch64-sve-builtins.cc  | 15 +++
 gcc/config/aarch64/aarch64-sve-builtins.h   |  1 +
 3 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h 
b/gcc/config/aarch64/aarch64-sve-builtins.h
index f307233f777..9513b497368 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins.h
@@ -526,6 +526,7 @@ public:
 
   bool overlaps_input_p (rtx);
 
+  rtx convert_to_pmode (rtx);
   rtx get_contiguous_base (machine_mode);
   rtx get_fallback_value (machine_mode, unsigned int,
  unsigned int, unsigned int &);
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 587530a61bd..3d1b610cfd6 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -2602,12 +2602,21 @@ function_expander::overlaps_input_p (rtx x)
   return false;
 }
 
+/* Convert ptr_mode value X to Pmode.  */
+rtx
+function_expander::convert_to_pmode (rtx x)
+{
+  if (ptr_mode == SImode)
+x = simplify_gen_unary (ZERO_EXTEND, DImode, x, SImode);
+  return x;
+}
+
 /* Return the base address for a contiguous load or store function.
MEM_MODE is the mode of the addressed memory.  */
 rtx
 function_expander::get_contiguous_base (machine_mode mem_mode)
 {
-  rtx base = args[1];
+  rtx base = convert_to_pmode (args[1]);
   if (mode_suffix_id == MODE_vnum)
 {
   /* Use the size of the memory mode for extending loads and truncating
@@ -2814,9 +2823,7 @@ function_expander::prepare_gather_address_operands 
(unsigned int argno,
 {
   /* Scalar base, vector displacement.  This is the order that the md
 pattern wants.  */
-  if (Pmode == SImode)
-   args[argno] = simplify_gen_unary (ZERO_EXTEND, DImode,
- args[argno], SImode);
+  args[argno] = convert_to_pmode (args[argno]);
   vector_type = displacement_vector_type ();
   if (units == UNITS_elements && !scaled_p)
shift_idx = argno + 1;
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc 
b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc
index fa3b50680ba..53b16511623 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc
@@ -442,6 +442,8 @@ public:
   rtx
   expand (function_expander &e) const OVERRIDE
   {
+for (unsigned int i = 0; i < 2; ++i)
+  e.args[i] = e.convert_to_pmode (e.args[i]);
 return e.use_exact_insn (code_for_while (m_unspec, Pmode, e.gp_mode (0)));
   }
 


Re: [PATCH] Prune invalid filename due to makefile syntax.

2020-01-21 Thread Richard Biener
On January 21, 2020 4:37:00 PM GMT+01:00, Jan Hubicka  wrote:
>> On 1/21/20 4:08 PM, Jan Hubicka wrote:
>> > I think this is not enough - you need to take into consideration
>all
>> > special characters used by make and bash, such as $ and others...
>> 
>> Hm, you are right. Do you have a reasonable list which we should
>support?
>No - I guess one needs to dig into manual, try them out or just assume
>that all letters in filenams except for a-z, 0-9, _, -, ., ',' and
>perhaps more common caracters needs to be taken out?  It is used only
>to
>make more meaningful temp filenames, right?
>> Or should we leave this known limitation?
>I would say we want to fix that... Even GCC gets idea to put # info
>filenames and I am sure except for HHVM there are others.

Can't w just quote them somehow? 

Richard. 

>
>Honza
>> 
>> Martin



[committed] aarch64: Use stdint types for SVE ACLE elements

2020-01-21 Thread Richard Sandiford
I'd used mode-based element types in the SVE ACLE implementation, but
it turns out that they don't correspond to the  types used by
ILP32 newlib.  GCC already knows what the correct  types are,
I just wasn't using the right interface to find them.

A consequence of this is that ILP32 newlib code needs to cast "int *"
pointers to "int32_t *" before passing them to s32 loads and stores,
since int32_t is defined as "long int" rather than "int".  That matches
the normal C++ overloading behaviour for this target, where passing
"int *" to:

void f(int32_t *);
void f(int64_t *);

would be ambiguous.  It also matches the corresponding 
behaviour.

Tested on aarch64-linux-gnu and aarch64_be-elf, applied.

There are still other ILP32-related ACLE failures to go...

Richard


2020-01-21  Richard Sandiford  

gcc/
* config/aarch64/aarch64-sve-builtins.def: Use get_typenode_from_name
to get the integer element types.

gcc/testsuite/
* gcc.target/aarch64/sve/acle/general-c/load_1.c (f1): Cast to
int32_t * rather than int *.
* gcc.target/aarch64/sve/acle/general-c/load_2.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_gather_sv_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_gather_sv_2.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_gather_sv_restricted_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_replicate_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_1.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_2.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_index_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_2.c
(f1): Likewise.
* 
gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_restricted_1.c
(f1): Likewise.
---
 gcc/config/aarch64/aarch64-sve-builtins.def   | 19 +++
 .../aarch64/sve/acle/general-c/load_1.c   |  2 +-
 .../aarch64/sve/acle/general-c/load_2.c   |  2 +-
 .../sve/acle/general-c/load_gather_sv_1.c |  2 +-
 .../sve/acle/general-c/load_gather_sv_2.c |  2 +-
 .../general-c/load_gather_sv_restricted_1.c   |  2 +-
 .../sve/acle/general-c/load_replicate_1.c |  2 +-
 .../aarch64/sve/acle/general-c/store_1.c  |  2 +-
 .../aarch64/sve/acle/general-c/store_2.c  |  2 +-
 .../acle/general-c/store_scatter_index_1.c|  2 +-
 .../acle/general-c/store_scatter_offset_2.c   |  2 +-
 .../store_scatter_offset_restricted_1.c   |  2 +-
 12 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.def 
b/gcc/config/aarch64/aarch64-sve-builtins.def
index 040f1d8cb8f..a5a5aca58dd 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.def
+++ b/gcc/config/aarch64/aarch64-sve-builtins.def
@@ -64,14 +64,17 @@ DEF_SVE_TYPE (svbool_t, 10, __SVBool_t, boolean_type_node)
 DEF_SVE_TYPE (svfloat16_t, 13, __SVFloat16_t, aarch64_fp16_type_node)
 DEF_SVE_TYPE (svfloat32_t, 13, __SVFloat32_t, float_type_node)
 DEF_SVE_TYPE (svfloat64_t, 13, __SVFloat64_t, double_type_node)
-DEF_SVE_TYPE (svint8_t, 10, __SVInt8_t, intQI_type_node)
-DEF_SVE_TYPE (svint16_t, 11, __SVInt16_t, intHI_type_node)
-DEF_SVE_TYPE (svint32_t, 11, __SVInt32_t, intSI_type_node)
-DEF_SVE_TYPE (svint64_t, 11, __SVInt64_t, intDI_type_node)
-DEF_SVE_TYPE (svuint8_t, 11, __SVUint8_t, unsigned_intQI_type_node)
-DEF_SVE_TYPE (svuint16_t, 12, __SVUint16_t, unsigned_intHI_type_node)
-DEF_SVE_TYPE (svuint32_t, 12, __SVUint32_t, unsigned_intSI_type_node)
-DEF_SVE_TYPE (svuint64_t, 12, __SVUint64_t, unsigned_intDI_type_node)
+DEF_SVE_TYPE (svint8_t, 10, __SVInt8_t, get_typenode_from_name (INT8_TYPE))
+DEF_SVE_TYPE (svint16_t, 11, __SVInt16_t, get_typenode_from_name (INT16_TYPE))
+DEF_SVE_TYPE (svint32_t, 11, __SVInt32_t, get_typenode_from_name (INT32_TYPE))
+DEF_SVE_TYPE (svint64_t, 11, __SVInt64_t, get_typenode_from_name (INT64_TYPE))
+DEF_SVE_TYPE (svuint8_t, 11, __SVUint8_t, get_typenode_from_name (UINT8_TYPE))
+DEF_SVE_TYPE (svuint16_t, 12, __SVUint16_t,
+ get_typenode_from_name (UINT16_TYPE))
+DEF_SVE_TYPE (svuint32_t, 12, __SVUint32_t,
+ get_typenode_from_name (UINT32_TYPE))
+DEF_SVE_TYPE (svuint64_t, 12, __SVUint64_t,
+ get_typenode_from_name (UINT64_TYPE))
 
 DEF_SVE_TYPE_SUFFIX (b, svbool_t, bool, 8, VNx16BImode)
 DEF_SVE_TYPE_SUFFIX (b8, svbool_t, bool, 8, VNx16BImode)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/load_1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/load_1.c
index 34f989bf8bb..784fdc317e6 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/load_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/load_1.c
@@ -13,7 +13,7 @@ f1 (svbool_t pg, signed char *s8_ptr, void *void_ptr, struct 
s *s_ptr,
   svld1 (pg, s8_ptr, 0); /* { dg-error {too many argu

Re: Fix updating of call_stmt_site_hash

2020-01-21 Thread Richard Biener
On January 21, 2020 4:37:31 PM GMT+01:00, Jan Hubicka  wrote:
>Hi,
>this patch fixes ICE causes by call stmt site hash going out of sync. 
>For
>speculative edges it is assumed to contain a direct call so if we are
>removing it hashtable needs to be updated.  I realize that the code is
>ugly
>but I will leave cleanup for next stage1.
>
>Bootstrapped/regtested x86_64-linux. This patch makes it possible to
>build
>Firefox again.

It even looks quadratic? Can't you simply lookup the stmt? 

>   PR lto/93318
>   * cgraph.c (cgraph_edge::resolve_speculation,
>   cgraph_edge::redirect_call_stmt_to_callee): Fix update of
>   call_stmt_site_hash.
>diff --git a/gcc/cgraph.c b/gcc/cgraph.c
>index 187f6ed30ba..f7ebcc917d1 100644
>--- a/gcc/cgraph.c
>+++ b/gcc/cgraph.c
>@@ -1248,7 +1248,22 @@ cgraph_edge::resolve_speculation (cgraph_edge
>*edge, tree callee_decl)
>   else
> e2->callee->remove_symbol_and_inline_clones ();
>   if (edge->caller->call_site_hash)
>-cgraph_update_edge_in_call_site_hash (edge);
>+{
>+  /* We always maintain direct edge in the call site hash, if one
>+   exists.  */
>+  if (!edge->num_speculative_call_targets_p ())
>+  cgraph_update_edge_in_call_site_hash (edge);
>+  else
>+  {
>+cgraph_edge *e;
>+for (e = edge->caller->callees;
>+ e->call_stmt != edge->call_stmt
>+ || e->lto_stmt_uid != edge->lto_stmt_uid;
>+ e = e->next_callee)
>+  ;
>+cgraph_update_edge_in_call_site_hash (e);
>+  }
>+}
>   return edge;
> }
> 
>@@ -1414,7 +1429,20 @@ cgraph_edge::redirect_call_stmt_to_callee
>(cgraph_edge *e)
> /* Indirect edges are not both in the call site hash.
>get it updated.  */
> if (e->caller->call_site_hash)
>-  cgraph_update_edge_in_call_site_hash (e2);
>+  {
>+if (!e2->num_speculative_call_targets_p ())
>+  cgraph_update_edge_in_call_site_hash (e2);
>+else
>+  {
>+cgraph_edge *e;
>+for (e = e2->caller->callees;
>+ e->call_stmt != e2->call_stmt
>+ || e->lto_stmt_uid != e2->lto_stmt_uid;
>+ e = e->next_callee)
>+  ;
>+cgraph_update_edge_in_call_site_hash (e);
>+  }
>+  }
> pop_cfun ();
> /* Continue redirecting E to proper target.  */
>   }



Re: [PATCH] Prune invalid filename due to makefile syntax.

2020-01-21 Thread Nathan Sidwell

On 1/21/20 11:26 AM, Richard Biener wrote:

On January 21, 2020 4:37:00 PM GMT+01:00, Jan Hubicka  wrote:



I would say we want to fix that... Even GCC gets idea to put # info
filenames and I am sure except for HHVM there are others.


Can't w just quote them somehow?


IIUC there are two issues

1) Make's syntax.  Its quoting is baroque, (and incomplete?).  See the 
comment in libcpp/mkdeps.c, which I think is curtesy of ZackW.  I see 
that the just-released Make 4.3 has changed something to do with # and 
its need to be quoted in some contexts.


2) Bad chars for the underlying filesystem.  But, if the temp files are 
on the same FS as the thing they're temping for, then there shouldn't be 
a conflict.  If they are different, then one might hope the temp-fs 
allows a superset.


nathan

--
Nathan Sidwell


Re: [PR 80005] __has_include parsing

2020-01-21 Thread Jim Wilson
On Tue, Jan 21, 2020 at 12:22 AM Jakub Jelinek  wrote:
> I only see one spot where it has been added and then
> 2019-06-06  Florian Weimer  
>
> * sysdeps/unix/sysv/linux/riscv/flush-icache.c: Do not use
> internal GCC preprocessor identifier __has_include__.
> which corrected it to use __has_include.

OK, so upstream glibc is already fixed which is good, but there is at
least one glibc release with the bug, and some trees like
github.com/riscv/riscv-gnu-toolchain that have the broken glibc
sources.  I can deal with that problem though.  We will probably
upgrade glibc when we upgrade gcc anyways.

Jim


[PATCH] handle virtual registers inside PLUS - taking an address plus offset

2020-01-21 Thread Stefan Franke
The function instantiate_virtual_regs_in_insn does not handle the case if an
address with offset is taken. Then a virtual register may appear (e.g
argptr) inside a PLUS.

 

This patch adds the handling for PLUS.

 

Stefan

 

--- a/gcc/function.c

+++ b/gcc/function.c

@@ -1,5 +1,5 @@

/* Expands front end tree to back end RTL for GCC.

-   Copyright (C) 1987-2019 Free Software Foundation, Inc.

+   Copyright (C) 1987-2020 Free Software Foundation, Inc.

 This file is part of GCC.

@@ -1658,9 +1658,12 @@ instantiate_virtual_regs_in_insn (rtx_insn *insn)

 }

   /* In the general case, we expect virtual registers to appear only in

- operands, and then only as either bare registers or inside memories.
*/

+ operands, and then only as either bare registers or inside memories.

+ SBF: ... and sometimes - taking an address - inside a PLUS. */

   for (i = 0; i < recog_data.n_operands; ++i)

 {

+  poly_int64 old_offset = 0;

+

   x = recog_data.operand[i];

   switch (GET_CODE (x))

{

@@ -1694,10 +1697,19 @@ instantiate_virtual_regs_in_insn (rtx_insn *insn)

  }

  break;

+   case PLUS:

+ if (!REG_P (XEXP (x, 0)) || !CONST_INT_P (XEXP (x, 1)))

+   continue;

+

+ old_offset = INTVAL (XEXP (x, 1));

+ x = XEXP (x, 0);

+ /* no break */

case REG:

  new_rtx = instantiate_new_reg (x, &offset);

  if (new_rtx == NULL)

continue;

+

+ offset += old_offset;

  if (known_eq (offset, 0))

x = new_rtx;

  else



Re: [PATCH, v2] wwwdocs: e-mail subject lines for contributions

2020-01-21 Thread Jason Merrill

On 1/21/20 10:40 AM, Richard Earnshaw (lists) wrote:

On 21/01/2020 15:39, Jakub Jelinek wrote:

On Tue, Jan 21, 2020 at 03:33:22PM +, Richard Earnshaw (lists) wrote:
Some examples would be useful I'd say, e.g. it is unclear in what 
way you

want the PR number to be appended, shall it be
something: whatever words describe it PR12345
or
something: whatever words describe it (PR12345)
or
something: whatever words describe it: PR12345
or
something: whatever words describe it [PR12345]
or something else?


Glibc use "[BZ #]" - obviously BZ becomes PR, but after that, I'm 
not
too worried.  I'd be happy with [PR #], but if folk want 
something else,

please say so quickly...


[PR 12345] or [PR #12345] is bad, because the bugzilla won't underline 
it,

it needs to be either PR12345 word, or PR component/12345 .


ok, lets go with [PR] then.


Doesn't this use of [] have the same problem with git am?

My summaries are often describing the bug I'm fixing, i.e.

[PATCH] PR c++/91476 - anon-namespace reference temp clash between TUs.

which is also the first line of my ChangeLog entry.  I think you are 
proposing


[COMMITTED] c++: Fix anon-namespace reference temp clash between TUs 
(PR91476)


which can no longer be shared with the ChangeLog.

Jason



[GCC][IRA] Revert 11b8091fb to fix Bug 93221

2020-01-21 Thread Joel Hutton
Hi all,

A previous change to simplify LRA introduced in 11b809 (From-SVN: 
r279550) disabled hard register splitting for -O0. This causes a problem 
on aarch64 in cases where parameters are passed in multiple registers 
(in the bug report an OI passed in 2 V4SI registers). This is mandated 
by the AAPCS.

Vlad, Eric, do you have a preferred alternate solution to reverting the 
patch?

Previously discussed here: 
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01414.html
Bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93221

Bootstrapped and regression tested on aarch64

Changelog:

2020-01-21  Joel Hutton  

     * ira.c (ira): Revert use of simplified LRA algorithm.

gcc/testsuite/ChangeLog:

2020-01-21  Joel Hutton  

     PR bug/93221
     * gcc.target/aarch64/pr93221.c: New test.

From 0d9980d2327c61eb99d041a217d6ea5c5b34c754 Mon Sep 17 00:00:00 2001
From: Joel Hutton 
Date: Tue, 21 Jan 2020 09:37:48 +
Subject: [PATCH] [IRA] Fix bug 93221 by reverting 11b8091fb

11b8091fb introduced a simplified LRA algorithm for -O0 that turned off
hard register splitting, this causes a problem for parameters passed in
multiple registers on aarch64. This fixes bug 93221.
---
 gcc/ira.c  | 38 +-
 gcc/testsuite/gcc.target/aarch64/pr93221.c | 10 ++
 2 files changed, 25 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr93221.c

diff --git a/gcc/ira.c b/gcc/ira.c
index 46091adf8109263c72343dccfe4913857b5c74ae..c8b5f869da121506f0414901271eae9810689316 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -5205,35 +5205,27 @@ ira (FILE *f)
   /* Perform target specific PIC register initialization.  */
   targetm.init_pic_reg ();
 
-  if (optimize)
-{
-  ira_conflicts_p = true;
-
-  /* Determine the number of pseudos actually requiring coloring.  */
-  unsigned int num_used_regs = 0;
-  for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++)
-	if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i))
-	  num_used_regs++;
-
-  /* If there are too many pseudos and/or basic blocks (e.g. 10K
-	 pseudos and 10K blocks or 100K pseudos and 1K blocks), we will
-	 use simplified and faster algorithms in LRA.  */
-  lra_simple_p
-	= ira_use_lra_p
-	  && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun);
-}
-  else
-{
-  ira_conflicts_p = false;
-  lra_simple_p = ira_use_lra_p;
-}
+  ira_conflicts_p = optimize > 0;
+
+  /* Determine the number of pseudos actually requiring coloring.  */
+  unsigned int num_used_regs = 0;
+  for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++)
+if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i))
+  num_used_regs++;
+
+  /* If there are too many pseudos and/or basic blocks (e.g. 10K
+ pseudos and 10K blocks or 100K pseudos and 1K blocks), we will
+ use simplified and faster algorithms in LRA.  */
+  lra_simple_p
+= ira_use_lra_p
+  && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun);
 
   if (lra_simple_p)
 {
   /* It permits to skip live range splitting in LRA.  */
   flag_caller_saves = false;
   /* There is no sense to do regional allocation when we use
-	 simplified LRA.  */
+	simplified LRA.  */
   flag_ira_region = IRA_REGION_ONE;
   ira_conflicts_p = false;
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/pr93221.c b/gcc/testsuite/gcc.target/aarch64/pr93221.c
new file mode 100644
index ..517135a889de8a7e379c79222f8a8b2efcc7b422
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr93221.c
@@ -0,0 +1,10 @@
+/* PR bug/93221 */
+/* { dg-do compile } */
+/* { dg-options "-O0 -mno-omit-leaf-frame-pointer" } */
+
+struct S { __Int32x4_t b[2]; };
+
+void __attribute__((optimize (0)))
+foo (struct S x)
+{
+}
-- 
2.17.1



Re: [PATCH][AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI

2020-01-21 Thread Fāng-ruì Sòng via gcc-patches

On 2020-01-21, Szabolcs Nagy wrote:

On 21/01/2020 11:34, Mark Rutland wrote:

Hi Szabolcs,

Answers from a linux dev perspective below.


thanks.


On Mon, Jan 20, 2020 at 10:53:33AM +, Szabolcs Nagy wrote:

i have to ask some linux developers which way they prefer:

e.g. -fpatchable-function-entry=3,1 is

 .section __patchable_function_entries
 .8byte .Lpatch
 .text
.Lpatch:
  nop
func:
  nop
  nop
  ...

with bti the code will be emitted as:

.Lpatch:
  nop
func:
  bti c
  nop
  nop
  ...


That looks good to me.


but e.g. -fpatchable-function-entry=2,0 has two reasonable
approaches with bti:

(a)

func:
.Lpatch:
  bti c
  nop
  nop
  ...

(b)

func:
  bti c
.Lpatch:
  nop
  nop
  ...


I had assumed (b); that means that .Lpatch consistently points to the
first NOP. To my mental model, that seems more consistent than (a).

However, I can work with either so long as it's consistent.

...

As above, my weak preference is (b), but I can work with either. I just
need the behaviour to be consistent.

Was there a rationale for LLVM choosing (a) rather than (b), e.g. was
that also ease of implementation? If there isn't a rationale otherwise,
and if LLVM could also do (b), that would be nice from my PoV.

How big is "a bit more changes" for GCC?


".Lpatch: nop; nop" is generated by generic code now which
a backend can override, but then i have to copy that logic
(~30 lines) to add something after the .Lpatch, or i have
to change generic code to have a new hook after the .Lpatch:.

to provide more argument for (b) and expand on consistency:

right now bti is not emitted if a function is not called
indirectly (compiler knows this for some local functions)
or bti can be turned off by a target attribute per function.

so a mix of bti and non-bti functions are possible, so
the user patching code has to deal with this (e.g. read
the instructions before patching etc)

in the M>0, M!=N case bti is emitted in the middle of a
patch area (e.g. 3,1 case above), but some functions will
not have bti. i can make it so that if patchable function
entry is requested then bti is always added, irrespective
of any other option or attribute, but that seems to be a
big hammer, without such big hammer solution users will
have a difficult time to deal with the additional bti.

if the patch area is entirely before the function label,
then no changes are needed (M=N), if patch area is after
the function label (M=0) then using fix (b) would allow
simple user code. in other cases user code will be
complicated no matter what. so if linux uses M=0 then i
think (b) is preferable.

hope this helps.


The Clang inconvenience is in the other way...

(My previous Clang patch series do not implement M>0.
 https://reviews.llvm.org/D73070 will add support for M>0.)

AsmPrinter (assembly printer/object file emitter) does the following:

1. Emit data before the function entry
2. Emit the function entry label and the label for __patchable_function_entries
3. Emit the function body

The function body has already been constructed before AsmPrinter. Among
late-stage machine code passes, -fpatchable-function-entry=,
-mbranch-protection= (AArch64 BTI) and -fcf-protection= (Intel Indirect
Branch Tracking) are implemented as passes which can prepend
instructions to the function body.

To do (b), step 2 needs to be split. The code generator should somehow
know the label for __patchable_function_entries is after bti
c/endbr32/endbr64.


Before forming a decision, I think we should also consider whether M>0
will be possible in the future. Taking M>0 into consideration, is (a) or
(b) more consistent?

Linux/arch/arm64/Kconfig currently uses -fpatchable-function-entry=2,0
but M>0 could be useful for other architectures (arch/parisc already uses 1,1).


Re: [patch, fortran] Fix PR 85781, ICE on valid

2020-01-21 Thread Thomas König

Hi Tobias,


the attached patch fixes an ICE which could occur for empty
substrings (see test case).


I think one should rather fix the following issue.


I am not sure what you mean.  Does that mean that fixing the following
issue will also fix PR 85781, or that the PR should not be fixed?

Regards

Thomas


Re: [PATCH] RISC-V: Disallow regrenme if the TO register never used before for interrupt functions

2020-01-21 Thread Jim Wilson
On Mon, Jan 20, 2020 at 6:44 PM Kito Cheng  wrote:
> Thanks, fixed and committed, and it's OK to commit to gcc 8/9 next week?

Yes, that is OK with me.

Jim


Re: [PATCH] riscv: Fix up riscv_rtx_costs for RTL checking (PR target/93333)

2020-01-21 Thread Jim Wilson
On Tue, Jan 21, 2020 at 12:36 AM Jakub Jelinek  wrote:
> 2020-01-21  Jakub Jelinek  
>
> PR target/9
> * config/riscv/riscv.c (riscv_rtx_costs) : Verify
> the last two operands are CONST_INT_P before using them as such.
>
> * gcc.c-torture/compile/pr9.c: New test.

Yes, this is OK.  I've already tested it with rtl-checking enabled builds.

Jim


Re: [PATCH, v2] wwwdocs: e-mail subject lines for contributions

2020-01-21 Thread Richard Earnshaw (lists)

On 21/01/2020 17:20, Jason Merrill wrote:

On 1/21/20 10:40 AM, Richard Earnshaw (lists) wrote:

On 21/01/2020 15:39, Jakub Jelinek wrote:
On Tue, Jan 21, 2020 at 03:33:22PM +, Richard Earnshaw (lists) 
wrote:
Some examples would be useful I'd say, e.g. it is unclear in what 
way you

want the PR number to be appended, shall it be
something: whatever words describe it PR12345
or
something: whatever words describe it (PR12345)
or
something: whatever words describe it: PR12345
or
something: whatever words describe it [PR12345]
or something else?


Glibc use "[BZ #]" - obviously BZ becomes PR, but after that, 
I'm not
too worried.  I'd be happy with [PR #], but if folk want 
something else,

please say so quickly...


[PR 12345] or [PR #12345] is bad, because the bugzilla won't 
underline it,

it needs to be either PR12345 word, or PR component/12345 .


ok, lets go with [PR] then.


Doesn't this use of [] have the same problem with git am?


No, because only 'leading' [] blocks are removed - git mailinfo --help



My summaries are often describing the bug I'm fixing, i.e.

[PATCH] PR c++/91476 - anon-namespace reference temp clash between TUs.

which is also the first line of my ChangeLog entry.  I think you are 
proposing


[COMMITTED] c++: Fix anon-namespace reference temp clash between TUs 
(PR91476)


which can no longer be shared with the ChangeLog.



I was trying to unify this with glibc.  They specify the bug number at 
the end of the line.


We can diverge if it's generally felt to be important, but details like 
this create needless friction for folk working in both communities.


R.


[PATCH] wwwdocs: document scripts to access personal and vendor spaces

2020-01-21 Thread Richard Earnshaw (lists)
This patch documents some of the scripts that I've published for 
managing the personal and vendor spaces on the server.  It also covers 
some of the other features that those scripts enable, so that it's all 
in one place.  This is a complete rewrite of the material I had written 
previously since before we did not have these scripts to make use of.


I've not filled in the documentation for gcc-descr and gcc-undescr, 
Jakub has agreed to provide that at a later date.


R.
diff --git a/htdocs/gitwrite.html b/htdocs/gitwrite.html
index 87a18fa7..047c139f 100644
--- a/htdocs/gitwrite.html
+++ b/htdocs/gitwrite.html
@@ -26,6 +26,7 @@ maintainers and significant developers.
   Example check-in session
   Creating and using branches
   git-merge-changelog
+  Personal and Vendor branches
   Tips&Tricks around your account
 
 
@@ -370,6 +371,158 @@ git config --global merge.merge-changelog.driver "git-merge-changelog %O %A %B"
 echo "ChangeLog*   merge=merge-changelog" >> $GCCSRCDIR/.git/info/attributes
 
 
+
+Personal and vendor branches
+
+The GCC git repository is used by many people and the branch and tag
+namespace would become very polluted if all branches lived at the
+top-level naming space.  To help minimise the amount of data that
+needs to be fetched the git repository on gcc.gnu.org supports
+having personal and vendor branches that developers use to
+share their work.  These are not pulled by default, but simple
+configuration steps can give access to them.
+
+
+  Personal branches live
+in refs/users/username/heads with tags
+in refs/users/username/tags.
+  Vendor branches live
+in refs/vendors/vendor-name/heads with tags
+in refs/vendors/vendor-name/tags.
+
+
+Scripts exist the contrib directory to help manage these spaces.
+
+contrib/gcc-git-customization.sh
+
+This script will help set up your personal area.  It will also define
+some aliases that might be useful when developing GCC.  The script will
+  first ask a number of questions:
+
+  Your name - git uses this when you commit messages.  You
+can set this globally, but the script will confirm the setting is
+appropriate for GCC as well.  If you have not already set this
+then git will try to find your name from your account.
+  Your email address - similar to above.  If this is not
+set globally, the script will not attempt to guess this field, so
+you must provide a suitable answer.
+  The local name for the upstream repository - normally, the
+default (origin) will be correct.  This is the git remote that
+connects directly to the gcc.gnu.org server.
+  Your account name on gcc.gnu.org - the script will try to
+work this out based on the URL used to fetch from the git server, or
+fall back to your local user name if that fails.  Using this name
+on the server for your personal space ensures that your branches will
+not conflict with anybody elses.
+  The prefix to use for your personal branches - the default is
+me, but you can change this if you prefer.  The script
+will add configuration information to allow local branches that
+start me/ to be pushed or pulled from
+your personal space in the gcc git repository on the server.
+Do not worry if you do not have an account on gcc.gnu.org.  You will
+not be able to create a personal space on the server, but other
+settings configured by the script will still be useful.
+
+
+If you have multiple clones of the gcc repository you can fetch
+updates from your personal space by running
+  git fetch me
+(or whatever personal prefix you've chosen).  You can also push an
+already existing branch using git push me me/branch.
+Beware that if you have more than one personal branch set up locally,
+simply typing git push me will potentially push all such
+personal branches.  Use --dry-run to check that what will be pushed is
+what you intend.
+
+To create a new personal branch, the following sequence of steps can be
+used:
+
+  git push me :refs/users//heads/
+  git fetch me
+  git checkout -b me/ remotes/me/
+
+If you've used a different personal prefix to 'me' then use that
+  in the sequence described above.
+
+The script also defines a few useful aliases that can be used with the
+repository:
+
+
+  svn-rev - Find the commit in the git repository that was
+created from a particular revision number from the SVN era.  The
+first parameter must be the revision number you wish to look up,
+an initial 'r' prefix is optional.  You can then pass other
+options that are accepted by git log to modify the format of the
+output.  Of particular use is the
+--oneline option to summarize the commit on a single line.
+  
+  gcc-descr - Undocumented
+  gcc-undescr - Undocumented
+
+
+The final customization that the script makes is to add a diff rule so
+that running git diff on a machine description file (a file
+with the suffix .md) will annotate
+the dif

Re: [PATCH][AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI

2020-01-21 Thread Szabolcs Nagy
On 21/01/2020 17:51, Fāng-ruì Sòng via gcc-patches wrote:
> The Clang inconvenience is in the other way...
> 
> (My previous Clang patch series do not implement M>0.
>  https://reviews.llvm.org/D73070 will add support for M>0.)
> 
> AsmPrinter (assembly printer/object file emitter) does the following:
> 
> 1. Emit data before the function entry
> 2. Emit the function entry label and the label for 
> __patchable_function_entries
> 3. Emit the function body
> 
> The function body has already been constructed before AsmPrinter. Among
> late-stage machine code passes, -fpatchable-function-entry=,
> -mbranch-protection= (AArch64 BTI) and -fcf-protection= (Intel Indirect
> Branch Tracking) are implemented as passes which can prepend
> instructions to the function body.
> 
> To do (b), step 2 needs to be split. The code generator should somehow
> know the label for __patchable_function_entries is after bti
> c/endbr32/endbr64.

or you can do a hack in 'emit function entry label'
(assuming there is a hook for that) so instead of
printing 'foo:' you print 'foo: bti c'

> Before forming a decision, I think we should also consider whether M>0
> will be possible in the future. Taking M>0 into consideration, is (a) or
> (b) more consistent?
> 
> Linux/arch/arm64/Kconfig currently uses -fpatchable-function-entry=2,0
> but M>0 could be useful for other architectures (arch/parisc already uses 
> 1,1).

i think then checking for the bti is unavoidable:
some functions may miss the bti c, so simple logic
to jump over the bti at fixed offset is not enough.

but this complication is only needed if the function
label is in the middle of the patch area.


Re: Fix updating of call_stmt_site_hash

2020-01-21 Thread Jan Hubicka
> On January 21, 2020 4:37:31 PM GMT+01:00, Jan Hubicka  wrote:
> >Hi,
> >this patch fixes ICE causes by call stmt site hash going out of sync. 
> >For
> >speculative edges it is assumed to contain a direct call so if we are
> >removing it hashtable needs to be updated.  I realize that the code is
> >ugly
> >but I will leave cleanup for next stage1.
> >
> >Bootstrapped/regtested x86_64-linux. This patch makes it possible to
> >build
> >Firefox again.
> 
> It even looks quadratic? Can't you simply lookup the stmt? 

I am not sure what you mean by looking up the stmt.  To go from stmt to
call edge you either walk the callees chain or use call_site_hash that
populates itself when you have more than 100 callees.  The situation
this code solves is that with speculative calls have multiple direct
edges, one indirect and multiple refs associated with the speculated
call stmt. Now one of direct edges gets removed and at this moment one
needs to look up different one to re-populate the hash.

I meant to fix the ICE first, but indeed resolve_speuclations and
redirect_stmt_to_callee is O(num_callees+num_refs) in worst case here
which needs to be fixed (though the case is quite rare in practice).
One can keep direct edges in a block in the lists (with verifier help),
but still looking up referneces is bit hard since they are in vector and
that is re-numbered on inserts and deletes.  I plan to fix this
incrementally (we want also refs hash and verifier for the caches, plus
this all is memory use critical)

Honza


Re: [GCC][IRA] Revert 11b8091fb to fix Bug 93221

2020-01-21 Thread Vladimir Makarov

On 1/21/20 12:20 PM, Joel Hutton wrote:

Hi all,

A previous change to simplify LRA introduced in 11b809 (From-SVN:
r279550) disabled hard register splitting for -O0. This causes a problem
on aarch64 in cases where parameters are passed in multiple registers
(in the bug report an OI passed in 2 V4SI registers). This is mandated
by the AAPCS.


I see.  I suspected that it can cause the problem but I thought the 
probability is very small for this.  I was wrong.  We should revert it 
at least for now.  Correct work of GCC is more important than saving 
cycles in -O0 mode.



Vlad, Eric, do you have a preferred alternate solution to reverting the
patch?
I am in favour of reverting the patch now.  But may be Eric can provide 
another version of the patch not causing the arm problem.  I am ready to 
reconsider this too.  So I guess the decision is upto Eric.

Previously discussed here:
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01414.html
Bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93221

Bootstrapped and regression tested on aarch64

Changelog:

2020-01-21  Joel Hutton  

      * ira.c (ira): Revert use of simplified LRA algorithm.

gcc/testsuite/ChangeLog:

2020-01-21  Joel Hutton  

      PR bug/93221
      * gcc.target/aarch64/pr93221.c: New test.





Re: [PATCH] PR target/93319: x32: Add x32 support to -mtls-dialect=gnu2

2020-01-21 Thread H.J. Lu
On Tue, Jan 21, 2020 at 2:29 AM Uros Bizjak  wrote:
>
> On Tue, Jan 21, 2020 at 9:47 AM Uros Bizjak  wrote:
> >
> > On Mon, Jan 20, 2020 at 10:46 PM H.J. Lu  wrote:
> >
> > > > > OK. Let's go with this version, but please investigate if we need to
> > > > > calculate TLS address in ptr_mode instead of Pmode. Due to quite some
> > > > > zero-extension from ptr_mode to Pmode hacks in this area, it looks to
> > > > > me that the whole calculation should be performed in ptr_mode (SImode
> > > > > in case of x32), and the result zero-extended to Pmode in case when
> > > > > Pmode = DImode.
> > > > >
> > > >
> > > > I checked it in.  I will investigate if we can use ptr_mode for TLS.
> > >
> > > Here is a patch to perform GNU2 TLS address computation in ptr_mode
> > > and zero-extend result to Pmode.
> >
> >  case TLS_MODEL_GLOBAL_DYNAMIC:
> > -  dest = gen_reg_rtx (Pmode);
> > +  dest = gen_reg_rtx (TARGET_GNU2_TLS ? ptr_mode : Pmode);
> >
> > Please put these in their respective arms of "if (TARGET_GNU2_TLS).
> >
> >  case TLS_MODEL_LOCAL_DYNAMIC:
> > -  base = gen_reg_rtx (Pmode);
> > +  base = gen_reg_rtx (TARGET_GNU2_TLS ? ptr_mode : Pmode);
> >
> > Also here.
> >
> > A question: Do we need to emit the following part in Pmode?
>
> To answer my own question: Yes. Linker doesn't like SImode relocs fox
> x86_64 and for
>
> addl$foo@dtpoff, %eax
>
> errors out with:
>
> pr93319-1a.s: Assembler messages:
> pr93319-1a.s:20: Error: relocated field and relocation type differ in 
> signedness
>
> So, the part below is OK, except:
>
> -  tp = get_thread_pointer (Pmode, true);
> -  set_unique_reg_note (get_last_insn (), REG_EQUAL,
> -   gen_rtx_MINUS (Pmode, tmp, tp));
> +  tp = get_thread_pointer (ptr_mode, true);
> +  tmp = gen_rtx_MINUS (ptr_mode, tmp, tp);
> +  if (GET_MODE (tmp) != Pmode)
> +tmp = gen_rtx_ZERO_EXTEND (Pmode, tmp);
> +  set_unique_reg_note (get_last_insn (), REG_EQUAL, tmp);
>
> I don't think we should attach this note to the thread pointer
> initialization. I have removed this part from the patch, but please
> review the decision.
>
> and
>
> -dest = gen_rtx_PLUS (Pmode, tp, dest);
> +dest = gen_rtx_PLUS (ptr_mode, tp, dest);
>
> Please leave Pmode here. ptr_mode == Pmode at this point, but Pmode
> better documents the mode selection logic.
>
> Also, the tests fail for me with:
>
> /usr/include/gnu/stubs.h:13:11: fatal error: gnu/stubs-x32.h: No such
> file or directory
>
> so either use __builtin_printf or some other approach that doesn't
> need to #include stdio.h.
>
> A patch that implements above changes is attached to the message.
>

Here is the updated patch.  OK for master?

Thanks.

-- 
H.J.
From 01b20630518882fa3952962b26bfbb2465e08036 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 20 Jan 2020 13:30:04 -0800
Subject: [PATCH] i386: Do GNU2 TLS address computation in ptr_mode

Since GNU2 TLS address from glibc run-time is in ptr_mode, we should do
GNU2 TLS address computation in ptr_mode and zero-extend result to Pmode.

2020-01-21  H.J. Lu  
	Uros Bizjak

gcc/

	PR target/93319
	* config/i386/i386.c (ix86_tls_module_base): Replace Pmode
	with ptr_mode.
	(legitimize_tls_address): Do GNU2 TLS address computation in
	ptr_mode and zero-extend result to Pmode.
	*  config/i386/i386.md (@tls_dynamic_gnu2_64_): Replace
	:P with :PTR and Pmode with ptr_mode.
	(*tls_dynamic_gnu2_lea_64_): Likewise.
	(*tls_dynamic_gnu2_call_64_): Likewise.
	(*tls_dynamic_gnu2_combine_64_): Likewise.

gcc/testsuite/

2020-01-21  Uros Bizjak

	PR target/93319
	* gcc.target/i386/pr93319-1a.c: Don't include .
	(test1): Replace printf with __builtin_printf.
---
 gcc/config/i386/i386.c | 43 ---
 gcc/config/i386/i386.md| 48 +++---
 gcc/testsuite/gcc.target/i386/pr93319-1a.c |  6 +--
 3 files changed, 42 insertions(+), 55 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 0b8a4b9ee4f..ffe60baa72a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -10717,7 +10717,7 @@ ix86_tls_module_base (void)
   if (!ix86_tls_module_base_symbol)
 {
   ix86_tls_module_base_symbol
-	= gen_rtx_SYMBOL_REF (Pmode, "_TLS_MODULE_BASE_");
+	= gen_rtx_SYMBOL_REF (ptr_mode, "_TLS_MODULE_BASE_");
 
   SYMBOL_REF_FLAGS (ix86_tls_module_base_symbol)
 	|= TLS_MODEL_GLOBAL_DYNAMIC << SYMBOL_FLAG_TLS_SHIFT;
@@ -10748,8 +10748,6 @@ legitimize_tls_address (rtx x, enum tls_model model, bool for_mov)
   switch (model)
 {
 case TLS_MODEL_GLOBAL_DYNAMIC:
-  dest = gen_reg_rtx (Pmode);
-
   if (!TARGET_64BIT)
 	{
 	  if (flag_pic && !TARGET_PECOFF)
@@ -10763,24 +10761,16 @@ legitimize_tls_address (rtx x, enum tls_model model, bool for_mov)
 
   if (TARGET_GNU2_TLS)
 	{
+	  dest = gen_reg_rtx (ptr_mode);
 	  if (TARGET_64BIT)
-	emit_insn (gen_tls_dynamic_gnu2_64 (Pmode, dest, x));
+	emit_insn (gen_

Re: [PATCH, v2] wwwdocs: e-mail subject lines for contributions

2020-01-21 Thread Segher Boessenkool
Hi!

Thanks for doing this.

On Tue, Jan 21, 2020 at 02:52:00PM +, Richard Earnshaw (lists) wrote:
> This patch proposes some new (additional) rules for email subject lines
> when contributing to GCC.  The goal is to make sure that, as far as
> possible, the subject for a patch will form a good summary when the
> message is committed to the repository if applied with 'git am'.

> +A high-quality e-mail subject line for a contribution contains the
> +following elements:

> +  A brief summary

You could stress that this is the one thing that really matters.  And
it's not a summary, it's much too brief for that (at most ~50 chars),
but yup it should be something that allows *a human* to identify what
this is.

Everything else is just convention.

> +A component tag is a short identifier that identifies the part of
> +the compiler being modified.  This highlights to the relevant
> +maintainers that the patch may need their attention.  Multiple
> +components may be listed if necessary.  Each component tag should be
> +followed by a colon.

Often people use aaa/bbb: if drilling down a bit that way helps keep the
subject short (which is the *point* of all this: keep things better
consumable for humans).

> +The brief summary encapsulates in a few words the intent of the
> +change.  For example: cleanup check_field_decls.

It should start with a capital though.  "Clean up blablal".  (So no
dot to end the sentence, this isn't a sentence).  A capital helps
the reader to quickly identify what is what, separate fluff from the
core parts.

> +Some large patch sets benefit from an introductory e-mail that
> +provides more context for the patch series and describes how the
> +patches have been broken up to provide for review.

All non-trivial series, yeah.

Maybe we should mention how v2 etc. of patch series should show what is
changed?  If there is a good standard practice for that at all :-)


Segher


Re: [GCC][IRA] Revert 11b8091fb to fix Bug 93221

2020-01-21 Thread Jakub Jelinek
On Tue, Jan 21, 2020 at 05:20:51PM +, Joel Hutton wrote:
> 2020-01-21  Joel Hutton  
> 
>      * ira.c (ira): Revert use of simplified LRA algorithm.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-01-21  Joel Hutton  
> 
>      PR bug/93221
>      * gcc.target/aarch64/pr93221.c: New test.

Not a review, just nitpicking.  Please avoid the backslash before >.
The PR number should go to both gcc/ and gcc/testsuite/ ChangeLog
entry and we don't have bug/ category; the bug is target category,
so it should be PR target/93221, or could be reclassified first in
bugzilla e.g. to middle-end and then written as PR middle-end/93221.

> @@ -0,0 +1,10 @@
> +/* PR bug/93221 */

Likewise here.

> +/* { dg-do compile } */
> +/* { dg-options "-O0 -mno-omit-leaf-frame-pointer" } */
> +
> +struct S { __Int32x4_t b[2]; };
> +
> +void __attribute__((optimize (0)))

Not sure if I understand why do you need optimize (0) attribute when the
whole test is compiled with -O0.  Doesn't it ICE without the attribute
too?

> +foo (struct S x)
> +{
> +}

Jakub



[committed, obvious] Fix line terminator pattern in testcase.

2020-01-21 Thread Sandra Loosemore
I observed this testcase was failing on nios2-elf.  The more general
regexp was copied from other tests using dg-output.

2020-01-21  Sandra Loosemore  

gcc/testsuite/
* g++.dg/coroutines/torture/mid-suspend-destruction-0.C: Generalize
line terminators in patterns.
---
 gcc/testsuite/ChangeLog   | 5 +
 .../g++.dg/coroutines/torture/mid-suspend-destruction-0.C | 8 
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 4bea57e..b164f31 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2020-01-21  Sandra Loosemore  
+
+   * g++.dg/coroutines/torture/mid-suspend-destruction-0.C: Generalize
+   line terminators in patterns.
+
 2020-01-21  Richard Sandiford  
 
* gcc.target/aarch64/sve/acle/general-c/load_1.c (f1): Cast to
diff --git 
a/gcc/testsuite/g++.dg/coroutines/torture/mid-suspend-destruction-0.C 
b/gcc/testsuite/g++.dg/coroutines/torture/mid-suspend-destruction-0.C
index 934fb19..505bfa3 100644
--- a/gcc/testsuite/g++.dg/coroutines/torture/mid-suspend-destruction-0.C
+++ b/gcc/testsuite/g++.dg/coroutines/torture/mid-suspend-destruction-0.C
@@ -1,8 +1,8 @@
 // { dg-do run }
-// { dg-output "main: returning\n" }
-// { dg-output "Destroyed coro1\n" }
-// { dg-output "Destroyed suspend_always_prt\n" }
-// { dg-output "Destroyed Promise\n" }
+// { dg-output "main: returning(\n|\r\n|\r)" }
+// { dg-output "Destroyed coro1(\n|\r\n|\r)" }
+// { dg-output "Destroyed suspend_always_prt(\n|\r\n|\r)" }
+// { dg-output "Destroyed Promise(\n|\r\n|\r)" }
 
 // Check that we still get the right DTORs run when we let a suspended coro
 // go out of scope.
-- 
2.8.1



Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-21 Thread Martin Liška

On 1/21/20 4:27 PM, Jan Hubicka wrote:

If distro build takes seriously FDO and reproducibility it will
eventually run into the limitation for single-threaded programs and we
will need to do reproducible runs for multithread apps which means
putting indirect call profiles into thread local storage I think, and
giving up on time profiling.


Well, a distro should take it seriously, but a reasonable level of reliability
is enough. Right now, we're using gcc9 compiler with backport of -1 invalidation
patch and people in distro seems fine.



I think non-reproducible merging could be implemneted easier atop of
current infrastructure by simply adding
HIST_TYPE_INDIR_CALL_REPRODUCIBLE, HIST_TYPE_INDR_CALL_NONRERPODUCIBGL
and HIST_TYPE_TOPN_VALUE_REPRODUCIBLE,
HIST_TYPE_TOPN_VALUE_NONREPRODUCIBLE and use appropriate merging
function (i.e. one which invalidates full counter versus one whic does
not)


So I would not invent one another level of complexity unless it's really needed.



I wonder if we want to teach one of our autotesters to verify that
profiledbootstrap is reproducible? I guess we could-re-use our exisitng
bootstrap comparsion, we just need to produce two sets of train data and
then do the comparsion.


I'll setup a Buildbot builder that will do that on weekly bases. I'm planning
to do both PGO and PGO+LTO, where I'll do 2 (or maybe 3) builds with comparison
based on md5sum of .text section of cc1plus and some other FE.

Martin


Re: [GCC][IRA] Revert 11b8091fb to fix Bug 93221

2020-01-21 Thread Joel Hutton
On 21/01/2020 19:26, Jakub Jelinek wrote:
> Not a review, just nitpicking.  Please avoid the backslash before >.
> The PR number should go to both gcc/ and gcc/testsuite/ ChangeLog
> entry and we don't have bug/ category; the bug is target category,
> so it should be PR target/93221, or could be reclassified first in
> bugzilla e.g. to middle-end and then written as PR middle-end/93221.
>
Done
>> @@ -0,0 +1,10 @@
>> +/* PR bug/93221 */
> Likewise here.
Done
> Not sure if I understand why do you need optimize (0) attribute when the
> whole test is compiled with -O0.  Doesn't it ICE without the attribute
> too?
Done. It's not really necessary, belt and braces.

Updated patch attached

From 1a2980ef6eeb76dbf0556f806a85a4f49ad3ebdd Mon Sep 17 00:00:00 2001
From: Joel Hutton 
Date: Tue, 21 Jan 2020 09:37:48 +
Subject: [PATCH] [IRA] Fix bug 93221 by reverting 11b8091fb

11b8091fb introduced a simplified LRA algorithm for -O0 that turned off
hard register splitting, this causes a problem for parameters passed in
multiple registers on aarch64. This fixes bug 93221.
---
 gcc/ira.c  | 38 +-
 gcc/testsuite/gcc.target/aarch64/pr93221.c | 10 ++
 2 files changed, 25 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr93221.c

diff --git a/gcc/ira.c b/gcc/ira.c
index 46091adf8109263c72343dccfe4913857b5c74ae..c8b5f869da121506f0414901271eae9810689316 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -5205,35 +5205,27 @@ ira (FILE *f)
   /* Perform target specific PIC register initialization.  */
   targetm.init_pic_reg ();
 
-  if (optimize)
-{
-  ira_conflicts_p = true;
-
-  /* Determine the number of pseudos actually requiring coloring.  */
-  unsigned int num_used_regs = 0;
-  for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++)
-	if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i))
-	  num_used_regs++;
-
-  /* If there are too many pseudos and/or basic blocks (e.g. 10K
-	 pseudos and 10K blocks or 100K pseudos and 1K blocks), we will
-	 use simplified and faster algorithms in LRA.  */
-  lra_simple_p
-	= ira_use_lra_p
-	  && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun);
-}
-  else
-{
-  ira_conflicts_p = false;
-  lra_simple_p = ira_use_lra_p;
-}
+  ira_conflicts_p = optimize > 0;
+
+  /* Determine the number of pseudos actually requiring coloring.  */
+  unsigned int num_used_regs = 0;
+  for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++)
+if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i))
+  num_used_regs++;
+
+  /* If there are too many pseudos and/or basic blocks (e.g. 10K
+ pseudos and 10K blocks or 100K pseudos and 1K blocks), we will
+ use simplified and faster algorithms in LRA.  */
+  lra_simple_p
+= ira_use_lra_p
+  && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun);
 
   if (lra_simple_p)
 {
   /* It permits to skip live range splitting in LRA.  */
   flag_caller_saves = false;
   /* There is no sense to do regional allocation when we use
-	 simplified LRA.  */
+	simplified LRA.  */
   flag_ira_region = IRA_REGION_ONE;
   ira_conflicts_p = false;
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/pr93221.c b/gcc/testsuite/gcc.target/aarch64/pr93221.c
new file mode 100644
index ..4dc2c3d0149423dd3d666f7428277ffa9eb765c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr93221.c
@@ -0,0 +1,10 @@
+/* PR target/93221 */
+/* { dg-do compile } */
+/* { dg-options "-O0 -mno-omit-leaf-frame-pointer" } */
+
+struct S { __Int32x4_t b[2]; };
+
+void
+foo (struct S x)
+{
+}
-- 
2.17.1



Re: [GCC][IRA] Revert 11b8091fb to fix Bug 93221

2020-01-21 Thread Joel Hutton
Updated changelog:

Changelog:

2020-01-21  Joel Hutton  

PR target/93221
* ira.c (ira): Revert use of simplified LRA algorithm.

gcc/testsuite/ChangeLog:

2020-01-21  Joel Hutton  

PR target/93221
* gcc.target/aarch64/pr93221.c: New test.


Re: [GCC][IRA] Revert 11b8091fb to fix Bug 93221

2020-01-21 Thread Joel Hutton
Changelog was mangled by thunderbird, updated changelog:

Changelog:

2020-01-21  Joel Hutton  

 PR target/93221
 * ira.c (ira): Revert use of simplified LRA algorithm.

gcc/testsuite/ChangeLog:

2020-01-21  Joel Hutton  

 PR target/93221
 * gcc.target/aarch64/pr93221.c: New test.



Re: [PATCH] PR target/93319: x32: Add x32 support to -mtls-dialect=gnu2

2020-01-21 Thread Uros Bizjak
On Tue, Jan 21, 2020 at 8:16 PM H.J. Lu  wrote:
>
> On Tue, Jan 21, 2020 at 2:29 AM Uros Bizjak  wrote:
> >
> > On Tue, Jan 21, 2020 at 9:47 AM Uros Bizjak  wrote:
> > >
> > > On Mon, Jan 20, 2020 at 10:46 PM H.J. Lu  wrote:
> > >
> > > > > > OK. Let's go with this version, but please investigate if we need to
> > > > > > calculate TLS address in ptr_mode instead of Pmode. Due to quite 
> > > > > > some
> > > > > > zero-extension from ptr_mode to Pmode hacks in this area, it looks 
> > > > > > to
> > > > > > me that the whole calculation should be performed in ptr_mode 
> > > > > > (SImode
> > > > > > in case of x32), and the result zero-extended to Pmode in case when
> > > > > > Pmode = DImode.
> > > > > >
> > > > >
> > > > > I checked it in.  I will investigate if we can use ptr_mode for TLS.
> > > >
> > > > Here is a patch to perform GNU2 TLS address computation in ptr_mode
> > > > and zero-extend result to Pmode.
> > >
> > >  case TLS_MODEL_GLOBAL_DYNAMIC:
> > > -  dest = gen_reg_rtx (Pmode);
> > > +  dest = gen_reg_rtx (TARGET_GNU2_TLS ? ptr_mode : Pmode);
> > >
> > > Please put these in their respective arms of "if (TARGET_GNU2_TLS).
> > >
> > >  case TLS_MODEL_LOCAL_DYNAMIC:
> > > -  base = gen_reg_rtx (Pmode);
> > > +  base = gen_reg_rtx (TARGET_GNU2_TLS ? ptr_mode : Pmode);
> > >
> > > Also here.
> > >
> > > A question: Do we need to emit the following part in Pmode?
> >
> > To answer my own question: Yes. Linker doesn't like SImode relocs fox
> > x86_64 and for
> >
> > addl$foo@dtpoff, %eax
> >
> > errors out with:
> >
> > pr93319-1a.s: Assembler messages:
> > pr93319-1a.s:20: Error: relocated field and relocation type differ in 
> > signedness
> >
> > So, the part below is OK, except:
> >
> > -  tp = get_thread_pointer (Pmode, true);
> > -  set_unique_reg_note (get_last_insn (), REG_EQUAL,
> > -   gen_rtx_MINUS (Pmode, tmp, tp));
> > +  tp = get_thread_pointer (ptr_mode, true);
> > +  tmp = gen_rtx_MINUS (ptr_mode, tmp, tp);
> > +  if (GET_MODE (tmp) != Pmode)
> > +tmp = gen_rtx_ZERO_EXTEND (Pmode, tmp);
> > +  set_unique_reg_note (get_last_insn (), REG_EQUAL, tmp);
> >
> > I don't think we should attach this note to the thread pointer
> > initialization. I have removed this part from the patch, but please
> > review the decision.
> >
> > and
> >
> > -dest = gen_rtx_PLUS (Pmode, tp, dest);
> > +dest = gen_rtx_PLUS (ptr_mode, tp, dest);
> >
> > Please leave Pmode here. ptr_mode == Pmode at this point, but Pmode
> > better documents the mode selection logic.
> >
> > Also, the tests fail for me with:
> >
> > /usr/include/gnu/stubs.h:13:11: fatal error: gnu/stubs-x32.h: No such
> > file or directory
> >
> > so either use __builtin_printf or some other approach that doesn't
> > need to #include stdio.h.
> >
> > A patch that implements above changes is attached to the message.
> >
>
> Here is the updated patch.  OK for master?

OK if it passes regtests.

Thanks,
Uros.


Home care Centers

2020-01-21 Thread Olivia Martin
Hi,

 

Just wanted to check your interest in acquiring the records of Home care
Centers/Agencies.

 

If you are interested please let me know your target Geography, so that I
can revert with further information on counts and pricing.

 

Thank you and I look forward for your response.

 

Regards,

Olivia Martin | Manager Demand Generation|

 

To opt out, please reply with Opt Out in the Subject Line.

 

 



[COMMITTED] PR c++/90732 - ICE with VLA capture and generic lambda.

2020-01-21 Thread Jason Merrill
We were failing to handle VLA capture in tsubst_lambda_expr; initially
building a DECLTYPE_TYPE for the capture and then tsubsting it doesn't give
the special VLA handling.  So with this patch we call add_capture again for
VLAs.

Tested x86_64-pc-linux-gnu, applying to trunk.

* pt.c (tsubst_lambda_expr): Repeat add_capture for VLAs.
---
 gcc/cp/pt.c  | 37 +++-
 gcc/testsuite/g++.dg/cpp1y/lambda-vla1.C | 16 ++
 2 files changed, 46 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/lambda-vla1.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index cad97514cdc..4520c995028 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -18751,6 +18751,36 @@ tsubst_lambda_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
cap = TREE_CHAIN (cap))
 {
   tree ofield = TREE_PURPOSE (cap);
+  tree init = TREE_VALUE (cap);
+  if (PACK_EXPANSION_P (init))
+   init = tsubst_pack_expansion (init, args, complain, in_decl);
+  else
+   init = tsubst_copy_and_build (init, args, complain, in_decl,
+ /*fn*/false, /*constexpr*/false);
+
+  if (init == error_mark_node)
+   return error_mark_node;
+
+  if (init && TREE_CODE (init) == TREE_LIST)
+   init = build_x_compound_expr_from_list (init, ELK_INIT, complain);
+
+  if (!processing_template_decl
+ && init && TREE_CODE (init) != TREE_VEC
+ && variably_modified_type_p (TREE_TYPE (init), NULL_TREE))
+   {
+ /* For a VLA, simply tsubsting the field type won't work, we need to
+go through add_capture again.  XXX do we want to do this for all
+captures?  */
+ tree name = (get_identifier
+  (IDENTIFIER_POINTER (DECL_NAME (ofield)) + 2));
+ tree ftype = TREE_TYPE (ofield);
+ bool by_ref = (TYPE_REF_P (ftype)
+|| (TREE_CODE (ftype) == DECLTYPE_TYPE
+&& DECLTYPE_FOR_REF_CAPTURE (ftype)));
+ add_capture (r, name, init, by_ref, !DECL_NORMAL_CAPTURE_P (ofield));
+ continue;
+   }
+
   if (PACK_EXPANSION_P (ofield))
ofield = PACK_EXPANSION_PATTERN (ofield);
   tree field = tsubst_decl (ofield, args, complain);
@@ -18765,13 +18795,6 @@ tsubst_lambda_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
   if (field == error_mark_node)
return error_mark_node;
 
-  tree init = TREE_VALUE (cap);
-  if (PACK_EXPANSION_P (init))
-   init = tsubst_pack_expansion (init, args, complain, in_decl);
-  else
-   init = tsubst_copy_and_build (init, args, complain, in_decl,
- /*fn*/false, /*constexpr*/false);
-
   if (TREE_CODE (field) == TREE_VEC)
{
  int len = TREE_VEC_LENGTH (field);
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-vla1.C 
b/gcc/testsuite/g++.dg/cpp1y/lambda-vla1.C
new file mode 100644
index 000..c9025c79aa7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-vla1.C
@@ -0,0 +1,16 @@
+// PR c++/90732
+// { dg-do compile { target c++14 } }
+// { dg-additional-options "-Wno-vla" }
+
+/*const*/ int SIZE = 100;
+
+template
+int foo(T t) {
+  char buf[SIZE] = { 24 };
+  return [&buf](auto x){ return buf[x]; }(t);
+}
+
+int main() {
+  if (foo(0) != 24)
+__builtin_abort();
+}

base-commit: 8158a4640819dbb3210326e37786fb874f450272
-- 
2.18.1



[COMMITTED] PR c++/60855 - ICE with sizeof VLA capture.

2020-01-21 Thread Jason Merrill
For normal captures we usually look through them within unevaluated context,
but that doesn't work here; trying to take the sizeof of the array in the
enclosing scope tries and fails to evaluate a SAVE_EXPR from the enclosing
scope.

Tested x86_64-pc-linux-gnu, applying to trunk.

* lambda.c (is_lambda_ignored_entity): Don't look past VLA capture.
---
 gcc/cp/lambda.c |  5 +++--
 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-vla4.C | 12 
 2 files changed, 15 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-vla4.C

diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 50fb144142b..4f39f99756b 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -1327,8 +1327,9 @@ lambda_static_thunk_p (tree fn)
 bool
 is_lambda_ignored_entity (tree val)
 {
-  /* Look past normal capture proxies.  */
-  if (is_normal_capture_proxy (val))
+  /* Look past normal, non-VLA capture proxies.  */
+  if (is_normal_capture_proxy (val)
+  && !variably_modified_type_p (TREE_TYPE (val), NULL_TREE))
 return true;
 
   /* Always ignore lambda fields, their names are only for debugging.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-vla4.C 
b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-vla4.C
new file mode 100644
index 000..3e9cf076d14
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-vla4.C
@@ -0,0 +1,12 @@
+// PR c++/60855
+// { dg-do compile { target c++11 } }
+// { dg-additional-options "-Wno-vla" }
+
+int main() {
+  unsigned count = 5;
+  bool array[count];
+  [&array] () {
+array[0] = sizeof(array) > 5;
+  }();
+  return 0;
+}

base-commit: 8158a4640819dbb3210326e37786fb874f450272
-- 
2.18.1



[patch, testsuite] Add -fdelete-null-pointer-checks to some C++ testcases

2020-01-21 Thread Sandra Loosemore
In doing some nios2-elf testing, I ran into a bunch of failures in 
constexpr-related tests in the C++ testsuite.  This target defaults to 
-fno-delete-null-pointer-checks at the request of Altera/Intel, in order 
to support some of their BSPs where 0 is a legitimate memory address. 
Some other bare-metal targets also default to 
-fno-delete-null-pointer-checks.


This patch makes the dependence of these tests on 
-fdelete-null-pointer-checks explicit.  I've previously fixed some other 
tests that failed on nios2-elf in the same way so this is borderline 
obvious, but it's a little troubling to me that the correct semantics of 
some of these testcases seems to depend on what we document in the 
manual as an optimization option.  :-S  Maybe there is some other bug here?


Anyway, if nobody has any objections or better ideas, I will go ahead 
and commit this in a few days.


-Sandra
2020-01-21  Sandra Loosemore  

	gcc/testsuite/
	* g++.dg/cpp0x/constexpr-odr1.C: Add -fdelete-null-pointer-checks.
	* g++.dg/cpp0x/constexpr-odr2.C: Likewise.
	* g++.dg/cpp0x/nontype4.C: Likewise.
	* g++.dg/cpp1y/constexpr-new.C: Likewise.
	* g++.dg/cpp1y/new1.C: Likewise.
	* g++.dg/cpp1y/new2.C: Likewise.
	* g++.dg/cpp2a/constexpr-dynamic11.C: Likewise.
	* g++.dg/cpp2a/constexpr-dynamic17.C: Likewise.
	* g++.dg/cpp2a/constexpr-dynamic4.C: Likewise.
	* g++.dg/cpp2a/constexpr-new1.C: Likewise.
	* g++.dg/cpp2a/constexpr-new10.C: Likewise.
	* g++.dg/cpp2a/constexpr-new2.C: Likewise.
	* g++.dg/cpp2a/constexpr-new3.C: Likewise.
	* g++.dg/cpp2a/constexpr-new4.C: Likewise.
	* g++.dg/cpp2a/constexpr-new8.C: Likewise.
	* g++.dg/cpp2a/constexpr-new9.C: Likewise.
	* g++.dg/cpp2a/nontype-class1.C: Likewise.
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-odr1.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-odr1.C
index cf3f95f..d00baae 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-odr1.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-odr1.C
@@ -1,5 +1,6 @@
 // PR c++/92062 - ODR-use ignored for static member of class template.
 // { dg-do run { target c++11 } }
+// { dg-additional-options "-fdelete-null-pointer-checks" }
 
 template struct A {
   static const bool x;
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-odr2.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-odr2.C
index 0927488..dd569a9 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-odr2.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-odr2.C
@@ -1,5 +1,6 @@
 // PR c++/92062 - ODR-use ignored for static member of class template.
 // { dg-do run { target c++11 } }
+// { dg-additional-options "-fdelete-null-pointer-checks" }
 
 template struct A {
   static const bool x;
diff --git a/gcc/testsuite/g++.dg/cpp0x/nontype4.C b/gcc/testsuite/g++.dg/cpp0x/nontype4.C
index 2c552d0..b6a1ae7 100644
--- a/gcc/testsuite/g++.dg/cpp0x/nontype4.C
+++ b/gcc/testsuite/g++.dg/cpp0x/nontype4.C
@@ -1,5 +1,6 @@
 // PR c++/56428
 // { dg-do compile { target c++11 } }
+// { dg-additional-options "-fdelete-null-pointer-checks" }
 
 struct A { };
 
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-new.C b/gcc/testsuite/g++.dg/cpp1y/constexpr-new.C
index 6316ff2..d0ca0b7 100644
--- a/gcc/testsuite/g++.dg/cpp1y/constexpr-new.C
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-new.C
@@ -1,4 +1,5 @@
 // { dg-do compile { target c++14 } }
+// { dg-additional-options "-fdelete-null-pointer-checks" }
 
 constexpr int *f4(bool b) {
   if (b) {
diff --git a/gcc/testsuite/g++.dg/cpp1y/new1.C b/gcc/testsuite/g++.dg/cpp1y/new1.C
index b9ad64d..7016951 100644
--- a/gcc/testsuite/g++.dg/cpp1y/new1.C
+++ b/gcc/testsuite/g++.dg/cpp1y/new1.C
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-cddce-details" } */
+// { dg-additional-options "-fdelete-null-pointer-checks" }
 
 #include 
 
diff --git a/gcc/testsuite/g++.dg/cpp1y/new2.C b/gcc/testsuite/g++.dg/cpp1y/new2.C
index 926e796..97f4001 100644
--- a/gcc/testsuite/g++.dg/cpp1y/new2.C
+++ b/gcc/testsuite/g++.dg/cpp1y/new2.C
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -std=c++17 -fdump-tree-cddce-details" } */
+/* { dg-additional-options "-fdelete-null-pointer-checks" } */
 
 #include 
 #include 
diff --git a/gcc/testsuite/g++.dg/cpp2a/constexpr-dynamic11.C b/gcc/testsuite/g++.dg/cpp2a/constexpr-dynamic11.C
index 6069fbf..8dfa03a 100644
--- a/gcc/testsuite/g++.dg/cpp2a/constexpr-dynamic11.C
+++ b/gcc/testsuite/g++.dg/cpp2a/constexpr-dynamic11.C
@@ -1,5 +1,6 @@
 // PR c++/88337 - Implement P1327R1: Allow dynamic_cast/typeid in constexpr.
 // { dg-do compile { target c++2a } }
+// { dg-additional-options "-fdelete-null-pointer-checks" }
 
 // dynamic_cast in a constructor.
 // [class.cdtor]#6: "If the operand of the dynamic_cast refers to the object
diff --git a/gcc/testsuite/g++.dg/cpp2a/constexpr-dynamic17.C b/gcc/testsuite/g++.dg/cpp2a/constexpr-dynamic17.C
index 6b443d2..c574e75 100644
--- a/gcc/testsuite/g++.dg/cpp2a/constexpr-dynamic17.C
+++ b/gcc/testsuite/g++.dg/cpp2a/constexpr-dynamic17.C
@@ -1,5 +1,6 @@
 // PR c++/88337 - Implement P1327R1: 

[PATCH] libstdc++: Always return a sentinel from __gnu_test::test_range::end()

2020-01-21 Thread Patrick Palka
It seems that in practice std::sentinel_for is always true, and so the
test_range container doesn't help us detect bugs in ranges code in which we
wrongly assume that a sentinel can be manipulated like an iterator.  Make the
test_range container more strict by having end() unconditionally return a
sentinel.

Is this OK to commit after bootstrap+regtesting succeeds on x86_64-pc-linux-gnu?

libstdc++-v3/ChangeLog:

* testsuite/util/testsuite_iterators.h (__gnu_test::test_range::end):
Always return a sentinel.
---
 libstdc++-v3/testsuite/util/testsuite_iterators.h | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/libstdc++-v3/testsuite/util/testsuite_iterators.h 
b/libstdc++-v3/testsuite/util/testsuite_iterators.h
index eb15257bf6a..6667a3af93a 100644
--- a/libstdc++-v3/testsuite/util/testsuite_iterators.h
+++ b/libstdc++-v3/testsuite/util/testsuite_iterators.h
@@ -702,10 +702,7 @@ namespace __gnu_test
   auto end() &
   {
using I = decltype(get_iterator(bounds.last));
-   if constexpr (std::sentinel_for)
- return get_iterator(bounds.last);
-   else
- return sentinel{bounds.last};
+   return sentinel{bounds.last};
   }
 
   typename Iter::ContainerType bounds;
-- 
2.25.0.rc0



testsuite: More uses of effective-target march_option for cris

2020-01-21 Thread Hans-Peter Nilsson
gcc/testsuite:
* gcc.target/cris/asm-v8.S, gcc.target/cris/inasm-v8.c,
gcc.target/cris/sync-1.c: Apply effective_target_march_option.

Oops.  A few stragglers, same as recent update: differing
-march=... options is an error, noticed with e.g.
"make check RUNTESTFLAGS=--target_board=cris-sim/arch=v10"

diff --git a/gcc/testsuite/gcc.target/cris/asm-v8.S 
b/gcc/testsuite/gcc.target/cris/asm-v8.S
index 3fba3188454..8946ba916a1 100644
--- a/gcc/testsuite/gcc.target/cris/asm-v8.S
+++ b/gcc/testsuite/gcc.target/cris/asm-v8.S
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "-DOTHER_ISA=8 -march=v8" } */
+/* { dg-options "-DOTHER_ISA=8 -march=v8" { target { ! march_option } } } */
 
 /* Check that -march=v8 is also recognized.  */
 
diff --git a/gcc/testsuite/gcc.target/cris/inasm-v8.c 
b/gcc/testsuite/gcc.target/cris/inasm-v8.c
index b2fb3053c40..f8f15e8cf60 100644
--- a/gcc/testsuite/gcc.target/cris/inasm-v8.c
+++ b/gcc/testsuite/gcc.target/cris/inasm-v8.c
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "-DOTHER_ISA=8 -march=v8" } */
+/* { dg-options "-DOTHER_ISA=8 -march=v8" { target { ! march_option } } } */
 
 /* Check that -march=v8 is also recognized.  */
 
diff --git a/gcc/testsuite/gcc.target/cris/sync-1.c 
b/gcc/testsuite/gcc.target/cris/sync-1.c
index 1bc9a674c3b..bf080174e56 100644
--- a/gcc/testsuite/gcc.target/cris/sync-1.c
+++ b/gcc/testsuite/gcc.target/cris/sync-1.c
@@ -1,6 +1,6 @@
 /* Check that we can assemble both base atomic variants, for v0.  */
 /* { dg-do assemble } */
-/* { dg-options "-O2 -march=v0" } */
+/* { dg-options "-O2 -march=v0" { target { ! march_option } } } */
 
 #ifndef type
 #define type char
-- 
2.11.0

brgds, H-P


[PATCH] RISC-V: Fix rtl checking enabled failure with -msave-restore.

2020-01-21 Thread Jim Wilson
Found with an rtl checking enabled build and check.  This triggered failures
in the gcc.target/riscv/save-restore* tests.  We are using XINT to access an
XWINT value; INTVAL is the preferred solution.

Since existing tests trigger it we don't need a new one.  Tested with
riscv32-elf and riscv64-linux cross builds and checks with no regressions.

Committed.

Jim

gcc/
* config/riscv/riscv-sr.c (riscv_sr_match_prologue): Use INTVAL
instead of XINT.
---
 gcc/config/riscv/riscv-sr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-sr.c b/gcc/config/riscv/riscv-sr.c
index e3180efcbcc..744d0c48c33 100644
--- a/gcc/config/riscv/riscv-sr.c
+++ b/gcc/config/riscv/riscv-sr.c
@@ -115,7 +115,7 @@ riscv_sr_match_prologue (rtx_insn **body)
   && GET_CODE (XVECEXP (PATTERN (insn), 0, 0)) == UNSPEC_VOLATILE
   && (GET_CODE (XVECEXP (XVECEXP (PATTERN (insn), 0, 0), 0, 0))
  == CONST_INT)
-  && XINT (XVECEXP (XVECEXP (PATTERN (insn), 0, 0), 0, 0), 0) == 2)
+  && INTVAL (XVECEXP (XVECEXP (PATTERN (insn), 0, 0), 0, 0)) == 2)
 return insn;
 
   return NULL;
-- 
2.17.1



Re: [patch] contrib: script to create a new vendor branch

2020-01-21 Thread Hans-Peter Nilsson
> From: "Richard Earnshaw (lists)" 
> Date: Tue, 21 Jan 2020 14:36:32 +0100

> Correction, the branch should be named /, so the push 
> should be
> 
> git push vendors/ /
> 
> For example, for the ARM vendor, the push would be
> 
> git push vendors/ARM ARM/
> 
> R.
> 
> > will work as expected.
> > 
> > Run the script as
> > 
> > contrib/git-add-vendor-branch.sh / 
> > 
> > the  space must have previously been set up in the way 
> > git-fetch-vendor.sh expects.
> > 
> >  * git-add-vendor-bransh.sh: New file.

(typo "bransh")

Thanks, this and your previous reply certainly helped!

Any chance of a git-add-user-branch.sh too, while you're at it?
Or maybe it's just the corresponding push command in there,
that's needed.

brgds, H-P


[COMMITTED] analyzer: fix qsort issue with array_region keys (PR 93352)

2020-01-21 Thread David Malcolm
PR analyzer/93352 reports a qsort failure
  "comparator not anti-symmetric: -2147483648, -2147483648)"
within the analyzer on code involving an array access of [0x7fff + 1].

The issue is that array_region (which uses int for keys into known
values in the array) uses subtraction to implement int_cmp for sorting
the keys, which isn't going to work for boundary values.

Potentially a wider type should be used, but for now this patch fixes
the ICE by using explicit comparisons rather than subtraction to
implement the qsort callback.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu;
pushed to master as r10-6127-g4f01e5778689977c9569477947b8062d8d87.

gcc/analyzer/ChangeLog:
PR analyzer/93352
* region-model.cc (int_cmp): Rename to...
(array_region::key_cmp): ...this, using key_t rather than int.
Rewrite in terms of comparisons rather than subtraction to
ensure qsort is anti-symmetric when handling extreme values.
(array_region::walk_for_canonicalization): Update for above
renaming.
* region-model.h (array_region::key_cmp): New decl.

gcc/testsuite/ChangeLog:
PR analyzer/93352
* gcc.dg/analyzer/pr93352.c: New test.
---
 gcc/analyzer/region-model.cc| 19 ---
 gcc/analyzer/region-model.h |  2 ++
 gcc/testsuite/gcc.dg/analyzer/pr93352.c | 12 
 3 files changed, 26 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr93352.c

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 7e995701dce..0bca5c0fd71 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -2389,15 +2389,20 @@ array_region::get_key_for_child_region (region_id 
child_rid, key_t *out) const
   return false;
 }
 
-/* qsort comparator for int.  */
+/* qsort comparator for array_region's keys.  */
 
-static int
-int_cmp (const void *p1, const void *p2)
+int
+array_region::key_cmp (const void *p1, const void *p2)
 {
-  int i1 = *(const int *)p1;
-  int i2 = *(const int *)p2;
+  key_t i1 = *(const key_t *)p1;
+  key_t i2 = *(const key_t *)p2;
 
-  return i1 - i2;
+  if (i1 > i2)
+return 1;
+  else if (i1 < i2)
+return -1;
+  else
+return 0;
 }
 
 /* Implementation of region::walk_for_canonicalization vfunc for
@@ -2414,7 +2419,7 @@ array_region::walk_for_canonicalization (canonicalization 
*c) const
   int key_a = (*iter).first;
   keys.quick_push (key_a);
 }
-  keys.qsort (int_cmp);
+  keys.qsort (key_cmp);
 
   unsigned i;
   int key;
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 6a63a9336b8..1090b96bfaa 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -1375,6 +1375,8 @@ public:
   static key_t key_from_constant (tree cst);
 
  private:
+  static int key_cmp (const void *, const void *);
+
   /* Mapping from tree to child region.  */
   map_t m_map;
 };
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr93352.c 
b/gcc/testsuite/gcc.dg/analyzer/pr93352.c
new file mode 100644
index 000..ccb96d0eaed
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr93352.c
@@ -0,0 +1,12 @@
+/* { dg-additional-options "-Wno-overflow" } */
+
+struct yc {
+  int c0;
+  char di[];
+};
+
+void
+qt (struct yc *ab)
+{
+  ab->di[0x7fff + 1] = ab->di[0];
+}
-- 
2.21.0



tolerate padding in mbstate_t

2020-01-21 Thread Alexandre Oliva


Padding in mbstate_t objects may get the memcmp to fail.
Attempt to avoid the failure with zero initialization.


Regstrapped on x86_64-linux-gnu, and also tested on a platform that used
to fail because of padding in std::mbstate_t.  Ok to install?


for  libstdc++-v3/ChangeLog

* testsuite/27_io/fpos/mbstate_t/1.cc: Zero-init mbstate_t.
---
 testsuite/27_io/fpos/mbstate_t/1.cc |   28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc 
libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
index f92d68f..559bd8d 100644
--- libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
+++ libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
@@ -28,8 +28,24 @@
 void test01()
 {
   typedef std::mbstate_t state_type;
-  state_type state01 = state_type();
-  state_type state02 = state_type();
+  // Use zero-initialization of the underlying memory so that padding
+  // bytes, if any, stand a better chance of comparing the same.
+  // Zero-initialized memory is guaranteed to be a valid initial
+  // state.  This doesn't quite guarantee that any padding bits won't
+  // be overwritten when copying from other instances that haven't
+  // been fully initialized: this data type is compatible with C, so
+  // it is likely plain old data, but it could have a default ctor
+  // that initializes only the relevant fields, whereas copy-ctor and
+  // operator= could be implemented as a full-object memcpy, including
+  // padding bits, rather than fieldwise copying.  However, since
+  // we're comparing two values copied from the same state_type
+  // instance (or was this meant to take one of them from pos02 rather
+  // than both from pos01?), if padding bits are copied, we'll get the
+  // same for both of them, and if they aren't, we'll keep the values
+  // we initialized them with, so this should be good.
+  state_type state[2];
+  std::memset(state, 0, sizeof (state));
+
 
   std::streampos pos01(0);
   std::streampos pos02(0);
@@ -39,13 +55,13 @@ void test01()
   // state_type state();
 
   // XXX Need to have better sanity checking for the mbstate_t type,
-  // or whatever the insantiating type for class fpos happens to be
+  // or whatever the instantiating type for class fpos happens to be
   // for streampos, as things like equality operators and assignment
   // operators, increment and deincrement operators need to be in
   // place.
-  pos01.state(state02);
-  state01 = pos01.state();
-  VERIFY( std::memcmp(&state01, &state02, sizeof(state_type)) == 0 );
+  pos01.state(state[1]);
+  state[0] = pos01.state();
+  VERIFY( std::memcmp(&state[0], &state[1], sizeof(state_type)) == 0 );
 }
 
 int main() 

-- 
Alexandre Oliva, freedom fighter   he/him   https://FSFLA.org/blogs/lxo
Free Software Evangelist   Stallman was right, but he's left :(
GNU Toolchain EngineerFSMatrix: It was he who freed the first of us
FSF & FSFLA board memberThe Savior shall return (true);


Re: Define HAVE_ for math long double functions declared in vxworks headers

2020-01-21 Thread Alexandre Oliva
On Jan  3, 2020, Jonathan Wakely  wrote:

>> +#include 
>> +#ifdef HAVE_IEEEFP_H
>> +# include 
>> +#endif
>> +], [
>> +  void (*f)(void) = (void (*)(void))$1;

> I wondered whether using ($1) here instead of just $1 would give any
> benefit. It would mean that function-like macros are ignored.

The lack of parentheses after a functional macro name is enough to avoid
its use as a macro.  However, the test I used wouldn't avoid aliasing
macros, such as #define foo bar, and it would pass even if the name was
defined as an object rather than as a function.

I've also come up with a way to test, during C compilation, that $1 is a
function rather than anything else.

If the test was in C++, we could presumably use template type resolution
and static asserts to recognize function-typed expressions, though a
possibility of overloading could make it fall apart.

In theory the closer we are to the environment in which the test result
will be used, the more likely we are to get the desired result, but I'm
a little concerned about switching to a C++ test, precisely because of
the math.h header that we might or might not override, and of unexpected
overloading that system math headers might offer if they are C++-ready,
which could then get us wrong test results.  Any thoughts on this
possibility?


Meanwhile, here's what I'm testing now...
It's supposed to be a strict incremental improvement, unless $1-&$1
unexpectedly compiles for some case I couldn't think of.

Ok to install?


reject macros in math decl check

From: Alexandre Oliva 

The C++ headers #undef the functions we are testing for, just in case
they're implemented as macros.  Do that as well, so as to reject
macros of the form #define foo bar, where bar is the actual
implementation, since those wouldn't work in C++ after the #undef.

While at that, tighten the test so that it doesn't take objects of
pointer types as if they were functions.


for  libstdc++-v3/ChangeLog

* crossconfig.m4 (GLIBCXX_CHECK_MATH_DECL): Reject macros and
data objects.
* configure: Rebuild.
---
 configure  |   66 +---
 crossconfig.m4 |   11 -
 2 files changed, 54 insertions(+), 23 deletions(-)

diff --git libstdc++-v3/crossconfig.m4 libstdc++-v3/crossconfig.m4
index 2a0cb04..221b59d 100644
--- libstdc++-v3/crossconfig.m4
+++ libstdc++-v3/crossconfig.m4
@@ -322,6 +322,14 @@ dnl argument 1 is name of function to check
 dnl
 dnl ASSUMES argument is a math function
 dnl
+dnl Test for $1 - &$1: it's well-formed iff $1 names a function,
+dnl because $1's type decays to that of &$1.
+dnl For any object type, even arrays and void*, types won't match.
+dnl
+dnl Undefine $1, so that we don't accept an alias macro, e.g.
+dnl say define foo bar, when bar is the function.  Our C++ headers
+dnl undefine math function names, so such a macro wouldn't do.
+dnl
 dnl GLIBCXX_CHECK_MATH_DECL
 AC_DEFUN([GLIBCXX_CHECK_MATH_DECL], [
   AC_CACHE_CHECK([for $1 declaration],
@@ -333,8 +341,9 @@ AC_DEFUN([GLIBCXX_CHECK_MATH_DECL], [
 #ifdef HAVE_IEEEFP_H
 # include 
 #endif
+#undef $1
 ], [
-  void (*f)(void) = (void (*)(void))$1;
+  (void)($1 - &$1);
 ], [glibcxx_cv_func_$1_use=yes
 ], [glibcxx_cv_func_$1_use=no])])
   if test "x$glibcxx_cv_func_$1_use" = xyes; then


-- 
Alexandre Oliva, freedom fighter   he/him   https://FSFLA.org/blogs/lxo
Free Software Evangelist   Stallman was right, but he's left :(
GNU Toolchain EngineerFSMatrix: It was he who freed the first of us
FSF & FSFLA board memberThe Savior shall return (true);


Re: tolerate padding in mbstate_t

2020-01-21 Thread Jonathan Wakely

On 21/01/20 21:36 -0300, Alexandre Oliva wrote:


Padding in mbstate_t objects may get the memcmp to fail.
Attempt to avoid the failure with zero initialization.


Regstrapped on x86_64-linux-gnu, and also tested on a platform that used
to fail because of padding in std::mbstate_t.  Ok to install?


Under what conditions does this fail? Only for -std=gnu++98 and not
later standards, I assume?

Because since C++11 state_type() does perform zero-initialization of
the whole object (including padding) even if it has a default
constructor.




for  libstdc++-v3/ChangeLog

* testsuite/27_io/fpos/mbstate_t/1.cc: Zero-init mbstate_t.
---
testsuite/27_io/fpos/mbstate_t/1.cc |   28 ++--
1 file changed, 22 insertions(+), 6 deletions(-)

diff --git libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc 
libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
index f92d68f..559bd8d 100644
--- libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
+++ libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
@@ -28,8 +28,24 @@
void test01()
{
  typedef std::mbstate_t state_type;
-  state_type state01 = state_type();
-  state_type state02 = state_type();
+  // Use zero-initialization of the underlying memory so that padding
+  // bytes, if any, stand a better chance of comparing the same.
+  // Zero-initialized memory is guaranteed to be a valid initial
+  // state.  This doesn't quite guarantee that any padding bits won't
+  // be overwritten when copying from other instances that haven't
+  // been fully initialized: this data type is compatible with C, so
+  // it is likely plain old data, but it could have a default ctor
+  // that initializes only the relevant fields, whereas copy-ctor and
+  // operator= could be implemented as a full-object memcpy, including
+  // padding bits, rather than fieldwise copying.  However, since
+  // we're comparing two values copied from the same state_type
+  // instance (or was this meant to take one of them from pos02 rather
+  // than both from pos01?),


I don't think so, that wouldn't work. I think pos02 could just be
removed from the test.



  1   2   >