date:20180118

Re: Go patch committed: Update to Go1.10beta1

2018-01-18 Thread Rainer Orth

Hi Ian,

> On Thu, Jan 11, 2018 at 1:46 AM, Rainer Orth
>  wrote:
>>
>>> On Wed, Jan 10, 2018 at 5:42 AM, Ian Lance Taylor  wrote:

 Whoops, there's a bug on big-endian 32-bit systems.  I'm testing
 https://golang.org/cl/87135.
>>>
>>> Committed as follows.
>>
>> thanks, that fixed quite a lot of the failures.
>>
>> However, many others remain, too many to report here.  I've filed PR
>> go/83787 to capture those.
>
> Thanks.  I found the problem: there is a new function makechan that
> takes a size argument of type int, and the old makechan, that took
> int64, is now makechan64.  Since the size argument was the last one,
> this worked fine except on 32-bit big-endian systems.  Fixed with this
> patch.  Bootstrapped and tested on x86_64-pc-linux-gnu and
> sparc-sun-solaris2.12.  Committed to mainline.

great, thanks for fixing this.

One related question: I mentioned in passing in the PR that four go test
invocations in gotools would hang indefinitely.  I noticed that they
weren't called with -test.timeout, unlike libgo's gotest.  However, in
exceptional cases like this this probably can't be helped short of
adding an external timeout mechanism.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)

2018-01-18 Thread Richard Biener

On Wed, 17 Jan 2018, Jakub Jelinek wrote:

> Hi!
> 
> PR83882 complains that PR81715 testcase fails on callee copies parameter
> targets.  The following patch ought to fix that, but I have only
> bootstrapped/regtested it on x86_64-linux and i686-linux + on the testcase
> with hppa.

Looks reasonable.

> John, do you think you could test this on hppa without the callee copies
> default change?
> 
> Or should we not care anymore if there aren't any similar targets left?

How's that communicated to the middle-end anyways?

Richard.

> 2018-01-17  Jakub Jelinek  
> 
>   PR sanitizer/81715
>   PR testsuite/83882
>   * function.h (gimplify_parameters): Add gimple_seq * argument.
>   * function.c: Include gimple.h and options.h.
>   (gimplify_parameters): Add cleanup argument, add CLOBBER stmts
>   for the added local temporaries if needed.
>   * gimplify.c (gimplify_body): Adjust gimplify_parameters caller,
>   if there are any parameter cleanups, wrap whole body into a
>   try/finally with the cleanups.
> 
> --- gcc/function.h.jj 2018-01-03 10:19:53.858533740 +0100
> +++ gcc/function.h2018-01-16 14:31:21.409972177 +0100
> @@ -607,7 +607,7 @@ extern bool initial_value_entry (int i,
>  extern void instantiate_decl_rtl (rtx x);
>  extern int aggregate_value_p (const_tree, const_tree);
>  extern bool use_register_for_decl (const_tree);
> -extern gimple_seq gimplify_parameters (void);
> +extern gimple_seq gimplify_parameters (gimple_seq *);
>  extern void locate_and_pad_parm (machine_mode, tree, int, int, int,
>tree, struct args_size *,
>struct locate_and_pad_arg_data *);
> --- gcc/function.c.jj 2018-01-12 11:35:48.901222595 +0100
> +++ gcc/function.c2018-01-16 15:13:22.165665047 +0100
> @@ -79,6 +79,8 @@ along with GCC; see the file COPYING3.
>  #include "tree-ssa.h"
>  #include "stringpool.h"
>  #include "attribs.h"
> +#include "gimple.h"
> +#include "options.h"
>  
>  /* So we can assign to cfun in this file.  */
>  #undef cfun
> @@ -3993,7 +3995,7 @@ gimplify_parm_type (tree *tp, int *walk_
> statements to add to the beginning of the function.  */
>  
>  gimple_seq
> -gimplify_parameters (void)
> +gimplify_parameters (gimple_seq *cleanup)
>  {
>struct assign_parm_data_all all;
>tree parm;
> @@ -4058,6 +4060,16 @@ gimplify_parameters (void)
> else if (TREE_CODE (type) == COMPLEX_TYPE
>  || TREE_CODE (type) == VECTOR_TYPE)
>   DECL_GIMPLE_REG_P (local) = 1;
> +
> +   if (!is_gimple_reg (local)
> +   && flag_stack_reuse != SR_NONE)
> + {
> +   tree clobber = build_constructor (type, NULL);
> +   gimple *clobber_stmt;
> +   TREE_THIS_VOLATILE (clobber) = 1;
> +   clobber_stmt = gimple_build_assign (local, clobber);
> +   gimple_seq_add_stmt (cleanup, clobber_stmt);
> + }
>   }
> else
>   {
> --- gcc/gimplify.c.jj 2018-01-16 12:21:15.895859416 +0100
> +++ gcc/gimplify.c2018-01-16 14:41:27.643872081 +0100
> @@ -12589,7 +12589,7 @@ gbind *
>  gimplify_body (tree fndecl, bool do_parms)
>  {
>location_t saved_location = input_location;
> -  gimple_seq parm_stmts, seq;
> +  gimple_seq parm_stmts, parm_cleanup = NULL, seq;
>gimple *outer_stmt;
>gbind *outer_bind;
>struct cgraph_node *cgn;
> @@ -12628,7 +12628,7 @@ gimplify_body (tree fndecl, bool do_parm
>  
>/* Resolve callee-copies.  This has to be done before processing
>   the body so that DECL_VALUE_EXPR gets processed correctly.  */
> -  parm_stmts = do_parms ? gimplify_parameters () : NULL;
> +  parm_stmts = do_parms ? gimplify_parameters (&parm_cleanup) : NULL;
>  
>/* Gimplify the function's body.  */
>seq = NULL;
> @@ -12657,6 +12657,13 @@ gimplify_body (tree fndecl, bool do_parm
>tree parm;
>  
>gimplify_seq_add_seq (&parm_stmts, gimple_bind_body (outer_bind));
> +  if (parm_cleanup)
> + {
> +   gtry *g = gimple_build_try (parm_stmts, parm_cleanup,
> +   GIMPLE_TRY_FINALLY);
> +   parm_stmts = NULL;
> +   gimple_seq_add_stmt (&parm_stmts, g);
> + }
>gimple_bind_set_body (outer_bind, parm_stmts);
>  
>for (parm = DECL_ARGUMENTS (current_function_decl);
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] Avoid creating overflows in match.pd (P + A) - (P + B) POINTER_DIFF_EXPR optimization (PR c/61240)

2018-01-18 Thread Richard Biener

On Wed, 17 Jan 2018, Jakub Jelinek wrote:

> Hi!
> 
> POINTER_DIFF_EXPR returns a signed integer, but sadly POINTER_PLUS_EXPR
> second arguments are unsigned integers, so if we are adding negative numbers
> to pointers, those are very large numbers and we get TREE_OVERFLOW which the
> C FE then during c_fully_fold diagnoses.

Bah...

> We want to treat the numbers as signed integers for this purpose, using
> VCE seems easiest way to get rid of the unwanted overflows.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Or do you prefer manual fold_converts in the (with { }) block + test
> for INTEGER_CST && TREE_OVERFLOW + drop_tree_overflow instead?

I think the patch is ok as-is.  I really hope to make the
POINTER_PLUS_EXPR offset operand signed in next stage1.

Thanks,
Richard.


> 2018-01-17  Jakub Jelinek  
> 
>   PR c/61240
>   * match.pd ((P + A) - P, P - (P + A), (P + A) - (P + B)): For
>   pointer_diff optimizations use view_convert instead of convert.
> 
>   * gcc.dg/pr61240.c: New test.
> 
> --- gcc/match.pd.jj   2018-01-15 10:02:04.0 +0100
> +++ gcc/match.pd  2018-01-17 17:10:54.855061485 +0100
> @@ -1832,7 +1832,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  /* The second argument of pointer_plus must be interpreted as signed, and
> thus sign-extended if necessary.  */
>  (with { tree stype = signed_type_for (TREE_TYPE (@1)); }
> - (convert (convert:stype @1
> + /* Use view_convert instead of convert here, as POINTER_PLUS_EXPR
> + second arg is unsigned even when we need to consider it as signed,
> + we don't want to diagnose overflow here.  */
> + (convert (view_convert:stype @1
>  
>/* (T)P - (T)(P + A) -> -(T) A */
>(simplify
> @@ -1876,7 +1879,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  /* The second argument of pointer_plus must be interpreted as signed, and
> thus sign-extended if necessary.  */
>  (with { tree stype = signed_type_for (TREE_TYPE (@1)); }
> - (negate (convert (convert:stype @1)
> + /* Use view_convert instead of convert here, as POINTER_PLUS_EXPR
> + second arg is unsigned even when we need to consider it as signed,
> + we don't want to diagnose overflow here.  */
> + (negate (convert (view_convert:stype @1)
>  
>/* (T)(P + A) - (T)(P + B) -> (T)A - (T)B */
>(simplify
> @@ -1927,7 +1933,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  /* The second argument of pointer_plus must be interpreted as signed, and
> thus sign-extended if necessary.  */
>  (with { tree stype = signed_type_for (TREE_TYPE (@1)); }
> - (minus (convert (convert:stype @1)) (convert (convert:stype @2)))
> + /* Use view_convert instead of convert here, as POINTER_PLUS_EXPR
> + second arg is unsigned even when we need to consider it as signed,
> + we don't want to diagnose overflow here.  */
> + (minus (convert (view_convert:stype @1))
> + (convert (view_convert:stype @2)))
>  
>  
>  /* Simplifications of MIN_EXPR, MAX_EXPR, fmin() and fmax().  */
> --- gcc/testsuite/gcc.dg/pr61240.c.jj 2018-01-17 17:25:45.821030898 +0100
> +++ gcc/testsuite/gcc.dg/pr61240.c2018-01-17 17:26:24.118029550 +0100
> @@ -0,0 +1,20 @@
> +/* PR c/61240 */
> +/* { dg-do compile } */
> +
> +typedef __PTRDIFF_TYPE__ ptrdiff_t;
> +
> +ptrdiff_t
> +foo (ptrdiff_t a[4])
> +{
> +  int i[4];
> +  int *p = i + 2;
> +  static ptrdiff_t b = p - (p - 1);  /* { dg-bogus "integer overflow in 
> expression" } */
> +  static ptrdiff_t c = (p - 1) - p;  /* { dg-bogus "integer overflow in 
> expression" } */
> +  static ptrdiff_t d = (p - 2) - (p - 1);/* { dg-bogus "integer overflow in 
> expression" } */
> +  static ptrdiff_t e = (p - 1) - (p - 2);/* { dg-bogus "integer overflow in 
> expression" } */
> +  a[0] = p - (p - 1);/* { dg-bogus "integer overflow 
> in expression" } */
> +  a[1] = (p - 1) - p;/* { dg-bogus "integer overflow 
> in expression" } */
> +  a[2] = (p - 2) - (p - 1);  /* { dg-bogus "integer overflow in 
> expression" } */
> +  a[3] = (p - 1) - (p - 2);  /* { dg-bogus "integer overflow in 
> expression" } */
> +  return b + c + d + e;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-18 Thread Woodhouse, David

On Thu, 2018-01-18 at 08:52 +0100, Uros Bizjak wrote:
> This puts an extra burden on the developer, which has to use correct
> thunk name in their code. Sure, this can be solved trivially with
> #ifdef __x86_64__, so the issue is minor, but I thought it has to be
> mentioned before the name is set in stone.

Except the developer can mostly use the inline thunk and not have to
worry their pretty little heads about it. And the kernel developers who
*have* chosen to implement their own thunk (and who requested the
thunk-extern variant in the first place) have explicitly asked for the
symbol name to be as it is.

I spent a happy few hours on Sunday night looking at this. Seriously,
in neither the thunk implementation nor the call sites (in explicit
asm) was it easier. It was horrid — just look at the patch I sent out
with the footnote "let's not do this".

In the thunk I still had the actual register name with the 'e' or the
'r' in a macro argument anyway, because I had to deal with the value
from that register. So the thunk macro ended up taking *two* arguments
because its name now didn't match what it had to call the register.

And in the call sites I was putting the target into a register again
using either 'eax' or 'rax' form, then having to branch to the thunk
using the different '_ax' form. It just hurt.

smime.p7s
Description: S/MIME cryptographic signature

Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)

2018-01-18 Thread Jakub Jelinek

On Thu, Jan 18, 2018 at 09:10:03AM +0100, Richard Biener wrote:
> On Wed, 17 Jan 2018, Jakub Jelinek wrote:
> 
> > Hi!
> > 
> > PR83882 complains that PR81715 testcase fails on callee copies parameter
> > targets.  The following patch ought to fix that, but I have only
> > bootstrapped/regtested it on x86_64-linux and i686-linux + on the testcase
> > with hppa.
> 
> Looks reasonable.
> 
> > John, do you think you could test this on hppa without the callee copies
> > default change?
> > 
> > Or should we not care anymore if there aren't any similar targets left?
> 
> How's that communicated to the middle-end anyways?

You mean the callee copies stuff?  gimplify_parameters creates new
temporaries, assigns the parameter to them, set DECL_VALUE_EXPR on the
parameters to the temporary and at the end of function clears
DECL_VALUE_EXPR and with this patch adds a CLOBBER in a cleanup as well.

Jakub

Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)

2018-01-18 Thread Richard Biener

On Thu, 18 Jan 2018, Jakub Jelinek wrote:

> On Thu, Jan 18, 2018 at 09:10:03AM +0100, Richard Biener wrote:
> > On Wed, 17 Jan 2018, Jakub Jelinek wrote:
> > 
> > > Hi!
> > > 
> > > PR83882 complains that PR81715 testcase fails on callee copies parameter
> > > targets.  The following patch ought to fix that, but I have only
> > > bootstrapped/regtested it on x86_64-linux and i686-linux + on the testcase
> > > with hppa.
> > 
> > Looks reasonable.
> > 
> > > John, do you think you could test this on hppa without the callee copies
> > > default change?
> > > 
> > > Or should we not care anymore if there aren't any similar targets left?
> > 
> > How's that communicated to the middle-end anyways?
> 
> You mean the callee copies stuff?  gimplify_parameters creates new
> temporaries, assigns the parameter to them, set DECL_VALUE_EXPR on the
> parameters to the temporary and at the end of function clears
> DECL_VALUE_EXPR and with this patch adds a CLOBBER in a cleanup as well.

No, I meant whether the target wants this or not.

Richard.

Re: [PATCH] i386: Use const reference of struct ix86_frame to avoid copy

2018-01-18 Thread Uros Bizjak

On Wed, Jan 17, 2018 at 5:00 PM, H.J. Lu  wrote:
> We can use const reference of struct ix86_frame to avoid making a local
> copy of ix86_frame.  ix86_expand_epilogue makes a local copy of struct
> ix86_frame and uses the reg_save_offset field as a local variable.  This
> patch uses a separate local variable for reg_save_offset.
>
> Tested on x86-64 with ada.  OK for trunk?

OK.

Thanks,
Uros.

> H.J.
> --
> PR target/83905
> * config/i386/i386.c (ix86_expand_prologue): Use cost reference
> of struct ix86_frame.
> (ix86_expand_epilogue): Likewise.  Add a local variable for
> the reg_save_offset field in struct ix86_frame.
> ---
>  gcc/config/i386/i386.c | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index a301e18ed70..340eca42449 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -13385,7 +13385,6 @@ ix86_expand_prologue (void)
>  {
>struct machine_function *m = cfun->machine;
>rtx insn, t;
> -  struct ix86_frame frame;
>HOST_WIDE_INT allocate;
>bool int_registers_saved;
>bool sse_registers_saved;
> @@ -13413,7 +13412,7 @@ ix86_expand_prologue (void)
>m->fs.sp_valid = true;
>m->fs.sp_realigned = false;
>
> -  frame = m->frame;
> +  const struct ix86_frame &frame = cfun->machine->frame;
>
>if (!TARGET_64BIT && ix86_function_ms_hook_prologue 
> (current_function_decl))
>  {
> @@ -14291,7 +14290,6 @@ ix86_expand_epilogue (int style)
>  {
>struct machine_function *m = cfun->machine;
>struct machine_frame_state frame_state_save = m->fs;
> -  struct ix86_frame frame;
>bool restore_regs_via_mov;
>bool using_drap;
>bool restore_stub_is_tail = false;
> @@ -14304,7 +14302,7 @@ ix86_expand_epilogue (int style)
>  }
>
>ix86_finalize_stack_frame_flags ();
> -  frame = m->frame;
> +  const struct ix86_frame &frame = cfun->machine->frame;
>
>m->fs.sp_realigned = stack_realign_fp;
>m->fs.sp_valid = stack_realign_fp
> @@ -14348,11 +14346,13 @@ ix86_expand_epilogue (int style)
>   + UNITS_PER_WORD);
>  }
>
> +  HOST_WIDE_INT reg_save_offset = frame.reg_save_offset;
> +
>/* Special care must be taken for the normal return case of a function
>   using eh_return: the eax and edx registers are marked as saved, but
>   not restored along this path.  Adjust the save location to match.  */
>if (crtl->calls_eh_return && style != 2)
> -frame.reg_save_offset -= 2 * UNITS_PER_WORD;
> +reg_save_offset -= 2 * UNITS_PER_WORD;
>
>/* EH_RETURN requires the use of moves to function properly.  */
>if (crtl->calls_eh_return)
> @@ -14368,11 +14368,11 @@ ix86_expand_epilogue (int style)
>else if (TARGET_EPILOGUE_USING_MOVE
>&& cfun->machine->use_fast_prologue_epilogue
>&& (frame.nregs > 1
> -  || m->fs.sp_offset != frame.reg_save_offset))
> +  || m->fs.sp_offset != reg_save_offset))
>  restore_regs_via_mov = true;
>else if (frame_pointer_needed
>&& !frame.nregs
> -  && m->fs.sp_offset != frame.reg_save_offset)
> +  && m->fs.sp_offset != reg_save_offset)
>  restore_regs_via_mov = true;
>else if (frame_pointer_needed
>&& TARGET_USE_LEAVE
> @@ -14440,7 +14440,7 @@ ix86_expand_epilogue (int style)
>rtx t;
>
>if (frame.nregs)
> -   ix86_emit_restore_regs_using_mov (frame.reg_save_offset, style == 2);
> +   ix86_emit_restore_regs_using_mov (reg_save_offset, style == 2);
>
>/* eh_return epilogues need %ecx added to the stack pointer.  */
>if (style == 2)
> @@ -14535,19 +14535,19 @@ ix86_expand_epilogue (int style)
>  in epilogues.  */
>if (!m->fs.sp_valid || m->fs.sp_realigned
>   || (TARGET_SEH
> - && (m->fs.sp_offset - frame.reg_save_offset
> + && (m->fs.sp_offset - reg_save_offset
>   >= SEH_MAX_FRAME_SIZE)))
> {
>   pro_epilogue_adjust_stack (stack_pointer_rtx, 
> hard_frame_pointer_rtx,
>  GEN_INT (m->fs.fp_offset
> - - frame.reg_save_offset),
> + - reg_save_offset),
>  style, false);
> }
> -  else if (m->fs.sp_offset != frame.reg_save_offset)
> +  else if (m->fs.sp_offset != reg_save_offset)
> {
>   pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
>  GEN_INT (m->fs.sp_offset
> - - frame.reg_save_offset),
> + - reg_save_offset),
>  style,
>  m->fs.cfa_reg == stack_pointer_rtx);
> }
> --
> 2.14.3
>
>

Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)

2018-01-18 Thread Jakub Jelinek

On Thu, Jan 18, 2018 at 09:18:21AM +0100, Richard Biener wrote:
> On Thu, 18 Jan 2018, Jakub Jelinek wrote:
> 
> > On Thu, Jan 18, 2018 at 09:10:03AM +0100, Richard Biener wrote:
> > > On Wed, 17 Jan 2018, Jakub Jelinek wrote:
> > > 
> > > > Hi!
> > > > 
> > > > PR83882 complains that PR81715 testcase fails on callee copies parameter
> > > > targets.  The following patch ought to fix that, but I have only
> > > > bootstrapped/regtested it on x86_64-linux and i686-linux + on the 
> > > > testcase
> > > > with hppa.
> > > 
> > > Looks reasonable.
> > > 
> > > > John, do you think you could test this on hppa without the callee copies
> > > > default change?
> > > > 
> > > > Or should we not care anymore if there aren't any similar targets left?
> > > 
> > > How's that communicated to the middle-end anyways?
> > 
> > You mean the callee copies stuff?  gimplify_parameters creates new
> > temporaries, assigns the parameter to them, set DECL_VALUE_EXPR on the
> > parameters to the temporary and at the end of function clears
> > DECL_VALUE_EXPR and with this patch adds a CLOBBER in a cleanup as well.
> 
> No, I meant whether the target wants this or not.

Ah, that.  Through targetm.calls.callee_copies hook.

Jakub

Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)

2018-01-18 Thread Richard Biener

On Thu, 18 Jan 2018, Jakub Jelinek wrote:

> On Thu, Jan 18, 2018 at 09:18:21AM +0100, Richard Biener wrote:
> > On Thu, 18 Jan 2018, Jakub Jelinek wrote:
> > 
> > > On Thu, Jan 18, 2018 at 09:10:03AM +0100, Richard Biener wrote:
> > > > On Wed, 17 Jan 2018, Jakub Jelinek wrote:
> > > > 
> > > > > Hi!
> > > > > 
> > > > > PR83882 complains that PR81715 testcase fails on callee copies 
> > > > > parameter
> > > > > targets.  The following patch ought to fix that, but I have only
> > > > > bootstrapped/regtested it on x86_64-linux and i686-linux + on the 
> > > > > testcase
> > > > > with hppa.
> > > > 
> > > > Looks reasonable.
> > > > 
> > > > > John, do you think you could test this on hppa without the callee 
> > > > > copies
> > > > > default change?
> > > > > 
> > > > > Or should we not care anymore if there aren't any similar targets 
> > > > > left?
> > > > 
> > > > How's that communicated to the middle-end anyways?
> > > 
> > > You mean the callee copies stuff?  gimplify_parameters creates new
> > > temporaries, assigns the parameter to them, set DECL_VALUE_EXPR on the
> > > parameters to the temporary and at the end of function clears
> > > DECL_VALUE_EXPR and with this patch adds a CLOBBER in a cleanup as well.
> > 
> > No, I meant whether the target wants this or not.
> 
> Ah, that.  Through targetm.calls.callee_copies hook.

So there are quite a few more callee copies targets then, notably
unconditionally epiphany, mmix, mn10300 and v850 and conditionally
a few more, for example mips.

Richard.

Re: [PATCH, PR82428] Add __builtin_goacc_{gang,worker,vector}_{id,size}

2018-01-18 Thread Tom de Vries


On 01/17/2018 06:51 PM, Jakub Jelinek wrote:

On Wed, Jan 17, 2018 at 06:42:33PM +0100, Tom de Vries wrote:

+static rtx
+expand_builtin_goacc_parlevel_id_size (tree exp, rtx target, int ignore)
+{
+  tree fndecl = get_callee_fndecl (exp);
+
+  const char *name;
+  switch (DECL_FUNCTION_CODE (fndecl))
+{
+case BUILT_IN_GOACC_PARLEVEL_ID:
+  name = "__builtin_goacc_parlevel_id";
+  break;
+case BUILT_IN_GOACC_PARLEVEL_SIZE:
+  name = "__builtin_goacc_parlevel_size";
+  break;


Can you avoid that many switches on DECL_FUNCTION_CODE?
Like initialize in this one not just name, but also the
fallback_retval and gen_fn variables and just use them later?


+default:
+  gcc_unreachable ();
+}
+
+  if (oacc_get_fn_attrib (current_function_decl) == NULL_TREE)
+{
+  error ("%s only supported in openacc code", name);


OpenACC ?


+  return const0_rtx;
+}
+
+  tree arg = CALL_EXPR_ARG (exp, 0);
+  if (TREE_CODE (arg) != INTEGER_CST)
+{
+  error ("non-constant argument 0 to %s", name);


%qs instead of %s, 2 times.


+  return const0_rtx;
+}
+
+  int dim = TREE_INT_CST_LOW (arg);
+  switch (dim)
+{
+case GOMP_DIM_GANG:
+case GOMP_DIM_WORKER:
+case GOMP_DIM_VECTOR:
+  break;
+default:
+  error ("illegal argument 0 to %s", name);
+  return const0_rtx;
+}
+
+  if (ignore)
+return target;
+
+  if (!targetm.have_oacc_dim_size ())
+{
+  rtx retval;
+  switch (DECL_FUNCTION_CODE (fndecl))
+   {
+   case BUILT_IN_GOACC_PARLEVEL_ID:
+ retval = const0_rtx;
+ break;
+   case BUILT_IN_GOACC_PARLEVEL_SIZE:
+ retval = GEN_INT (1);


In addition to moving these assignments to a single switch,
this one can be fallback_retval = const1_rtx;

Otherwise LGTM for stage1.


Followed up on the comments above.

Retested as before.

Will commit as attached in stage 1.

Thanks,
- Tom
Add __builtin_goacc_parlevel_{id,size}

2018-01-06  Tom de Vries  

	PR libgomp/82428
	* builtins.def (DEF_GOACC_BUILTIN_ONLY): Define.
	* omp-builtins.def (BUILT_IN_GOACC_PARLEVEL_ID)
	(BUILT_IN_GOACC_PARLEVEL_SIZE): New builtin.
	* builtins.c (expand_builtin_goacc_parlevel_id_size): New function.
	(expand_builtin): Call expand_builtin_goacc_parlevel_id_size.
	* doc/extend.texi (Other Builtins): Add __builtin_goacc_parlevel_id and
	__builtin_goacc_parlevel_size.

	* f95-lang.c (DEF_GOACC_BUILTIN_ONLY): Define.

	* c-c++-common/goacc/builtin-goacc-parlevel-id-size-2.c: New test.
	* c-c++-common/goacc/builtin-goacc-parlevel-id-size.c: New test.

	* testsuite/libgomp.oacc-c-c++-common/gang-static-2.c: Use
	__builtin_goacc_parlevel_{id,size}.
	* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-g-2.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-v-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/routine-g-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/routine-gwv-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/routine-v-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/routine-wv-1.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/tile-1.c: Same.

---
 gcc/builtins.c | 71 +++
 gcc/builtins.def   |  4 ++
 gcc/doc/extend.texi| 10 +++
 gcc/fortran/f95-lang.c |  4 ++
 gcc/omp-builtins.def   |  5 ++
 .../goacc/builtin-goacc-parlevel-id-size-2.c   | 37 ++
 .../goacc/builtin-goacc-parlevel-id-size.c | 79 ++
 .../libgomp.oacc-c-c++-common/gang-static-2.c  | 21 +++---
 .../libgomp.oacc-c-c++-common/loop-auto-1.c| 18 ++---
 .../libgomp.oacc-c-c++-common/loop-dim-default.c   | 14 ++--
 .../testsuite/libgomp.oacc-c-c++-common/loop-g-1.c | 17 ++---
 .../testsuite/libgomp.oacc-c-c++-common/loop-g-2.c | 16 ++---
 .../libgomp.oacc-c-c++-common/loop-gwv-1.c | 17 +++--
 .../libgomp.oacc-c-c++-common/loop-red-g-1.c   | 16 ++---
 .../libgomp.oacc-c

Re: [PATCH][arm] Fix gcc.target/arm/pr40887.c directives

2018-01-18 Thread Christophe Lyon

Hi Kyrill,


On 17 January 2018 at 12:20, Kyrill  Tkachov
 wrote:
> Hi all,
>
> This patch converts gcc.target/arm/pr40887.c to use the proper effective
> target check and dg-add-options for armv5te
> so that we avoid situations where we end up trying to compile the test with
> a Thumb1 hard-float ABI, which makes the
> compiler complain.
>
> This allows the test to pass gracefully for me for my compiler configured
> with:
> --with-cpu=cortex-a15 --with-fpu=neon-vfpv4 --with-float=hard
> --with-mode=thumb
>
> Committing to trunk.
>
> Thanks,
> Kyrill
>
> 2018-01-17  Kyrylo Tkachov  
>
> * gcc.target/arm/pr40887.c: Add armv5te effective target checks and
> directives.

I think you had to swap dg-options and dg-add-options directives, so I
committed the attached patch as obvious.

Christophe
2018-01-18  Christophe Lyon  

* gcc.target/arm/pr40887.c: Fix dg-options and dg-add-options
order.

Index: gcc/testsuite/gcc.target/arm/pr40887.c
===
--- gcc/testsuite/gcc.target/arm/pr40887.c  (revision 256838)
+++ gcc/testsuite/gcc.target/arm/pr40887.c  (revision 256839)
@@ -1,8 +1,8 @@
 /* { dg-do compile } */
 /* { dg-skip-if "need at least armv5" { *-*-* } { "-march=armv[234]*" } { "" } 
} */
 /* { dg-require-effective-target arm_arch_v5te_ok } */
+/* { dg-options "-O2" }  */
 /* { dg-add-options arm_arch_v5te } */
-/* { dg-options "-O2" }  */
 /* { dg-final { scan-assembler "blx" } } */
 
 int (*indirect_func)(int x);

[PATCH][arm] PR target/65578: Fix builtin-bswap16-1.c and builtin-bswap-1.c

2018-01-18 Thread Kyrill Tkachov


Hi all,

The builtin-bswap-1.c and builtin-bswap16-1.c are pretty annoying at the moment.
They force an explicit armv6 option that is a thumb1 target, so if you're 
testing a toolchain
configured with something like --with-cpu=cortex-a15 --with-float=hard 
--with-mode=thumb
you'll get those pesky errors about Thumb1 hard-float not being implemented, 
even though
the tests don't relate to floating-point functionality at all. I *think* this 
is also due
to the wrong order of dg-options and dg-require-effective-target directives 
that might
end up not doing a proper effective target check.

The solution in this patch is to commonise the code and create a couple of 
tests for each.
One tests an armv6t2 target. This allows us to test an ARM or a Thumb2 target.
The second one sets an armv6-m target, which is a Thumb1 target.
The dg-add-options machinery for arm_arch_v6m knows how to add the right 
-mfloat-abi=soft option.

With this patch we end up testing all of ARM, Thumb1, Thumb2 codegen whereas 
before we only
ever tried testing Thumb1, if the multilib options happened to line up just 
right, and would
give an ugly error otherwise.  Now, if the multilib options don't allow the 
test it should just
appear as UNSUPPORTED.

Committing to trunk.

Thanks,
Kyrill

2018-01-18  Kyrylo Tkachov  

PR target/65578
* gcc.target/arm/builtin-bswap.x: New file.
* gcc.target/arm/builtin-bswap-1.c: Include the above.  Add checks
and options for armv6t2.
* gcc.target/arm/builtin-bswap-2.c: Include the above.  Add checks
and options for Thumb1.
* gcc.target/arm/builtin-bswap16.x: New file.
* gcc.target/arm/builtin-bswap16-1.c: Include the above.  Add checks
and options for armv6t2.
* gcc.target/arm/builtin-bswap16-2.c: Include the above.  Add checks
and options for Thumb1.
diff --git a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
index 43195bd8235b492a3c4783d88bf744ac92ee9f68..c1e7740d14d3ca4e93a71e38b12f82c19791a204 100644
--- a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
+++ b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c
@@ -1,7 +1,9 @@
 /* { dg-do compile } */
-/* { dg-options "-O2" } */
-/* { dg-require-effective-target arm_arch_v6_ok } */
-/* { dg-add-options arm_arch_v6 } */
+/* { dg-require-effective-target arm_arch_v6t2_ok } */
+/* { dg-add-options arm_arch_v6t2 } */
+/* This test depends on if-conversion creating the conditional forms of
+   of the instructions.  Add an -mtune option known to facilitate that.  */
+/* { dg-additional-options "-O2 -mtune=cortex-a53" } */
 /* { dg-final { scan-assembler-not "orr\[ \t\]" } } */
 /* { dg-final { scan-assembler-times "revsh\\t" 1 { target { arm_nothumb } } } }  */
 /* { dg-final { scan-assembler-times "revshne\\t" 1 { target { arm_nothumb } } } }  */
@@ -13,69 +15,4 @@
 /* { dg-final { scan-assembler-times "revne\\t" 2 { target { arm_nothumb } } } }  */
 /* { dg-final { scan-assembler-times "rev\\t" 4 { target { ! arm_nothumb } } } }  */
 
-/* revsh */
-short swaps16 (short x)
-{
-  return __builtin_bswap16 (x);
-}
-
-extern short foos16 (short);
-
-/* revshne */
-short swaps16_cond (short x, int y)
-{
-  short z = x;
-  if (y)
-z = __builtin_bswap16 (x);
-  return foos16 (z);
-}
-
-/* rev16 */
-unsigned short swapu16 (unsigned short x)
-{
-  return __builtin_bswap16 (x);
-}
-
-extern unsigned short foou16 (unsigned short);
-
-/* rev16ne */
-unsigned short swapu16_cond (unsigned short x, int y)
-{
-  unsigned short z = x;
-  if (y)
-z = __builtin_bswap16 (x);
-  return foou16 (z);
-}
-
-/* rev */
-int swaps32 (int x) {
-  return __builtin_bswap32 (x);
-}
-
-extern int foos32 (int);
-
-/* revne */
-int swaps32_cond (int x, int y)
-{
-  int z = x;
-  if (y)
-z = __builtin_bswap32 (x);
-  return foos32 (z);
-}
-
-/* rev */
-unsigned int swapu32 (unsigned int x)
-{
-  return __builtin_bswap32 (x);
-}
-
-extern unsigned int foou32 (unsigned int);
-
-/* revne */
-unsigned int swapsu2 (unsigned int x, int y)
-{
-  int z = x;
-  if (y)
-z = __builtin_bswap32 (x);
-  return foou32 (z);
-}
+#include "builtin-bswap.x"
diff --git a/gcc/testsuite/gcc.target/arm/builtin-bswap-2.c b/gcc/testsuite/gcc.target/arm/builtin-bswap-2.c
new file mode 100644
index ..4ba79aa322dc72b521dbff7e5f5ac3f77767988c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/builtin-bswap-2.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v6m_ok } */
+/* { dg-add-options arm_arch_v6m } */
+/* { dg-additional-options "-O2" } */
+/* { dg-final { scan-assembler-not "orr\[ \t\]" } } */
+/* { dg-final { scan-assembler-times "revsh\\t" 2  } }  */
+/* { dg-final { scan-assembler-times "rev16\\t" 2 } }  */
+/* { dg-final { scan-assembler-times "rev\\t" 4 } }  */
+
+#include "builtin-bswap.x"
diff --git a/gcc/testsuite/gcc.target/arm/builtin-bswap.x b/gcc/testsuite/gcc.target/arm/builtin-bswap.x
new file mode 100644
index 0

Re: [PATCH][arm] Fix gcc.target/arm/pr40887.c directives

2018-01-18 Thread Kyrill Tkachov



On 18/01/18 09:15, Christophe Lyon wrote:

Hi Kyrill,


On 17 January 2018 at 12:20, Kyrill  Tkachov
 wrote:

Hi all,

This patch converts gcc.target/arm/pr40887.c to use the proper effective
target check and dg-add-options for armv5te
so that we avoid situations where we end up trying to compile the test with
a Thumb1 hard-float ABI, which makes the
compiler complain.

This allows the test to pass gracefully for me for my compiler configured
with:
--with-cpu=cortex-a15 --with-fpu=neon-vfpv4 --with-float=hard
--with-mode=thumb

Committing to trunk.

Thanks,
Kyrill

2018-01-17  Kyrylo Tkachov  

 * gcc.target/arm/pr40887.c: Add armv5te effective target checks and
 directives.

I think you had to swap dg-options and dg-add-options directives, so I
committed the attached patch as obvious.


Thanks Christophe,
I had spotted it and was planning to fix it by using dg-additional-options,
but it's the same in the end.

Kyrill


Christophe

Re: [PATCH][arm] Convert gcc.target/arm/stl-cond.c into an RTL test

2018-01-18 Thread Christophe Lyon

Hi Kyrill,


On 17 January 2018 at 12:27, Kyrill  Tkachov
 wrote:
> Hi all,
>
> This is an awkward testsuite failure. The original bug was that we were
> failing to put out
> the conditional code in the conditional form of the STL instruction (oops!).
> So we wanted to output STLNE, but instead output STL.
> The testacase relies on if-conversion to conditionalise the insn for STL.
> However, ever since r251643 the expansion of a non-relaxed atomic store
> always includes a compiler barrier. That blocks if-conversion in all cases.
>
> So there's no easy to get to a conditional STL instruction from a C program.
> But we do want to test for the original bug fix that if the RTL insn for STL
> is conditionalised
> it should output the conditional code.
>
> The solution in this patch is to convert the test into an RTL test with the
> COND_EXEC form
> of the STL insn and scan the assembly output there.
> This seems to work fine, and gives us an opportunity to create a
> gcc.dg/rtl/arm directory
> in the RTL tests.
>
> This now makes the gcc.target/arm/stl-cond.c disappear (as the test is
> deleted) and
> the new test in gcc.dg/rtl/arm/stl-cond.c passes.
>
> Committing to trunk.
> Thanks,
> Kyrill
>
> 2018-01-17  Kyrylo Tkachov  
>
> * gcc.dg/rtl/arm/stl-cond.c: New test.
> * gcc.target/arm/stl-cond.c: Delete.

I've noticed that the new test is unsupported on armeb, is this intentional?

Christophe

Re: [PATCH][arm] Convert gcc.target/arm/stl-cond.c into an RTL test

2018-01-18 Thread Kyrill Tkachov


Hi Christophe,

On 18/01/18 09:25, Christophe Lyon wrote:

Hi Kyrill,


On 17 January 2018 at 12:27, Kyrill  Tkachov
 wrote:

Hi all,

This is an awkward testsuite failure. The original bug was that we were
failing to put out
the conditional code in the conditional form of the STL instruction (oops!).
So we wanted to output STLNE, but instead output STL.
The testacase relies on if-conversion to conditionalise the insn for STL.
However, ever since r251643 the expansion of a non-relaxed atomic store
always includes a compiler barrier. That blocks if-conversion in all cases.

So there's no easy to get to a conditional STL instruction from a C program.
But we do want to test for the original bug fix that if the RTL insn for STL
is conditionalised
it should output the conditional code.

The solution in this patch is to convert the test into an RTL test with the
COND_EXEC form
of the STL insn and scan the assembly output there.
This seems to work fine, and gives us an opportunity to create a
gcc.dg/rtl/arm directory
in the RTL tests.

This now makes the gcc.target/arm/stl-cond.c disappear (as the test is
deleted) and
the new test in gcc.dg/rtl/arm/stl-cond.c passes.

Committing to trunk.
Thanks,
Kyrill

2018-01-17  Kyrylo Tkachov  

 * gcc.dg/rtl/arm/stl-cond.c: New test.
 * gcc.target/arm/stl-cond.c: Delete.

I've noticed that the new test is unsupported on armeb, is this intentional?


I think it can PASS on armeb. I guess it's just a matter of changing the target 
in
/* { dg-do compile { target arm-*-* } } */
to arm*-*-*.

Such a patch is pre-approved.
Kyrill


Christophe

Re: [PATCH] PR82964: Fix 128-bit immediate ICEs

2018-01-18 Thread Christophe Lyon

Hi Wilco,


On 17 January 2018 at 17:22, Wilco Dijkstra  wrote:
> James Greenhalgh wrote:
>
>> -  /* Do not allow wide int constants - this requires support in movti.  */
>> +  /* Only allow simple 128-bit immediates.  */
>>if (CONST_WIDE_INT_P (x))
>> -return false;
>> +return aarch64_mov128_immediate (x);
>
>> I can see why this could be correct, but it is unclear why it is neccessary
>> to fix the bug. What goes wrong if we leave this as "return false".
>
> It's not necessary, things only go wrong if you return true for a wider set of
> immediates than those directly supported by the movti pattern - and that may
> be a regalloc issue.
>
> However removing it (returning false in all cases) actually improves code 
> quality
> due to a bug in memset expansion. Therefore I'll commit it as returning false
> for now (there was no change in test results) and update it once memset is 
> fixed
> and inlining works as expected.
>
> Returning true means memset(p, 32, 63) expands as:
>
> mov x2, 2314885530818453536
> mov x3, 2314885530818453536
> mov x6, 2314885530818453536
> mov w5, 538976288
> mov w4, 8224
> mov w1, 32
> stp x2, x3, [x0]
> stp x2, x3, [x0, 16]
> stp x2, x3, [x0, 32]
> str x6, [x0, 48]
> str w5, [x0, 56]
> strhw4, [x0, 60]
> strbw1, [x0, 62]
> ret
>
> Wilco
>
>

After this patch (r256800), I have noticed new failures on aarch64:
gcc.target/aarch64/f16_mov_immediate_1.c scan-assembler-times
mov\tw[0-9]+, #?19520 3 (found 0 times)
gcc.target/aarch64/f16_mov_immediate_1.c scan-assembler-times
movi\tv[0-9]+\\.2s, 0x4c, lsl 8 1 (found 0 times)
gcc.target/aarch64/f16_mov_immediate_1.c scan-assembler-times
movi\tv[0-9]+\\.2s, 0xbc, lsl 8 1 (found 0 times)
gcc.target/aarch64/f16_mov_immediate_2.c scan-assembler-times
fmov\th[0-9], w[0-9]+ 1 (found 0 times)
gcc.target/aarch64/f16_mov_immediate_2.c scan-assembler-times
mov\tw[0-9]+, 19520 1 (found 0 times)
gcc.target/aarch64/f16_mov_immediate_2.c scan-assembler-times
movi\tv[0-9]+\\.2s, 0x5c, lsl 8 1 (found 0 times)
gcc.target/aarch64/f16_mov_immediate_2.c scan-assembler-times
movi\tv[0-9]+\\.2s, 0x7c, lsl 8 1 (found 0 times)
gcc.target/aarch64/f16_mov_immediate_2.c scan-assembler-times
movi\tv[0-9]+\\.2s, 0x80, lsl 8 1 (found 0 times)
gcc.target/aarch64/f16_mov_immediate_2.c scan-assembler-times
movi\tv[0-9]+\\.4h, ?#0 1 (found 0 times)

Christophe

Re: [PATCH] lto, testsuite: Fix ICE in -Wodr (PR lto/83121)

2018-01-18 Thread Christophe Lyon

Hi David,


On 15 January 2018 at 11:09, Richard Biener  wrote:
> On Mon, Jan 8, 2018 at 8:36 PM, David Malcolm  wrote:
>> On Sat, 2018-01-06 at 08:44 +0100, Richard Biener wrote:
>>> On January 5, 2018 11:55:11 PM GMT+01:00, David Malcolm >> hat.com> wrote:
>>> > On Fri, 2018-01-05 at 10:36 +0100, Richard Biener wrote:
>>> > > On Thu, Jan 4, 2018 at 10:52 PM, David Malcolm >> > > om>
>>> > > wrote:
>>> > > > PR lto/83121 reports an ICE deep inside the linemap code when
>>> > > > -Wodr
>>> > > > reports on a type mismatch.
>>> > > >
>>> > > > The root cause is that the warning can access the
>>> > > > DECL_SOURCE_LOCATION
>>> > > > of a streamed-in decl before the lto_location_cache has been
>>> > > > applied.
>>> > > >
>>> > > > lto_location_cache::input_location stores
>>> > > > RESERVED_LOCATION_COUNT
>>> > > > (==2)
>>> > > > as a poison value until the cache is applied:
>>> > > > 250   /* Keep value RESERVED_LOCATION_COUNT in *loc as
>>> > > > linemap
>>> > > > lookups will
>>> > > > 251  ICE on it.  */
>>> > > >
>>> > > > The fix is relatively simple: apply the cache before reading
>>> > > > the
>>> > > > DECL_SOURCE_LOCATION.
>>> > > >
>>> > > > (I wonder if we should instead have a INVALID_LOCATION value to
>>> > > > handle
>>> > > > this case more explicitly?  e.g. 0x?  or reserve 2 in
>>> > > > libcpp for
>>> > > > that purpose, and have the non-reserved locations start at
>>> > > > 3?  Either
>>> > > > would be more invasive, though)
>>> > > >
>>> > > > Triggering the ICE was fiddly: it seems to be affected by many
>>> > > > things,
>>> > > > including the order of files, and (I think) by filenames.  My
>>> > > > theory is
>>> > > > that it's affected by the ordering of the tree nodes in the LTO
>>> > > > stream:
>>> > > > for the ICE to occur, the types in question need to be compared
>>> > > > before
>>> > > > some other operation flushes the lto_location_cache.  This
>>> > > > ordering
>>> > > > is affected by the hash-based ordering in DFS in lto-streamer-
>>> > > > out.c, which
>>> > > > might explain why r255066 seemed to trigger the bug; the only
>>> > > > relevant
>>> > > > change to LTO there seemed to be:
>>> > > >   * lto-streamer-out.c (hash_tree): Hash TYPE_EMPTY_P and
>>> > > > DECL_PADDING_P.
>>> > > > If so, then the bug was presumably already present, but hidden.
>>> > > >
>>> > > > The patch also adds regression test coverage for the ICE, which
>>> > > > is
>>> > > > more
>>> > > > involved - as far as I can tell, we don't have an existing way
>>> > > > to
>>> > > > verify
>>> > > > diagnostics emitted during link-time optimization.
>>> > > >
>>> > > > Hence the patch adds some machinery to lib/lto.exp to support
>>> > > > two
>>> > > > new
>>> > > > directives: dg-lto-warning and dg-lto-message, corresponding to
>>> > > > dg-warning and dg-message respectively, where the diagnostics
>>> > > > are
>>> > > > expected to be emitted at link-time.
>>> > > >
>>> > > > The test case includes examples of LTO warnings and notes in
>>> > > > both
>>> > > > the
>>> > > > primary and secondary source files
>>> > > >
>>> > > > Doing so required reusing the logic from DejaGnu for handling
>>> > > > diagnostics.
>>> > > > Unfortunately the pertinent code is a 50 line loop within a
>>> > > > ~200
>>> > > > line Tcl
>>> > > > function in dg.exp (dg-test), so I had to copy it from DejaGnu,
>>> > > > making
>>> > > > various changes as necessary (see
>>> > > > lto_handle_diagnostics_for_file
>>> > > > in the
>>> > > > patch; for example the LTO version supports multiple source
>>> > > > files,
>>> > > > identifying which source file emitted a diagnostic).
>>> > > >
>>> > > > For non-LTO diagnostics we currently ignore surplus "note"
>>> > > > diagnostics.
>>> > > > This patch updates lto_prune_warns to follow this behavior
>>> > > > (since
>>> > > > otherwise we'd need numerous dg-lto-message directives for the
>>> > > > motivating
>>> > > > test case).
>>> > > >
>>> > > > The patch adds these PASS results to g++.sum:
>>> > > >
>>> > > > PASS: g++.dg/lto/pr83121 cp_lto_pr83121_0.o assemble, -O0 -flto
>>> > > > PASS: g++.dg/lto/pr83121 cp_lto_pr83121_1.o assemble, -O0 -flto
>>> > > > PASS: g++.dg/lto/pr83121  (test for LTO warnings, pr83121_0.C
>>> > > > line
>>> > > > 6)
>>> > > > PASS: g++.dg/lto/pr83121  (test for LTO warnings, pr83121_0.C
>>> > > > line
>>> > > > 8)
>>> > > > PASS: g++.dg/lto/pr83121  (test for LTO warnings, pr83121_1.C
>>> > > > line
>>> > > > 2)
>>> > > > PASS: g++.dg/lto/pr83121  (test for LTO warnings, pr83121_1.C
>>> > > > line
>>> > > > 3)
>>> > > > PASS: g++.dg/lto/pr83121 cp_lto_pr83121_0.o-cp_lto_pr83121_1.o
>>> > > > link, -O0 -flto
>>> > > >
>>> > > > The output for dg-lto-message above refers to "warnings",
>>> > > > rather
>>> > > > than
>>> > > > "messages" but that's the same as for the non-LTO case, where
>>> > > > dg-
>>> > > > message
>>> > > > also refers to "warnings".
>>> > > >
>>> > > > Successfully bootstrapped®rtested on x86_64-pc-l

Re: [PR c++/83160] local ref to capture

2018-01-18 Thread Nathan Sidwell


On 01/17/2018 01:44 PM, Jason Merrill wrote:


This makes sense to me.  But I think we'd want also that flag set on
the ck_identity inside the ck_base that direct_reference_binding
creates, so setting it first rather than in an else.


Ah yes, you spotted the bit I failed to mention.  (I think we get away 
with it because class objects always have sufficient lvalueness, but why 
lie?)


nathan

--
Nathan Sidwell
2018-01-18  Nathan Sidwell  

	PR c++/83160
	* cp-tree.h (mark_use): Declare.
	* expr.c (mark_use): Make extern.
	* call.c (direct_reference_binding): Set inner conv's
	rvaluedness_matches_p, if it is an identity.
	(convert_like_real): Mark lvalue or rvalue use for identity as
	rvaledness_matches_p demands.

	PR c++/83160
	* g++.dg/cpp0x/pr83160.C: New.

Index: cp/call.c
===
--- cp/call.c	(revision 256841)
+++ cp/call.c	(working copy)
@@ -102,7 +102,8 @@ struct conversion {
  being bound to an lvalue expression or an rvalue reference is
  being bound to an rvalue expression.  If KIND is ck_rvalue,
  true when we are treating an lvalue as an rvalue (12.8p33).  If
- KIND is ck_base, always false.  */
+ KIND is ck_base, always false.  If ck_identity, we will be
+ binding a reference directly.  */
   BOOL_BITFIELD rvaluedness_matches_p: 1;
   BOOL_BITFIELD check_narrowing: 1;
   /* The type of the expression resulting from the conversion.  */
@@ -1476,6 +1477,10 @@ direct_reference_binding (tree type, con
 
   t = TREE_TYPE (type);
 
+  if (conv->kind == ck_identity)
+/* Mark the identity conv as to not decay to rvalue.  */
+conv->rvaluedness_matches_p = true;
+
   /* [over.ics.rank]
 
  When a parameter of reference type binds directly
@@ -1501,6 +1506,7 @@ direct_reference_binding (tree type, con
 	 That way, convert_like knows not to generate a temporary.  */
   conv->need_temporary_p = false;
 }
+
   return build_conv (ck_ref_bind, type, conv);
 }
 
@@ -6800,7 +6806,9 @@ convert_like_real (conversion *convs, tr
 	  else
 	gcc_unreachable ();
 	}
-  expr = mark_rvalue_use (expr);
+  expr = mark_use (expr, /*rvalue_p=*/!convs->rvaluedness_matches_p,
+		   /*read_p=*/true, UNKNOWN_LOCATION,
+		   /*reject_builtin=*/true);
 
   if (type_unknown_p (expr))
 	expr = instantiate_type (totype, expr, complain);
Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 256841)
+++ cp/cp-tree.h	(working copy)
@@ -6328,6 +6328,9 @@ extern tree create_try_catch_expr
 
 /* in expr.c */
 extern tree cplus_expand_constant		(tree);
+extern tree mark_use (tree expr, bool rvalue_p, bool read_p,
+		  location_t = UNKNOWN_LOCATION,
+		  bool reject_builtin = true);
 extern tree mark_rvalue_use			(tree,
  location_t = UNKNOWN_LOCATION,
  bool reject_builtin = true);
Index: cp/expr.c
===
--- cp/expr.c	(revision 256841)
+++ cp/expr.c	(working copy)
@@ -89,7 +89,7 @@ cplus_expand_constant (tree cst)
 /* We've seen an actual use of EXPR.  Possibly replace an outer variable
reference inside with its constant value or a lambda capture.  */
 
-static tree
+tree
 mark_use (tree expr, bool rvalue_p, bool read_p,
 	  location_t loc /* = UNKNOWN_LOCATION */,
 	  bool reject_builtin /* = true */)
Index: testsuite/g++.dg/cpp0x/pr83160.C
===
--- testsuite/g++.dg/cpp0x/pr83160.C	(revision 0)
+++ testsuite/g++.dg/cpp0x/pr83160.C	(working copy)
@@ -0,0 +1,33 @@
+// { dg-do run { target c++11 } }
+// PR c++/83160 failed to capture as lvalue
+
+int main ()
+{
+  const int a = 0;
+
+  if (![&a] (const int *p)
+  {
+	const int &b = a;
+	// We should bind to the outer a
+	return &b == p;
+  } (&a))
+return 1;
+
+  if (![&] (const int *p)
+  {
+	const int &b = a;
+	// We should bind to the outer a
+	return &b == p;
+  } (&a))
+return 2;
+
+  if ([=] (const int *p)
+  {
+	const int &b = a;
+	// We should bind to the captured instance
+	return &b == p;
+  }(&a))
+return 3;
+
+  return 0;
+}

[COMMITTED] Add myself to MAINTAINERS (write after approval)

2018-01-18 Thread Boris Kolpackov

ChangeLog:

2018-01-18  Boris Kolpackov  

* MAINTAINERS (write after approval): Add myself.

--- MAINTAINERS (revision 256843)
+++ MAINTAINERS (working copy)
@@ -454,6 +454,7 @@
 Michael Koch   
 Nicolas Koenig 
 Kaz Kojima 
+Boris Kolpackov 
 Dave Korn  
 Julia Koval
 Matt Kraai

Re: [PATCH 1/3] [builtins] Generic support for __builtin_speculation_safe_load()

2018-01-18 Thread Richard Biener

On Wed, Jan 17, 2018 at 3:55 PM, Richard Earnshaw
 wrote:
>
> This patch adds generic support for the new builtin
> __builtin_speculation_safe_load.  It provides the overloading of the
> different access sizes and a default fall-back expansion for targets
> that do not support a mechanism for inhibiting speculation.

+  if (ignore)
+{
+  warning_at (input_location, 0,
+ "result of __builtin_speculation_safe_load must be used to "
+ "ensure correct operation");
+  target = NULL;
+}

This warning cannot be disabled but these kind of cases could appear via
path isolation or missed optimizations since memory optimizations do not
recognize such speculative loads.  So - should it not be emitted way
earlier instead, like from the FEs or during some early warning pass?

In which case it could be an error as well, no?

+  set_mem_align (mem, MAX (GET_MODE_ALIGNMENT (mode),
+  get_pointer_alignment (arg0)));

err...  so there's no way to do an unaligned speculation safe load?  Using
just get_pointer_alignment would be safe here.   Other builtins/IFNs explicitely
pass down the alignment and FEs generate such alignment from their
language constraints.

+  set_mem_alias_set (mem, get_alias_set (TREE_TYPE (TREE_TYPE (arg0;

sorry, but pointer types are arbitrary in GIMPLE so this is clearly
wrong.  Without
more information you have to use zero.  Thus,

__builtin_speculation_safe_load ((int *)p, ...);

will do the wrong thing if the type of p is not int *.

+  /* Mark the memory access as volatile.  We don't want the optimizers to
+ move it or otherwise substitue an alternative value.  */
+  MEM_VOLATILE_P (mem) = 1;

how is moving or substituting in any way related to inhibiting speculation?
Why's this done for all targets rather than only those that need any such
mitigation?

I btw miss implementation for x86_64 (at least) where the agreed upon mitigation
is to insert 'lfence' before the load.

Any reason the builtin expansion is not fully left to the targets?

+/* Suppressing speculation.  Users are expected to use the first (N)
+   variant, which will be translated internally into one of the other
+   types.  */
+DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_N, "speculation_safe_load",
+BT_FN_VOID_VAR, ATTR_NULL)
+
+DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_1, "speculation_safe_load_1",
+BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL)
+DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_2, "speculation_safe_load_2",
+BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL)

any reason you are not simply using an internal function for the
non-_N variants?
Looks like you're closely following the atomic/sync builtins but those
predate IFNs I think.
They also have more constraints on alignment and use a restrictive alias set.

+rtx
+default_speculation_safe_load (machine_mode mode ATTRIBUTE_UNUSED,
+  rtx result, rtx mem, rtx lower_bound,
+  rtx upper_bound, rtx cmpptr, bool warn)
+{
+  rtx_code_label *done_label = gen_label_rtx ();
+  rtx_code_label *inrange_label = gen_label_rtx ();
+
+  if (warn)
+warning_at
+  (input_location, 0,
+   "this target does not support anti-speculation operations.  "
+   "Your program will still execute correctly, but speculation "
+   "will not be inhibited");

so there's no way to inhibit this warning than changing all targets?

Iff it is correct for a target to expand this to an unconditional load with a
preceeding fence why not have the generic target hook implementation
do exactly that - emit an unconditional move?  Maybe I misunderstand
the clear words in the hook docs, but the builtin docs also say
the behavior is undefined if the builtin is executed when the condition
is false - which means traps are allowed.

Richard.

> * builtin-types.def (BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR):
> New builtin type signature.
> (BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise.
> (BT_FN_I4_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise.
> (BT_FN_I8_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise.
> (BT_FN_I16_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise.
> * builtins.def (BUILT_IN_SPECULATION_SAFE_LOAD_N): New builtin.
> (BUILT_IN_SPECULATION_SAFE_LOAD_1): Likewise.
> (BUILT_IN_SPECULATION_SAFE_LOAD_2): Likewise.
> (BUILT_IN_SPECULATION_SAFE_LOAD_4): Likewise.
> (BUILT_IN_SPECULATION_SAFE_LOAD_8): Likewise.
> (BUILT_IN_SPECULATION_SAFE_LOAD_16): Likewise.
> * target.def (speculation_safe_load): New hook.
> * doc/tm.texi.in (TARGET_SPECULATION_SAFE_LOAD): Add to
> documentation.
> * doc/tm.texi: Regenerated.
> * doc/cpp.texi: Document __HAVE_SPECULATION_SAFE_LOAD.
> * doc/extend.texi: Document __builtin_speculation_safe_

Re: [PATCH 1/3] [builtins] Generic support for __builtin_speculation_safe_load()

2018-01-18 Thread Richard Biener

On Thu, Jan 18, 2018 at 1:44 PM, Richard Biener
 wrote:
> On Wed, Jan 17, 2018 at 3:55 PM, Richard Earnshaw
>  wrote:
>>
>> This patch adds generic support for the new builtin
>> __builtin_speculation_safe_load.  It provides the overloading of the
>> different access sizes and a default fall-back expansion for targets
>> that do not support a mechanism for inhibiting speculation.
>
> +  if (ignore)
> +{
> +  warning_at (input_location, 0,
> + "result of __builtin_speculation_safe_load must be used to "
> + "ensure correct operation");
> +  target = NULL;
> +}
>
> This warning cannot be disabled but these kind of cases could appear via
> path isolation or missed optimizations since memory optimizations do not
> recognize such speculative loads.  So - should it not be emitted way
> earlier instead, like from the FEs or during some early warning pass?
>
> In which case it could be an error as well, no?
>
> +  set_mem_align (mem, MAX (GET_MODE_ALIGNMENT (mode),
> +  get_pointer_alignment (arg0)));
>
> err...  so there's no way to do an unaligned speculation safe load?  Using
> just get_pointer_alignment would be safe here.   Other builtins/IFNs 
> explicitely
> pass down the alignment and FEs generate such alignment from their
> language constraints.
>
> +  set_mem_alias_set (mem, get_alias_set (TREE_TYPE (TREE_TYPE (arg0;
>
> sorry, but pointer types are arbitrary in GIMPLE so this is clearly
> wrong.  Without
> more information you have to use zero.  Thus,
>
> __builtin_speculation_safe_load ((int *)p, ...);
>
> will do the wrong thing if the type of p is not int *.
>
> +  /* Mark the memory access as volatile.  We don't want the optimizers to
> + move it or otherwise substitue an alternative value.  */
> +  MEM_VOLATILE_P (mem) = 1;
>
> how is moving or substituting in any way related to inhibiting speculation?
> Why's this done for all targets rather than only those that need any such
> mitigation?
>
> I btw miss implementation for x86_64 (at least) where the agreed upon 
> mitigation
> is to insert 'lfence' before the load.
>
> Any reason the builtin expansion is not fully left to the targets?
>
> +/* Suppressing speculation.  Users are expected to use the first (N)
> +   variant, which will be translated internally into one of the other
> +   types.  */
> +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_N, "speculation_safe_load",
> +BT_FN_VOID_VAR, ATTR_NULL)
> +
> +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_1, "speculation_safe_load_1",
> +BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL)
> +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_2, "speculation_safe_load_2",
> +BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL)
>
> any reason you are not simply using an internal function for the
> non-_N variants?
> Looks like you're closely following the atomic/sync builtins but those
> predate IFNs I think.
> They also have more constraints on alignment and use a restrictive alias set.
>
> +rtx
> +default_speculation_safe_load (machine_mode mode ATTRIBUTE_UNUSED,
> +  rtx result, rtx mem, rtx lower_bound,
> +  rtx upper_bound, rtx cmpptr, bool warn)
> +{
> +  rtx_code_label *done_label = gen_label_rtx ();
> +  rtx_code_label *inrange_label = gen_label_rtx ();
> +
> +  if (warn)
> +warning_at
> +  (input_location, 0,
> +   "this target does not support anti-speculation operations.  "
> +   "Your program will still execute correctly, but speculation "
> +   "will not be inhibited");
>
> so there's no way to inhibit this warning than changing all targets?
>
> Iff it is correct for a target to expand this to an unconditional load with a
> preceeding fence why not have the generic target hook implementation
> do exactly that - emit an unconditional move?  Maybe I misunderstand
> the clear words in the hook docs, but the builtin docs also say
> the behavior is undefined if the builtin is executed when the condition
> is false - which means traps are allowed.

Oh, and I think I'd like to see the builtins folded to regular MEMs very early
if the target doesn't provide TARGET_SPECULATION_SAFE_LOAD.  Ideally
directly from where _n is disambiguated in the FE.

Richard.

> Richard.
>
>
>> * builtin-types.def (BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR):
>> New builtin type signature.
>> (BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise.
>> (BT_FN_I4_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise.
>> (BT_FN_I8_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise.
>> (BT_FN_I16_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise.
>> * builtins.def (BUILT_IN_SPECULATION_SAFE_LOAD_N): New builtin.
>> (BUILT_IN_SPECULATION_SAFE_LOAD_1): Likewise.
>> (BUILT_IN_SPECULATION_SAFE_LOAD_2): Likewise.
>> (BUILT_IN_SPECULATION_SAF

[PATCH] Small retpoline clean-up.

2018-01-18 Thread Martin Liška

Hello.

The patch is small clean-up of retpoline stuff. I know that H.J. is planning
to introduce an option driving which retpoline loop filler is selected.
Hopefully the suggested patch clarifies current default selection of
the filler.

Patch survives make check -k -j10 RUNTESTFLAGS="i386.exp".

Ready to install?
Thanks,
Martin


gcc/ChangeLog:

2018-01-18  Martin Liska  

* config/i386/i386.c (indirect_thunk_name): Document that also
lfence is emitted.
(output_indirect_thunk): Document why both instructions
(pause and lfence) are generated.

gcc/testsuite/ChangeLog:

2018-01-18  Martin Liska  

* gcc.target/i386/indirect-thunk-3.c: Remove duplicate options.
* gcc.target/i386/indirect-thunk-4.c: Likewise.
* gcc.target/i386/indirect-thunk-6.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-4.c: Likewise.
* gcc.target/i386/ret-thunk-10.c: Likewise.
* gcc.target/i386/ret-thunk-11.c: Likewise.
* gcc.target/i386/ret-thunk-12.c: Likewise.
* gcc.target/i386/ret-thunk-15.c: Likewise.
* gcc.target/i386/ret-thunk-9.c: Likewise.
---
 gcc/config/i386/i386.c   | 5 -
 gcc/testsuite/gcc.target/i386/indirect-thunk-3.c | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-4.c | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-6.c | 2 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c | 2 +-
 gcc/testsuite/gcc.target/i386/ret-thunk-10.c | 2 +-
 gcc/testsuite/gcc.target/i386/ret-thunk-11.c | 2 +-
 gcc/testsuite/gcc.target/i386/ret-thunk-12.c | 2 +-
 gcc/testsuite/gcc.target/i386/ret-thunk-15.c | 2 +-
 gcc/testsuite/gcc.target/i386/ret-thunk-9.c  | 2 +-
 10 files changed, 13 insertions(+), 10 deletions(-)


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index cb0f4ed5e6a..72d25ae4f72 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -10826,6 +10826,7 @@ indirect_thunk_name (char name[32], unsigned int regno,
 	call	L2
L1:
 	pause
+	lfence
 	jmp	L1
L2:
 	mov	%REG, (%sp)
@@ -10837,6 +10838,7 @@ indirect_thunk_name (char name[32], unsigned int regno,
 	call L2
   L1:
 	pause
+	lfence
 	jmp L1
   L2:
 	lea WORD_SIZE(%sp), %sp
@@ -10864,7 +10866,8 @@ output_indirect_thunk (bool need_bnd_p, unsigned int regno)
 
   ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1);
 
-  /* Pause + lfence.  */
+  /* AMD and Intel CPUs prefer each a different instruction as loop filler.
+ Usage of both pause + lfence is compromise solution.  */
   fprintf (asm_out_file, "\tpause\n\tlfence\n");
 
   /* Jump.  */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c
index 9e24a385387..dab7ac2ef25 100644
--- a/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mno-indirect-branch-register -mno-indirect-branch-register -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */
+/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */
 
 typedef void (*dispatch_t)(long offset);
 
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c
index 127b5d94523..44cc5f52f45 100644
--- a/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mno-indirect-branch-register -mno-indirect-branch-register -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */
+/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */
 
 typedef void (*dispatch_t)(long offset);
 
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c
index e4649283d10..1490f5336ca 100644
--- a/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target *-*-linux* } } */
-/* { dg-options "-O2 -mno-indirect-branch-register -mno-indirect-branch-register -mno-indirect-branch-register -mfunction-return=keep -fpic -fno-plt -mindirect-branch=thunk" } */
+/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -fpic -fno-plt -mindirect-branch=thunk" } */
 
 extern void bar (void);
 
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c
index 0e19830de4d..a5fc4a23351 100644
--- a/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { *-*-linux* && { ! x32 } } } } */
-/* { dg-options "-O2 -mno-indirect-branch-register -mno-in

Re: [PATCH] Small retpoline clean-up.

2018-01-18 Thread Jan Hubicka

> Hello.
> 
> The patch is small clean-up of retpoline stuff. I know that H.J. is planning
> to introduce an option driving which retpoline loop filler is selected.
> Hopefully the suggested patch clarifies current default selection of
> the filler.
> 
> Patch survives make check -k -j10 RUNTESTFLAGS="i386.exp".
> 
> Ready to install?
> Thanks,
> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2018-01-18  Martin Liska  
> 
>   * config/i386/i386.c (indirect_thunk_name): Document that also
>   lfence is emitted.
>   (output_indirect_thunk): Document why both instructions
>   (pause and lfence) are generated.

OK,
thanks!
Honza
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-01-18  Martin Liska  
> 
>   * gcc.target/i386/indirect-thunk-3.c: Remove duplicate options.
>   * gcc.target/i386/indirect-thunk-4.c: Likewise.
>   * gcc.target/i386/indirect-thunk-6.c: Likewise.
>   * gcc.target/i386/indirect-thunk-bnd-4.c: Likewise.
>   * gcc.target/i386/ret-thunk-10.c: Likewise.
>   * gcc.target/i386/ret-thunk-11.c: Likewise.
>   * gcc.target/i386/ret-thunk-12.c: Likewise.
>   * gcc.target/i386/ret-thunk-15.c: Likewise.
>   * gcc.target/i386/ret-thunk-9.c: Likewise.
> ---
>  gcc/config/i386/i386.c   | 5 -
>  gcc/testsuite/gcc.target/i386/indirect-thunk-3.c | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-4.c | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-6.c | 2 +-
>  gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c | 2 +-
>  gcc/testsuite/gcc.target/i386/ret-thunk-10.c | 2 +-
>  gcc/testsuite/gcc.target/i386/ret-thunk-11.c | 2 +-
>  gcc/testsuite/gcc.target/i386/ret-thunk-12.c | 2 +-
>  gcc/testsuite/gcc.target/i386/ret-thunk-15.c | 2 +-
>  gcc/testsuite/gcc.target/i386/ret-thunk-9.c  | 2 +-
>  10 files changed, 13 insertions(+), 10 deletions(-)
> 
> 

> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index cb0f4ed5e6a..72d25ae4f72 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -10826,6 +10826,7 @@ indirect_thunk_name (char name[32], unsigned int 
> regno,
>   callL2
> L1:
>   pause
> + lfence
>   jmp L1
> L2:
>   mov %REG, (%sp)
> @@ -10837,6 +10838,7 @@ indirect_thunk_name (char name[32], unsigned int 
> regno,
>   call L2
>L1:
>   pause
> + lfence
>   jmp L1
>L2:
>   lea WORD_SIZE(%sp), %sp
> @@ -10864,7 +10866,8 @@ output_indirect_thunk (bool need_bnd_p, unsigned int 
> regno)
>  
>ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1);
>  
> -  /* Pause + lfence.  */
> +  /* AMD and Intel CPUs prefer each a different instruction as loop filler.
> + Usage of both pause + lfence is compromise solution.  */
>fprintf (asm_out_file, "\tpause\n\tlfence\n");
>  
>/* Jump.  */
> diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c 
> b/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c
> index 9e24a385387..dab7ac2ef25 100644
> --- a/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c
> +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -mno-indirect-branch-register 
> -mno-indirect-branch-register -mno-indirect-branch-register 
> -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */
> +/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep 
> -mindirect-branch=thunk -fno-pic" } */
>  
>  typedef void (*dispatch_t)(long offset);
>  
> diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c 
> b/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c
> index 127b5d94523..44cc5f52f45 100644
> --- a/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c
> +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -mno-indirect-branch-register 
> -mno-indirect-branch-register -mno-indirect-branch-register 
> -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */
> +/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep 
> -mindirect-branch=thunk -fno-pic" } */
>  
>  typedef void (*dispatch_t)(long offset);
>  
> diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c 
> b/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c
> index e4649283d10..1490f5336ca 100644
> --- a/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c
> +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile { target *-*-linux* } } */
> -/* { dg-options "-O2 -mno-indirect-branch-register 
> -mno-indirect-branch-register -mno-indirect-branch-register 
> -mfunction-return=keep -fpic -fno-plt -mindirect-branch=thunk" } */
> +/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep 
> -fpic -fno-plt -mindirect-branch=thunk" } */
>  
>  extern void bar (void);
>  
> diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c 
> b/gcc/testsuite/gcc.target/i3

Re: [PATCH][arm] Fix gcc.target/arm/xor-and.c

2018-01-18 Thread Christophe Lyon

On 17 January 2018 at 12:11, Kyrill  Tkachov
 wrote:
> Hi all,
>
> This test is naughty because it doesn't use the proper effective target
> checks
> and add-options mechanisms for setting a Thumb1 target, which leads to
> Thumb1 hard-float errors
> when testing a toolchain configured with --with-cpu=cortex-a15
> --with-fpu=neon-vfpv4 --with-float=hard --with-mode=thumb.
>
> This patch fixes that in the obvious way.
>
> Committing to trunk.
> Thanks,
> Kyrill
>
> 2018-01-17  Kyrylo Tkachov 
>
> * gcc.target/arm/xor-and.c: Fix armv6 effective target checks
> and options.

There was a problem with the order of dg-options and dg-add-options,
which I fixed as a obvious at r256845.

Christophe
2018-01-18  Christophe Lyon  

* gcc.target/arm/xor-and.c: Fix dg-options and dg-add-options
order.
Index: gcc/testsuite/gcc.target/arm/xor-and.c
===
--- gcc/testsuite/gcc.target/arm/xor-and.c  (revision 256844)
+++ gcc/testsuite/gcc.target/arm/xor-and.c  (revision 256845)
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_arch_v6_ok } */
+/* { dg-options "-O" }  */
 /* { dg-add-options arm_arch_v6 } */
-/* { dg-options "-O" }  */
 
 unsigned short foo (unsigned short x)
 {

Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)

2018-01-18 Thread John David Anglin


On 2018-01-17 3:07 PM, Jakub Jelinek wrote:

PR83882 complains that PR81715 testcase fails on callee copies parameter
targets.  The following patch ought to fix that, but I have only
bootstrapped/regtested it on x86_64-linux and i686-linux + on the testcase
with hppa.

John, do you think you could test this on hppa without the callee copies
default change?

On hppa2.0w-hp-hpux11.11, pr81715.C now passes.

However, there are some new fails in the gcc and g++ suites:

FAIL: g++.dg/torture/stackalign/throw-1.C   -O0  execution test
FAIL: g++.dg/torture/stackalign/throw-1.C   -O0 -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-1.C   -O1 -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-1.C   -O2 -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-1.C   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions -fpic execution test

FAIL: g++.dg/torture/stackalign/throw-1.C   -O3 -g -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-1.C   -Os -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-2.C   -O0  execution test
FAIL: g++.dg/torture/stackalign/throw-2.C   -O0 -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-2.C   -O1 -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-2.C   -O2 -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-2.C   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions -fpic execution test

FAIL: g++.dg/torture/stackalign/throw-2.C   -O3 -g -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-2.C   -Os -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-4.C   -O0  execution test
FAIL: g++.dg/torture/stackalign/throw-4.C   -O0 -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-4.C   -O1 -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-4.C   -O2 -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-4.C   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions -fpic execution test

FAIL: g++.dg/torture/stackalign/throw-4.C   -O3 -g -fpic execution test
FAIL: g++.dg/torture/stackalign/throw-4.C   -Os -fpic execution test

FAIL: gcc.dg/torture/pr52451.c   -O0  execution test
FAIL: gcc.dg/torture/pr52451.c   -O1  execution test
FAIL: gcc.dg/torture/pr52451.c   -O2  execution test
FAIL: gcc.dg/torture/pr52451.c   -O3 -g  execution test
FAIL: gcc.dg/torture/pr52451.c   -Os  execution test

FAIL: gcc.dg/compat/pr83487-1 c_compat_y_tst.o compile
FAIL: gcc.dg/compat/pr83487-2 c_compat_y_tst.o compile

Dave

--
John David Anglin  dave.ang...@bell.net

Re: [PATCH,NVPTX] Fix PR83920

2018-01-18 Thread Tom de Vries


On 01/18/2018 12:40 AM, Cesar Philippidis wrote:

In PR83920, I encountered a nvptx bug where live predicate variables
were clobbered before their value was broadcasted. 


Hi,

I've managed to reproduce the problem based on the description in the PR.


Apparently, there
were problems in certain version of the CUDA driver where the JIT would
generate wrong code for shfl broadcasts. 


Correct. And there's a work around committed for the JIT problem, which 
you refer to in the next line (without introducing it first).



The attached patch teaches
nvptx_single not to apply that workaround if the predicate register is live.
 > Tom, does this patch look sane to you?


The fact that the cond register is live at the start of the from bb does 
not mean that the register can't be set inside the bb.


Furthermore, the live info does not make a distinction between 
live-for-lane-0 and line-for-warp. So, if the condition reg is not set 
in the bb, but set only for lane-0 in a previous bb, then we still need 
to initialize lanes 1-31.


So, I don't think this is the way to address this bug.


I'm not sure if it defeats the
purpose of your original patch. 


In test cases mentioned above, it does.


Regardless, the live predicate registers
shouldn't be clobbered before they are used.



There is a bug in the workaround, that's correct.

I think the way to address it is using a tmp .pred reg like so:
...
{
  .reg .u32 %x;
  mov.u32 %x,%tid.x;
  setp.ne.u32 %rnotvzero,%x,0;
}

{
  .reg .pred %rcond2;
  setp.eq.u32 %rcond2, 1, 0; // workaround

  @%rnotvzero bra Lskip;
  ...
  setp.. %rcond,op1,op2; // could be here, could be earlier
  mov.b1 %rcond2, %rcond; // used pseudo opcode mov.b1 for convenience
 Lskip:
  selp.u32 %rcondu32,1,0,%rcond2;
  shfl.idx.b32 %rcondu32,%rcondu32,0,31;
  setp.ne.u32 %rcond,%rcondu32,0;
}
...


Unfortunately, I cannot reproduce the runtime failure with gemm example
in the PR, so I didn't include it in the patch.


I'm managed to modify the test-case such that it reproduces the failure 
with trunk (see PR). So, that test-case should be included.


Thanks,
- Tom


However, this patch does
fix the failure with da-1.c in og7.
This patch does not cause any
regressions.

Is it OK for trunk?

Thanks,
Cesar


nvptx-jit-relax.diff


diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 55c7e3c..698c574 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -3957,6 +3957,7 @@ bb_first_real_insn (basic_block bb)
  static void
  nvptx_single (unsigned mask, basic_block from, basic_block to)
  {
+  bitmap live = DF_LIVE_IN (from);
rtx_insn *head = BB_HEAD (from);
rtx_insn *tail = BB_END (to);
unsigned skip_mask = mask;
@@ -4126,8 +4127,9 @@ nvptx_single (unsigned mask, basic_block from, 
basic_block to)
 There is nothing in the PTX spec to suggest that this is wrong, or
 to explain why the extra initialization is needed.  So, we classify
 it as a JIT bug, and the extra initialization as workaround.  */
- emit_insn_before (gen_movbi (pvar, const0_rtx),
-   bb_first_real_insn (from));
+ if (!bitmap_bit_p (live, REGNO (pvar)))
+   emit_insn_before (gen_movbi (pvar, const0_rtx),
+ bb_first_real_insn (from));
  #endif
  emit_insn_before (nvptx_gen_vcast (pvar), tail);
}

[C++ Patch] PR 81013 ("[7/8 Regression] ICE with invalid union in class hierarchy")

2018-01-18 Thread Paolo Carlini


Hi,

this error recovery regression is caused by my work for c++/70202 back 
in 2016. At the time I remember wondering whether in the easy cases of 
errors right at beginning of xref_basetypes (not those in the loop below 
handled via dropped_base) we really wanted to proceed or immediately 
bail out, as far as error recovery in concerned. This new testcase shows 
that if we don't bail out early we may end up crashing much later, in 
cgraph.c. Thus the straightforward fix below. Tested x86_64-linux.


Thanks, Paolo.

/

/cp
2018-01-18  Paolo Carlini  

PR c++/81013
* decl.c (xref_basetypes): Early return upon error about derived
union.

/testsuite
2018-01-18  Paolo Carlini  

PR c++/81013
* g++.dg/inherit/union3.C: New.
Index: cp/decl.c
===
--- cp/decl.c   (revision 256840)
+++ cp/decl.c   (working copy)
@@ -13801,7 +13801,10 @@ xref_basetypes (tree ref, tree base_list)
   CLASSTYPE_NON_LAYOUT_POD_P (ref) = true;
 
   if (TREE_CODE (ref) == UNION_TYPE)
-   error ("derived union %qT invalid", ref);
+   {
+ error ("derived union %qT invalid", ref);
+ return;
+   }
 }
 
   if (max_bases > 1)
Index: testsuite/g++.dg/inherit/union3.C
===
--- testsuite/g++.dg/inherit/union3.C   (nonexistent)
+++ testsuite/g++.dg/inherit/union3.C   (working copy)
@@ -0,0 +1,16 @@
+// PR c++/81013
+
+struct A
+{
+  virtual void foo() const;
+};
+
+union B : A  // { dg-error "derived union 'B' invalid" }
+{
+  void foo() const;
+};
+
+void bar(const B& b)
+{
+  b.foo();
+}

Re: [PATCH v3] Ability to remap file names in FILE, etc (PR other/70268)

2018-01-18 Thread Ximin Luo

Hello, I see Boris' patch was accepted with some further changes.

Any chance my additions for reproducible builds can be included as well? They 
are described below.

It should still apply on top, minus the strchr -> strrchr change which was 
folded into Boris' accepted version.

Ximin

Ximin Luo:
> Boris Kolpackov:
>> Thanks for the review. Third revision of the patch attached.
>>
>> [..]
> Here is a follow-up patch, meant to be applied on top of Boris' patch, to 
> address some issues we found based on discussions and experiments at the 
> Reproducible Builds project.
> 
> Most of this patch is re-implementing the patch from [1] originally by Daniel 
> Kahn Gillmor. Back then it was rejected in favour of a simpler approach 
> because 
> its necessity was not realised. However, more experiments since then have 
> revealed that its basic approach is useful in more general scenarios:
> 
> Higher-level build scripts sometimes like to save CFLAGS etc into the build 
> output, making the overall build output unreproducible even if GCC is playing 
> nicely. Rather than add logic to strip -f{file,debug,macro,...}-prefix-map, 
> into all possible higher-level programs that might want to save CFLAGS, it is
> simpler if GCC could read it from a user-specified environment variable. The
> fact that the name of the environment variable is still present on the
> command-line, should make it obvious what is being done, and avoid confusion
> that might arise if an implicit "magic" environment variable is used.
> 
> In the patch thread (see [1] and follow-ups) there was concern that the 
> prefix 
> '$' would cause awkward escaping behaviour, so the prefix 'ENV:' was chosen. 
> A 
> concern about this conflicting with AmigaOS paths was voiced but not 
> explored; 
> of course it can be changed to something else if needed.
> 
> (NetBSD is already carrying this patch in their system GCC [2], presumably for
> the same reason mentioned above.)
> 
> [1] https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01168.html
> [2] 
> http://cvsweb.netbsd.org/bsdweb.cgi/src/external/gpl3/gcc/dist/gcc/final.c?rev=1.2&content-type=text/x-cvsweb-markup
> 
> The other thing that this patch does, is to split the old and new prefixs by
> the last '=' sign rather than the first. This allows arbitrary old prefixes to
> be mapped, which would be more flexible than mapping to arbitrary new 
> prefixes.
> The rust compiler is adopting this strategy too: [3].
> 
> [3] https://github.com/rust-lang/rust/issues/41555#issuecomment-321078174
> 
> ChangeLogs
> --
> 
> gcc/ChangeLog:
> 
> 2017-12-19  Ximin Luo  
> 
>   PR other/70268
>   * file-prefix-map.c (add_prefix_map): Support reading old prefix from 
> the
>   environment. Split old and new parts by the last '=' rather than the 
> first.
>   * file-prefix-map.h: Define macros for the ENV: prefix.
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-12-19  Ximin Luo  
> 
>   PR other/70268
>   * c-c++-common/ffile-prefix-map-env.c: New test.
> 
> Index: gcc-8-20171217/gcc/file-prefix-map.c
> ===
> --- gcc-8-20171217.orig/gcc/file-prefix-map.c
> +++ gcc-8-20171217/gcc/file-prefix-map.c
> @@ -41,16 +41,40 @@ add_prefix_map (file_prefix_map *&maps,
>  {
>file_prefix_map *map;
>const char *p;
> +  char *env;
> +  const char *old;
> +  size_t oldlen;
>  
> -  p = strchr (arg, '=');
> +  p = strrchr (arg, '=');
>if (!p)
>  {
>error ("invalid argument %qs to %qs", arg, opt);
>return;
>  }
> +  if (0 == strncmp(FILE_PREFIX_MAP_ENV_PREFIX, arg, 
> FILE_PREFIX_MAP_ENV_PREFIX_OFFSET))
> +{
> +  const char *arg_offset = arg + FILE_PREFIX_MAP_ENV_PREFIX_OFFSET;
> +  env = xstrndup (arg_offset, p - arg_offset);
> +  old = getenv (env);
> +  if (!old)
> + {
> +   warning (0, "environment variable %qs not set in argument to %s",
> +env, opt);
> +   free (env);
> +   return;
> + }
> +  oldlen = strlen (old);
> +  free (env);
> +}
> +  else
> +{
> +  old = xstrndup (arg, p - arg);
> +  oldlen = p - arg;
> +}
> +
>map = XNEW (file_prefix_map);
> -  map->old_prefix = xstrndup (arg, p - arg);
> -  map->old_len = p - arg;
> +  map->old_prefix = old;
> +  map->old_len = oldlen;
>p++;
>map->new_prefix = xstrdup (p);
>map->new_len = strlen (p);
> Index: gcc-8-20171217/gcc/file-prefix-map.h
> ===
> --- gcc-8-20171217.orig/gcc/file-prefix-map.h
> +++ gcc-8-20171217/gcc/file-prefix-map.h
> @@ -18,6 +18,9 @@
>  #ifndef GCC_FILE_PREFIX_MAP_H
>  #define GCC_FILE_PREFIX_MAP_H
>  
> +#define FILE_PREFIX_MAP_ENV_PREFIX "ENV:"
> +#define FILE_PREFIX_MAP_ENV_PREFIX_OFFSET 
> (sizeof(FILE_PREFIX_MAP_ENV_PREFIX) - 1)
> +
>  void add_macro_prefix_map (const char *);
>  void add_debug_prefix_map (const char *);
>  void add_file_prefix_map (const char *);
> Index: gcc

Re: [PATCH][arm] Fix gcc.target/arm/pr40887.c directives

2018-01-18 Thread Christophe Lyon

On 18 January 2018 at 10:25, Kyrill  Tkachov
 wrote:
>
> On 18/01/18 09:15, Christophe Lyon wrote:
>>
>> Hi Kyrill,
>>
>>
>> On 17 January 2018 at 12:20, Kyrill  Tkachov
>>  wrote:
>>>
>>> Hi all,
>>>
>>> This patch converts gcc.target/arm/pr40887.c to use the proper effective
>>> target check and dg-add-options for armv5te
>>> so that we avoid situations where we end up trying to compile the test
>>> with
>>> a Thumb1 hard-float ABI, which makes the
>>> compiler complain.
>>>
>>> This allows the test to pass gracefully for me for my compiler configured
>>> with:
>>> --with-cpu=cortex-a15 --with-fpu=neon-vfpv4 --with-float=hard
>>> --with-mode=thumb
>>>
>>> Committing to trunk.
>>>
>>> Thanks,
>>> Kyrill
>>>
>>> 2018-01-17  Kyrylo Tkachov  
>>>
>>>  * gcc.target/arm/pr40887.c: Add armv5te effective target checks and
>>>  directives.
>>
>> I think you had to swap dg-options and dg-add-options directives, so I
>> committed the attached patch as obvious.
>
>
> Thanks Christophe,
> I had spotted it and was planning to fix it by using dg-additional-options,
> but it's the same in the end.
>

Hmmm after my commit, I see again the "unimplemented: Thumb-1
hard-float VFP ABI"
error, because the arm_arch_v5te_ok effective target test pass as it
uses a testcase
containing no actual code. My old patch for
check_effective_target_arm_arch_FUNC_ok
would be helpful :-) (I'm posting it again separately)

Christophe

> Kyrill
>
>> Christophe
>
>

Re: [PATCH 1/3] [builtins] Generic support for __builtin_speculation_safe_load()

2018-01-18 Thread Richard Earnshaw (lists)

On 18/01/18 12:44, Richard Biener wrote:
> On Wed, Jan 17, 2018 at 3:55 PM, Richard Earnshaw
>  wrote:
>>
>> This patch adds generic support for the new builtin
>> __builtin_speculation_safe_load.  It provides the overloading of the
>> different access sizes and a default fall-back expansion for targets
>> that do not support a mechanism for inhibiting speculation.
> 
> +  if (ignore)
> +{
> +  warning_at (input_location, 0,
> + "result of __builtin_speculation_safe_load must be used to "
> + "ensure correct operation");
> +  target = NULL;
> +}
> 
> This warning cannot be disabled but these kind of cases could appear via
> path isolation or missed optimizations since memory optimizations do not
> recognize such speculative loads.  So - should it not be emitted way
> earlier instead, like from the FEs or during some early warning pass?

If you think so, and can recommend where to put it.  I'm not
particularly familiar with the front-end code, being mostly a back-end
maintainer.  Otherwise we could probably drop this entirely, but that
gives scope for users to do the Wrong Thing(tm)!

> 
> In which case it could be an error as well, no?
> 
> +  set_mem_align (mem, MAX (GET_MODE_ALIGNMENT (mode),
> +  get_pointer_alignment (arg0)));
> 
> err...  so there's no way to do an unaligned speculation safe load?  Using
> just get_pointer_alignment would be safe here.   Other builtins/IFNs 
> explicitely
> pass down the alignment and FEs generate such alignment from their
> language constraints.

No, last time I looked GCC didn't support taking the address of an
unaligned object and casting pointer to char to pointer to int implies
you know that it is correctly aligned.

> 
> +  set_mem_alias_set (mem, get_alias_set (TREE_TYPE (TREE_TYPE (arg0;
> 
> sorry, but pointer types are arbitrary in GIMPLE so this is clearly
> wrong.  Without
> more information you have to use zero.  Thus,
> 
> __builtin_speculation_safe_load ((int *)p, ...);
> 
> will do the wrong thing if the type of p is not int *.

Ok, I'll fix that.

> 
> +  /* Mark the memory access as volatile.  We don't want the optimizers to
> + move it or otherwise substitue an alternative value.  */
> +  MEM_VOLATILE_P (mem) = 1;
> 
> how is moving or substituting in any way related to inhibiting speculation?
> Why's this done for all targets rather than only those that need any such
> mitigation?

I can move that, but it seemed right to do it here.

> 
> I btw miss implementation for x86_64 (at least) where the agreed upon 
> mitigation
> is to insert 'lfence' before the load.

/I'm/ not planning on doing implementations for other targets.  I don't
know those back-ends.  Sorry.  It really needs the attention of a port
expert.

> 
> Any reason the builtin expansion is not fully left to the targets?

Because that would be a significant impediment to deploying the builtin
in generic code.  The compiler would have to error out if it couldn't
handle the builtin at all.  By providing a generic implementation we can
support everything, even if you do get a warning.

> 
> +/* Suppressing speculation.  Users are expected to use the first (N)
> +   variant, which will be translated internally into one of the other
> +   types.  */
> +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_N, "speculation_safe_load",
> +BT_FN_VOID_VAR, ATTR_NULL)
> +
> +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_1, "speculation_safe_load_1",
> +BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL)
> +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_2, "speculation_safe_load_2",
> +BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL)
> 
> any reason you are not simply using an internal function for the
> non-_N variants?
> Looks like you're closely following the atomic/sync builtins but those
> predate IFNs I think.
> They also have more constraints on alignment and use a restrictive alias set.

Wouldn't that make it more difficult to do backports to older versions
of GCC?  I know some folk are interested in going back as far as
gcc-4.8, and maybe even earlier.

> 
> +rtx
> +default_speculation_safe_load (machine_mode mode ATTRIBUTE_UNUSED,
> +  rtx result, rtx mem, rtx lower_bound,
> +  rtx upper_bound, rtx cmpptr, bool warn)
> +{
> +  rtx_code_label *done_label = gen_label_rtx ();
> +  rtx_code_label *inrange_label = gen_label_rtx ();
> +
> +  if (warn)
> +warning_at
> +  (input_location, 0,
> +   "this target does not support anti-speculation operations.  "
> +   "Your program will still execute correctly, but speculation "
> +   "will not be inhibited");
> 
> so there's no way to inhibit this warning than changing all targets?

No.  It's trivial to bind this to a hook that just passes all the other
arguments through to the generic code with the warning suppressed - we
could even add

[ARM,testsuite] Make arm_arch_FUNC_ok more robust

2018-01-18 Thread Christophe Lyon

Hi,

I'm resurrecting a patch I posted in 2012-09 :)

effective_target_arm_arch_FUNC_ok currently contains only
#if !defined (DEF)
#error FOO
#endif
which is not sufficient for GCC to complain in case of an
unsupported flags combination.
For instance, -mfloat-abi=hard -mthumb -march=armv5te leads to:
sorry, unimplemented: Thumb-1 hard-float VFP ABI
but only if the compiler had actual code to compile.

Adding a dummy main() body in the arm_arch_FUNC_ok
effective target does the trick: with this patch, this effective
target test would fail, making the test unsupported rather
than failing later.

OK?

Christophe
2018-01-18  Christophe Lyon 

gcc/testsuite:
* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
Add function body to force error messages in some configurations.

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index f597316..bc7d451 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2211,6 +2211,11 @@ foreach { armfunc armflag armdef } { v4 "-march=armv4 
-marm" __ARM_ARCH_4__
#if !defined (DEF)
#error FOO
#endif
+   int
+   main (void)
+   {
+   return 0;
+   }
} "FLAG" ]
}

Re: [ARM,testsuite] Make arm_arch_FUNC_ok more robust

2018-01-18 Thread Kyrill Tkachov


Hi Christophe,

On 18/01/18 14:08, Christophe Lyon wrote:

Hi,

I'm resurrecting a patch I posted in 2012-09 :)



That's the month I first started working on GCC :)


effective_target_arm_arch_FUNC_ok currently contains only
#if !defined (DEF)
#error FOO
#endif
which is not sufficient for GCC to complain in case of an
unsupported flags combination.
For instance, -mfloat-abi=hard -mthumb -march=armv5te leads to:
sorry, unimplemented: Thumb-1 hard-float VFP ABI
but only if the compiler had actual code to compile.

Adding a dummy main() body in the arm_arch_FUNC_ok
effective target does the trick: with this patch, this effective
target test would fail, making the test unsupported rather
than failing later.

OK?


Ok. The arm__ok checks should really be able to tell us
that we can at least get to an assembly result if we add the option
in question. The quirk with the Thumb1 hard-float error is that it only
appears when compiling a function, because it's part of the ABI-conformance
code that picks where to put the result and arguments of a function.
If we have a source-file that only checks preprocessor macros it will not 
trigger.

Anyway, thanks for sending this, it will clean up some ugly FAILs in arm.exp.
Kyrill



Christophe

[PATCH][arm] Fix gcc.target/arm/pr70278.c

2018-01-18 Thread Kyrill Tkachov


Hi all,

This test needs an armv4t Thumb1 target but doesn't add the right effective 
target checks.
This patch adds them so the test is skipped appropriately on Thumb2 hard-float 
tarets.

Committing to trunk.
Thanks,
Kyrill

2018-01-18  Kyrylo Tkachov  

* gcc.target/arm/pr70278.c: Add effective target checks for armv4t
and arm_thumb1_ok.
diff --git a/gcc/testsuite/gcc.target/arm/pr70278.c b/gcc/testsuite/gcc.target/arm/pr70278.c
index c44c07b7b59f906e27ee3046a09988afb665eed5..97034166f46ee7d19b7c6819a67c58d7ab580b1a 100644
--- a/gcc/testsuite/gcc.target/arm/pr70278.c
+++ b/gcc/testsuite/gcc.target/arm/pr70278.c
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=armv4t" } } */
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" } { "" } } */
+/* { dg-require-effective-target arm_arch_v4t_ok } */
+/* { dg-require-effective-target arm_thumb1_ok } */
 /* { dg-options "-mthumb" } */
 /* { dg-add-options arm_arch_v4t } */
 /*

Re: [PATCH 1/3] [builtins] Generic support for __builtin_speculation_safe_load()

2018-01-18 Thread Richard Biener

On Thu, Jan 18, 2018 at 3:07 PM, Richard Earnshaw (lists)
 wrote:
> On 18/01/18 12:44, Richard Biener wrote:
>> On Wed, Jan 17, 2018 at 3:55 PM, Richard Earnshaw
>>  wrote:
>>>
>>> This patch adds generic support for the new builtin
>>> __builtin_speculation_safe_load.  It provides the overloading of the
>>> different access sizes and a default fall-back expansion for targets
>>> that do not support a mechanism for inhibiting speculation.
>>
>> +  if (ignore)
>> +{
>> +  warning_at (input_location, 0,
>> + "result of __builtin_speculation_safe_load must be used to 
>> "
>> + "ensure correct operation");
>> +  target = NULL;
>> +}
>>
>> This warning cannot be disabled but these kind of cases could appear via
>> path isolation or missed optimizations since memory optimizations do not
>> recognize such speculative loads.  So - should it not be emitted way
>> earlier instead, like from the FEs or during some early warning pass?
>
> If you think so, and can recommend where to put it.  I'm not
> particularly familiar with the front-end code, being mostly a back-end
> maintainer.  Otherwise we could probably drop this entirely, but that
> gives scope for users to do the Wrong Thing(tm)!

I'd say the same place you do the __builtin_X -> __builtin_X_{1,2,4,8}
disambiguation.

>>
>> In which case it could be an error as well, no?
>>
>> +  set_mem_align (mem, MAX (GET_MODE_ALIGNMENT (mode),
>> +  get_pointer_alignment (arg0)));
>>
>> err...  so there's no way to do an unaligned speculation safe load?  Using
>> just get_pointer_alignment would be safe here.   Other builtins/IFNs 
>> explicitely
>> pass down the alignment and FEs generate such alignment from their
>> language constraints.
>
> No, last time I looked GCC didn't support taking the address of an
> unaligned object and casting pointer to char to pointer to int implies
> you know that it is correctly aligned.

Sure, just take the address of a packed struct member.

__builtin_speculation_safe_load (&p->x, ...);

where's the cast here?

>
>>
>> +  set_mem_alias_set (mem, get_alias_set (TREE_TYPE (TREE_TYPE (arg0;
>>
>> sorry, but pointer types are arbitrary in GIMPLE so this is clearly
>> wrong.  Without
>> more information you have to use zero.  Thus,
>>
>> __builtin_speculation_safe_load ((int *)p, ...);
>>
>> will do the wrong thing if the type of p is not int *.
>
> Ok, I'll fix that.
>
>>
>> +  /* Mark the memory access as volatile.  We don't want the optimizers to
>> + move it or otherwise substitue an alternative value.  */
>> +  MEM_VOLATILE_P (mem) = 1;
>>
>> how is moving or substituting in any way related to inhibiting speculation?
>> Why's this done for all targets rather than only those that need any such
>> mitigation?
>
> I can move that, but it seemed right to do it here.
>
>>
>> I btw miss implementation for x86_64 (at least) where the agreed upon 
>> mitigation
>> is to insert 'lfence' before the load.
>
> /I'm/ not planning on doing implementations for other targets.  I don't
> know those back-ends.  Sorry.  It really needs the attention of a port
> expert.

Ok, fair enough.  But I'd not like to backport and release an
incomplete feature.

>>
>> Any reason the builtin expansion is not fully left to the targets?
>
> Because that would be a significant impediment to deploying the builtin
> in generic code.  The compiler would have to error out if it couldn't
> handle the builtin at all.  By providing a generic implementation we can
> support everything, even if you do get a warning.

I meant you are doing part of the expansion in generic code, including
expanding the MEM, setting it volatile, expanding the bounds (even if
a target only emits a barrier and the load).

So I'd expected expand_speculation_safe_load to just call the hook.
_Maybe_ do some initial MEM expansion.

>>
>> +/* Suppressing speculation.  Users are expected to use the first (N)
>> +   variant, which will be translated internally into one of the other
>> +   types.  */
>> +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_N, "speculation_safe_load",
>> +BT_FN_VOID_VAR, ATTR_NULL)
>> +
>> +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_1, 
>> "speculation_safe_load_1",
>> +BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL)
>> +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_2, 
>> "speculation_safe_load_2",
>> +BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL)
>>
>> any reason you are not simply using an internal function for the
>> non-_N variants?
>> Looks like you're closely following the atomic/sync builtins but those
>> predate IFNs I think.
>> They also have more constraints on alignment and use a restrictive alias set.
>
> Wouldn't that make it more difficult to do backports to older versions
> of GCC?  I know some folk are interested in going back as far as
> gcc-4.8, and maybe even earlier.

Possibly.

>>
>> +rtx
>> +default_sp

[PATCH,AIX] Optimize parsing of include files.

2018-01-18 Thread REIX, Tony


Description:
 * This patch optimizes the time required for parsing the include files.

Tests:
 * AIX: Build: SUCCESS
   - build made by means of gmake on AIX.

ChangeLog:
  * xcoff.c: Optimize parsing of include files.



Cordialement,

Tony Reix

Bull - ATOS
IBM Coop Architect & Technical Leader

Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net
Index: libbacktrace/xcoff.c
===
--- libbacktrace/xcoff.c	(revision 256837)
+++ libbacktrace/xcoff.c	(working copy)
@@ -760,6 +760,40 @@ xcoff_fileline (struct backtrace_state *state, uin
   return callback (data, pc, NULL, 0, NULL);
 }
 
+/* Compare struct xcoff_incl for qsort.  */
+
+static int
+xcoff_incl_compare (const void *v1, const void *v2)
+{
+  const struct xcoff_incl *in1 = (const struct xcoff_incl *) v1;
+  const struct xcoff_incl *in2 = (const struct xcoff_incl *) v2;
+
+  if (in1->begin < in2->begin)
+return -1;
+  else if (in1->begin > in2->begin)
+return 1;
+  else
+return 0;
+}
+
+/* Find a lnnoptr in an include file.  */
+
+static int
+xcoff_incl_search (const void *vkey, const void *ventry)
+{
+  const uintptr_t *key = (const uintptr_t *) vkey;
+  const struct xcoff_incl *entry = (const struct xcoff_incl *) ventry;
+  uintptr_t lnno;
+
+  lnno = *key;
+  if (lnno < entry->begin)
+return -1;
+  else if (lnno > entry->end)
+return 1;
+  else
+return 0;
+}
+
 /* Add a new mapping to the vector of line mappings that we are
building.  Returns 1 on success, 0 on failure.  */
 
@@ -809,7 +843,6 @@ xcoff_process_linenos (struct backtrace_state *sta
   uintptr_t pc;
   uint32_t lnno;
   int begincl;
-  size_t i;
 
   aux = (const b_xcoff_auxent *) (fsym + 1);
   lnnoptr = aux->x_fcn.x_lnnoptr;
@@ -839,15 +872,13 @@ xcoff_process_linenos (struct backtrace_state *sta
   /* If part of a function other than the beginning comes from an
 	 include file, the line numbers are absolute, rather than
 	 relative to the beginning of the function.  */
-  for (i = 0; i < vec->count; ++i)
-	{
-	  incl = (struct xcoff_incl *) vec->vec.base + i;
-	  if (incl->begin <= lnnoptr && lnnoptr <= incl->end)
-	break;
-	}
+  incl = (struct xcoff_incl *) bsearch (&lnnoptr, vec->vec.base,
+	vec->count,
+	sizeof (struct xcoff_incl),
+	xcoff_incl_search);
   if (begincl == -1)
-	begincl = (i < vec->count);
-  if (i < vec->count)
+	begincl = incl != NULL;
+  if (incl != NULL)
 	{
 	  filename = incl->filename;
 	  if (begincl == 1)
@@ -935,6 +966,9 @@ xcoff_initialize_fileline (struct backtrace_state
   i += asym->n_numaux;
 }
 
+  backtrace_qsort (vec.vec.base, vec.count,
+		   sizeof (struct xcoff_incl), xcoff_incl_compare);
+
   filename = NULL;
   fsym = NULL;
   for (i = 0; i < nsyms; ++i)

[PATCH] Fix profile_quality sanity check.

2018-01-18 Thread Martin Liška

Hi.

Following patch adds a new enum value so that we don't see following warning:
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01211.html

Apart from that I decided to not to number values of the enum as it uses
default number. Is it welcome?

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

gcc/ChangeLog:

2018-01-18  Martin Liska  

* profile-count.h (enum profile_quality): Add
profile_uninitialized as the first value. Do not number values
as they are zero based.
---
 gcc/profile-count.h | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)


diff --git a/gcc/profile-count.h b/gcc/profile-count.h
index 7a43917ebbc..e899963118b 100644
--- a/gcc/profile-count.h
+++ b/gcc/profile-count.h
@@ -26,34 +26,36 @@ struct function;
 /* Quality of the profile count.  Because gengtype does not support enums
inside of classes, this is in global namespace.  */
 enum profile_quality {
+  /* Uninitialized value.  */
+  profile_uninitialized,
   /* Profile is based on static branch prediction heuristics and may
  or may not match reality.  It is local to function and can not be compared
  inter-procedurally.  Never used by probabilities (they are always local).
*/
-  profile_guessed_local = 1,
+  profile_guessed_local,
   /* Profile was read by feedback and was 0, we used local heuristics to guess
  better.  This is the case of functions not run in profile fedback.
  Never used by probabilities.  */
-  profile_guessed_global0 = 2,
+  profile_guessed_global0,
 
   /* Same as profile_guessed_global0 but global count is adjusted 0.  */
-  profile_guessed_global0adjusted = 3,
+  profile_guessed_global0adjusted,
 
   /* Profile is based on static branch prediction heuristics.  It may or may
  not reflect the reality but it can be compared interprocedurally
  (for example, we inlined function w/o profile feedback into function
   with feedback and propagated from that).
  Never used by probablities.  */
-  profile_guessed = 4,
+  profile_guessed,
   /* Profile was determined by autofdo.  */
-  profile_afdo = 5,
+  profile_afdo,
   /* Profile was originally based on feedback but it was adjusted
  by code duplicating optimization.  It may not precisely reflect the
  particular code path.  */
-  profile_adjusted = 6,
+  profile_adjusted,
   /* Profile was read from profile feedback or determined by accurate static
  method.  */
-  profile_precise = 7
+  profile_precise
 };
 
 /* The base value for branch probability notes and edge probabilities.  */

New code merge optimization?

2018-01-18 Thread Sebastian Perta

Hello,

I am interested in implementing a new pass in gcc to merge identical
sequences of code in GCC to be used mainly for RL78. 
The commercial RL78 compilers have such algorithms implemented and they make
quite good use of it.
Opportunities arise from the limited capabilities of RL78, for other targets
this might be a lot less useful.

A while ago I found the following:
https://www.gnu.org/software/gcc/projects/cfo.html
And I ported all algorithms to gcc 4.9.2 and tried it on RL78 and RX and
this is what I found out:
For RX: no visible improvements with any of them 
For RL78: some minor improvements only with -frtl-seqbastr:
Compiling all the C files from gcc/testsuite/gcc.c-torture/execute/*c  with
"-Os" and "-Os  -frtl-seqabstr" (using the modified gcc 4.9.2)
The algorithm was effective only in 60 files(out of 1643 files, that's only
0.03% of the files currently present in gcc/testsuite/gcc.c-torture/execute)
On those 60 files I got an average of 6.5% improvement with the best
improvement for pr58574.c (36.4%).

What do you think: is it worthwhile porting this to the trunk or I will just
waste my time?
Or should I start fresh? Maybe start from here:
http://llvm.org/docs/MergeFunctions.html?

Gimple or rtl?
I suppose the preferred way will be to do this in gimple; however based on
what I did so far,  it's more likely to find identical sequences in rtl (for
RL78).

Any thoughts? Thank you!

Best Regards,
Sebastian

Re: New code merge optimization?

2018-01-18 Thread Martin Jambor

Hi,

On Thu, Jan 18 2018, Sebastian Perta wrote:
> Hello,
>
> I am interested in implementing a new pass in gcc to merge identical
> sequences of code in GCC to be used mainly for RL78. 
> The commercial RL78 compilers have such algorithms implemented and they make
> quite good use of it.
> Opportunities arise from the limited capabilities of RL78, for other targets
> this might be a lot less useful.
>
> A while ago I found the following:
> https://www.gnu.org/software/gcc/projects/cfo.html
> And I ported all algorithms to gcc 4.9.2 and tried it on RL78 and RX and
> this is what I found out:
> For RX: no visible improvements with any of them 
> For RL78: some minor improvements only with -frtl-seqbastr:
> Compiling all the C files from gcc/testsuite/gcc.c-torture/execute/*c  with
> "-Os" and "-Os  -frtl-seqabstr" (using the modified gcc 4.9.2)
> The algorithm was effective only in 60 files(out of 1643 files, that's only
> 0.03% of the files currently present in gcc/testsuite/gcc.c-torture/execute)
> On those 60 files I got an average of 6.5% improvement with the best
> improvement for pr58574.c (36.4%).
>
> What do you think: is it worthwhile porting this to the trunk or I will just
> waste my time?
> Or should I start fresh? Maybe start from here:
> http://llvm.org/docs/MergeFunctions.html?

Martin Liška contributed identical code folding to GCC quite a few years
ago now.  Look up the -fipa-icf option.  If that is not enough for you,
starting from that seems more natural.

Martin

>
> Gimple or rtl?
> I suppose the preferred way will be to do this in gimple; however based on
> what I did so far,  it's more likely to find identical sequences in rtl (for
> RL78).
>
> Any thoughts? Thank you!
>
> Best Regards,
> Sebastian

Re: [C++ Patch] PR 81013 ("[7/8 Regression] ICE with invalid union in class hierarchy")

2018-01-18 Thread Jason Merrill

OK.

On Thu, Jan 18, 2018 at 8:43 AM, Paolo Carlini  wrote:
> Hi,
>
> this error recovery regression is caused by my work for c++/70202 back in
> 2016. At the time I remember wondering whether in the easy cases of errors
> right at beginning of xref_basetypes (not those in the loop below handled
> via dropped_base) we really wanted to proceed or immediately bail out, as
> far as error recovery in concerned. This new testcase shows that if we don't
> bail out early we may end up crashing much later, in cgraph.c. Thus the
> straightforward fix below. Tested x86_64-linux.
>
> Thanks, Paolo.
>
> /
>

Re: [PATCH] Fix profile_quality sanity check.

2018-01-18 Thread Tom de Vries


On 01/18/2018 03:59 PM, Martin Liška wrote:

Hi.

Following patch adds a new enum value so that we don't see following warning:
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01211.html



Hi,

with the patch, I still see the same warning.

And not surprisingly, given that profile_precise is still 7 and 
m_quality is still a 3 bits wide bitfield.


So, I don't quite understand why you think that the patch would address 
the warning.


Thanks,
- Tom


Apart from that I decided to not to number values of the enum as it uses
default number. Is it welcome?

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

gcc/ChangeLog:

2018-01-18  Martin Liska  

* profile-count.h (enum profile_quality): Add
profile_uninitialized as the first value. Do not number values
as they are zero based.
---
  gcc/profile-count.h | 16 +---
  1 file changed, 9 insertions(+), 7 deletions(-)



0001-Fix-profile_quality-sanity-check.patch


diff --git a/gcc/profile-count.h b/gcc/profile-count.h
index 7a43917ebbc..e899963118b 100644
--- a/gcc/profile-count.h
+++ b/gcc/profile-count.h
@@ -26,34 +26,36 @@ struct function;
  /* Quality of the profile count.  Because gengtype does not support enums
 inside of classes, this is in global namespace.  */
  enum profile_quality {
+  /* Uninitialized value.  */
+  profile_uninitialized,
/* Profile is based on static branch prediction heuristics and may
   or may not match reality.  It is local to function and can not be 
compared
   inter-procedurally.  Never used by probabilities (they are always local).
 */
-  profile_guessed_local = 1,
+  profile_guessed_local,
/* Profile was read by feedback and was 0, we used local heuristics to guess
   better.  This is the case of functions not run in profile fedback.
   Never used by probabilities.  */
-  profile_guessed_global0 = 2,
+  profile_guessed_global0,
  
/* Same as profile_guessed_global0 but global count is adjusted 0.  */

-  profile_guessed_global0adjusted = 3,
+  profile_guessed_global0adjusted,
  
/* Profile is based on static branch prediction heuristics.  It may or may

   not reflect the reality but it can be compared interprocedurally
   (for example, we inlined function w/o profile feedback into function
with feedback and propagated from that).
   Never used by probablities.  */
-  profile_guessed = 4,
+  profile_guessed,
/* Profile was determined by autofdo.  */
-  profile_afdo = 5,
+  profile_afdo,
/* Profile was originally based on feedback but it was adjusted
   by code duplicating optimization.  It may not precisely reflect the
   particular code path.  */
-  profile_adjusted = 6,
+  profile_adjusted,
/* Profile was read from profile feedback or determined by accurate static
   method.  */
-  profile_precise = 7
+  profile_precise
  };
  
  /* The base value for branch probability notes and edge probabilities.  */

Fix ICE with profile info mismatch

2018-01-18 Thread Jan Hubicka

Hi,
this patch ICE where the profile in cgraph mismatch profile in BB. This is 
becuase
of expansion of speculative devirtualization where we get some roundoff issues.

Bootstrapped/regtested x86_64-linux, comitted.
Honza

PR ipa/83619
* g++.dg/torture/pr83619.C: New testcase.
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Update edge
frequencies.
Index: cgraph.c
===
--- cgraph.c(revision 256847)
+++ cgraph.c(working copy)
@@ -1327,6 +1327,7 @@ cgraph_edge::redirect_call_stmt_to_calle
  e->speculative = false;
  e->caller->set_call_stmt_including_clones (e->call_stmt, new_stmt,
 false);
+ e->count = gimple_bb (e->call_stmt)->count;
 
  /* Fix edges for BUILT_IN_CHKP_BNDRET calls attached to the
 processed call stmt.  */
@@ -1346,6 +1347,7 @@ cgraph_edge::redirect_call_stmt_to_calle
}
 
  e2->speculative = false;
+ e2->count = gimple_bb (e2->call_stmt)->count;
  ref->speculative = false;
  ref->stmt = NULL;
  /* Indirect edges are not both in the call site hash.
Index: testsuite/g++.dg/torture/pr83619.C
===
--- testsuite/g++.dg/torture/pr83619.C  (revision 0)
+++ testsuite/g++.dg/torture/pr83619.C  (working copy)
@@ -0,0 +1,67 @@
+// { dg-do compile }
+int a;
+class d
+{
+public:
+  virtual unsigned c ();
+};
+class e;
+class i
+{
+  void h ();
+
+public:
+  void
+  operator= (e *f)
+  {
+j::c (f);
+h ();
+  }
+  template  struct j
+  {
+static void
+c (e *g)
+{
+  g->c ();
+}
+  };
+};
+class k;
+class l
+{
+public:
+  l (int);
+  k *operator-> ();
+};
+class e : public d
+{
+};
+class m final : e
+{
+  unsigned c ();
+};
+class k
+{
+public:
+  virtual int o (e *) = 0;
+};
+class H : d, k
+{
+  int o (e *);
+  i n;
+};
+unsigned
+m::c ()
+{
+  l b = 0;
+  b->o (this);
+  return a;
+}
+int
+H::o (e *p)
+{
+  n = p;
+  return a;
+}
+
+

RE: New code merge optimization?

2018-01-18 Thread Sebastian Perta

Hi,

Thank you!
As the description says this finds equivalent functions, I would like to find 
identical sequences inside
functions but at least this will provide all the up to date tools to compare 
code sequences, thank you again!

Best Regards,
Sebastian


> -Original Message-
> From: Martin Jambor [mailto:mjam...@suse.cz]
> Sent: 18 January 2018 15:18
> To: Sebastian Perta ; gcc-
> patc...@gcc.gnu.org
> Cc: l...@gcc.gnu.org
> Subject: Re: New code merge optimization?
>
> Hi,
>
> On Thu, Jan 18 2018, Sebastian Perta wrote:
> > Hello,
> >
> > I am interested in implementing a new pass in gcc to merge identical
> > sequences of code in GCC to be used mainly for RL78.
> > The commercial RL78 compilers have such algorithms implemented and
> they make
> > quite good use of it.
> > Opportunities arise from the limited capabilities of RL78, for other targets
> > this might be a lot less useful.
> >
> > A while ago I found the following:
> > https://www.gnu.org/software/gcc/projects/cfo.html
> > And I ported all algorithms to gcc 4.9.2 and tried it on RL78 and RX and
> > this is what I found out:
> > For RX: no visible improvements with any of them
> > For RL78: some minor improvements only with -frtl-seqbastr:
> > Compiling all the C files from gcc/testsuite/gcc.c-torture/execute/*c  with
> > "-Os" and "-Os  -frtl-seqabstr" (using the modified gcc 4.9.2)
> > The algorithm was effective only in 60 files(out of 1643 files, that's only
> > 0.03% of the files currently present in gcc/testsuite/gcc.c-torture/execute)
> > On those 60 files I got an average of 6.5% improvement with the best
> > improvement for pr58574.c (36.4%).
> >
> > What do you think: is it worthwhile porting this to the trunk or I will just
> > waste my time?
> > Or should I start fresh? Maybe start from here:
> > http://llvm.org/docs/MergeFunctions.html?
>
> Martin Liška contributed identical code folding to GCC quite a few years
> ago now.  Look up the -fipa-icf option.  If that is not enough for you,
> starting from that seems more natural.
>
> Martin
>
> >
> > Gimple or rtl?
> > I suppose the preferred way will be to do this in gimple; however based on
> > what I did so far,  it's more likely to find identical sequences in rtl (for
> > RL78).
> >
> > Any thoughts? Thank you!
> >
> > Best Regards,
> > Sebastian



Renesas Electronics Europe Ltd, Dukes Meadow, Millboard Road, Bourne End, 
Buckinghamshire, SL8 5FH, UK. Registered in England & Wales under Registered 
No. 04586709.

[PATCH, rs6000] Add 128-bit support for vec_xl(), vec_xl_be(), vec_xst(), vec_xst_be() builtins.

2018-01-18 Thread Carl Love

GCC maintainers:

The following patch adds missing 128-bit support for the builtins
vec_xl(), vec_xl_be(), vec_xst(), vec_xst_be().  It also includes a bug
fix required for the new 128-bit arguments for the vec_xst_be() and
vec_xl_be() builtins.  New test cases are also included.  This patch
completes the tests of all the various load/store builtins that I have
been working on.  This patch adds a torture test for the various load
store tests to check that they work for -O0, -O1 and -O2 testing.  This
was done as the work on the load/store builtins found numerous issues
that were optimization level dependent which were fixed by the previous
commits as well as this commit.

The patch has been tested on 

  powerpc64le-unknown-linux-gnu (Power 8 LE)
  powerpc64le-unknown-linux-gnu (Power 8 BE)
  powerpc64le-unknown-linux-gnu (Power 9 LE)

Let me know if the patch looks OK or not.  Let me know if you want to
include it in stage 4 or wait for the next release.  Thanks.

   Carl Love



gcc/ChangeLog:

2018-01-18 Carl Love 

* config/rs6000/rs6000-builtin.def (ST_ELEMREV_V1TI, LD_ELEMREV_V1TI,
LVX_V1TI): Add macro expansion.
* config/rs6000/rs6000-c.c (altivec_builtin_types): Add argument
definitions for VSX_BUILTIN_VEC_XST_BE, VSX_BUILTIN_VEC_ST,
VSX_BUILTIN_VEC_XL, LD_ELEMREV_V1TI builtins.
* config/rs6000/rs6000-p8swap.c (insn_is_swappable_p);
Change check to determine if the instruction is a byte reversing
entry.  Fix typo in comment.
* config/rs6000/rs6000.c (altivec_expand_builtin): Add case entry
for VSX_BUILTIN_ST_ELEMREV_V1TI and VSX_BUILTIN_LD_ELEMREV_V1TI.
Add def_builtin calls for new builtins.
* config/rs6000/vsx.md (vsx_st_elemrev_v1ti, vsx_ld_elemrev_v1ti):
Add define_insn expansion.

gcc/testsuite/ChangeLog:

2018-01-18  Carl Love  
* gcc.target/powerpc/powerpc.exp: Add torture tests for
 builtins-4-runnable.c, builtins-6-runnable.c,
 builtins-5-p9-runnable.c, builtins-6-p9-runnable.c.
* gcc.target/powerpc/builtins-6-runnable.c: New test file.
* gcc.target/powerpc/builtins-4-runnable.c: Add additional tests
for signed/unsigned 128-bit and long long int loads.
---
 gcc/config/rs6000/rs6000-builtin.def   |3 +
 gcc/config/rs6000/rs6000-c.c   |   39 +
 gcc/config/rs6000/rs6000-p8swap.c  |5 +-
 gcc/config/rs6000/rs6000.c |   20 +
 gcc/config/rs6000/vsx.md   |   43 +-
 .../gcc.target/powerpc/builtins-4-runnable.c   |  494 +-
 .../gcc.target/powerpc/builtins-6-runnable.c   | 1001 
 gcc/testsuite/gcc.target/powerpc/powerpc.exp   |   12 +
 8 files changed, 1582 insertions(+), 35 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-6-runnable.c

diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index b17036c5a..757fd6d50 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1234,6 +1234,7 @@ BU_ALTIVEC_X (LVXL_V8HI,  "lvxl_v8hi",MEM)
 BU_ALTIVEC_X (LVXL_V16QI,  "lvxl_v16qi",   MEM)
 BU_ALTIVEC_X (LVX, "lvx",  MEM)
 BU_ALTIVEC_X (LVX_V2DF,"lvx_v2df", MEM)
+BU_ALTIVEC_X (LVX_V1TI,"lvx_v1ti", MEM)
 BU_ALTIVEC_X (LVX_V2DI,"lvx_v2di", MEM)
 BU_ALTIVEC_X (LVX_V4SF,"lvx_v4sf", MEM)
 BU_ALTIVEC_X (LVX_V4SI,"lvx_v4si", MEM)
@@ -1783,12 +1784,14 @@ BU_VSX_X (STXVW4X_V4SF,   "stxvw4x_v4sf",   MEM)
 BU_VSX_X (STXVW4X_V4SI,  "stxvw4x_v4si",   MEM)
 BU_VSX_X (STXVW4X_V8HI,  "stxvw4x_v8hi",   MEM)
 BU_VSX_X (STXVW4X_V16QI,  "stxvw4x_v16qi", MEM)
+BU_VSX_X (LD_ELEMREV_V1TI,"ld_elemrev_v1ti",  MEM)
 BU_VSX_X (LD_ELEMREV_V2DF,"ld_elemrev_v2df",  MEM)
 BU_VSX_X (LD_ELEMREV_V2DI,"ld_elemrev_v2di",  MEM)
 BU_VSX_X (LD_ELEMREV_V4SF,"ld_elemrev_v4sf",  MEM)
 BU_VSX_X (LD_ELEMREV_V4SI,"ld_elemrev_v4si",  MEM)
 BU_VSX_X (LD_ELEMREV_V8HI,"ld_elemrev_v8hi",  MEM)
 BU_VSX_X (LD_ELEMREV_V16QI,   "ld_elemrev_v16qi", MEM)
+BU_VSX_X (ST_ELEMREV_V1TI,"st_elemrev_v1ti",  MEM)
 BU_VSX_X (ST_ELEMREV_V2DF,"st_elemrev_v2df",  MEM)
 BU_VSX_X (ST_ELEMREV_V2DI,"st_elemrev_v2di",  MEM)
 BU_VSX_X (ST_ELEMREV_V4SF,"st_elemrev_v4sf",  MEM)
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 123e46aa1..a0f790d39 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -3149,16 +3149,27 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
 RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF, 0 },
   { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LXVD2X_V2DF,
 RS6000_BTI_V2DF, RS6000_BT

Re: [PATCH][arm] Convert gcc.target/arm/stl-cond.c into an RTL test

2018-01-18 Thread Christophe Lyon

On 18 January 2018 at 10:28, Kyrill  Tkachov
 wrote:
> Hi Christophe,
>
>
> On 18/01/18 09:25, Christophe Lyon wrote:
>>
>> Hi Kyrill,
>>
>>
>> On 17 January 2018 at 12:27, Kyrill  Tkachov
>>  wrote:
>>>
>>> Hi all,
>>>
>>> This is an awkward testsuite failure. The original bug was that we were
>>> failing to put out
>>> the conditional code in the conditional form of the STL instruction
>>> (oops!).
>>> So we wanted to output STLNE, but instead output STL.
>>> The testacase relies on if-conversion to conditionalise the insn for STL.
>>> However, ever since r251643 the expansion of a non-relaxed atomic store
>>> always includes a compiler barrier. That blocks if-conversion in all
>>> cases.
>>>
>>> So there's no easy to get to a conditional STL instruction from a C
>>> program.
>>> But we do want to test for the original bug fix that if the RTL insn for
>>> STL
>>> is conditionalised
>>> it should output the conditional code.
>>>
>>> The solution in this patch is to convert the test into an RTL test with
>>> the
>>> COND_EXEC form
>>> of the STL insn and scan the assembly output there.
>>> This seems to work fine, and gives us an opportunity to create a
>>> gcc.dg/rtl/arm directory
>>> in the RTL tests.
>>>
>>> This now makes the gcc.target/arm/stl-cond.c disappear (as the test is
>>> deleted) and
>>> the new test in gcc.dg/rtl/arm/stl-cond.c passes.
>>>
>>> Committing to trunk.
>>> Thanks,
>>> Kyrill
>>>
>>> 2018-01-17  Kyrylo Tkachov  
>>>
>>>  * gcc.dg/rtl/arm/stl-cond.c: New test.
>>>  * gcc.target/arm/stl-cond.c: Delete.
>>
>> I've noticed that the new test is unsupported on armeb, is this
>> intentional?
>
>
> I think it can PASS on armeb. I guess it's just a matter of changing the
> target in
> /* { dg-do compile { target arm-*-* } } */
> to arm*-*-*.
>
> Such a patch is pre-approved.
> Kyrill
>

OK, done as r256851 after confirming that is does pass on armeb.

Christophe

>> Christophe
>
>

Re: New code merge optimization?

2018-01-18 Thread Andrew Pinski

On Thu, Jan 18, 2018 at 8:03 AM, Sebastian Perta
 wrote:
> Hi,
>
> Thank you!
> As the description says this finds equivalent functions, I would like to find 
> identical sequences inside
> functions but at least this will provide all the up to date tools to compare 
> code sequences, thank you again!

There is already some identical sequence finding (code hoisting) in
GCC 7 (and above), see
https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00360.html
for the patch.

Thanks,
Andrew

>
> Best Regards,
> Sebastian
>
>
>> -Original Message-
>> From: Martin Jambor [mailto:mjam...@suse.cz]
>> Sent: 18 January 2018 15:18
>> To: Sebastian Perta ; gcc-
>> patc...@gcc.gnu.org
>> Cc: l...@gcc.gnu.org
>> Subject: Re: New code merge optimization?
>>
>> Hi,
>>
>> On Thu, Jan 18 2018, Sebastian Perta wrote:
>> > Hello,
>> >
>> > I am interested in implementing a new pass in gcc to merge identical
>> > sequences of code in GCC to be used mainly for RL78.
>> > The commercial RL78 compilers have such algorithms implemented and
>> they make
>> > quite good use of it.
>> > Opportunities arise from the limited capabilities of RL78, for other 
>> > targets
>> > this might be a lot less useful.
>> >
>> > A while ago I found the following:
>> > https://www.gnu.org/software/gcc/projects/cfo.html
>> > And I ported all algorithms to gcc 4.9.2 and tried it on RL78 and RX and
>> > this is what I found out:
>> > For RX: no visible improvements with any of them
>> > For RL78: some minor improvements only with -frtl-seqbastr:
>> > Compiling all the C files from gcc/testsuite/gcc.c-torture/execute/*c  with
>> > "-Os" and "-Os  -frtl-seqabstr" (using the modified gcc 4.9.2)
>> > The algorithm was effective only in 60 files(out of 1643 files, that's only
>> > 0.03% of the files currently present in 
>> > gcc/testsuite/gcc.c-torture/execute)
>> > On those 60 files I got an average of 6.5% improvement with the best
>> > improvement for pr58574.c (36.4%).
>> >
>> > What do you think: is it worthwhile porting this to the trunk or I will 
>> > just
>> > waste my time?
>> > Or should I start fresh? Maybe start from here:
>> > http://llvm.org/docs/MergeFunctions.html?
>>
>> Martin Liška contributed identical code folding to GCC quite a few years
>> ago now.  Look up the -fipa-icf option.  If that is not enough for you,
>> starting from that seems more natural.
>>
>> Martin
>>
>> >
>> > Gimple or rtl?
>> > I suppose the preferred way will be to do this in gimple; however based on
>> > what I did so far,  it's more likely to find identical sequences in rtl 
>> > (for
>> > RL78).
>> >
>> > Any thoughts? Thank you!
>> >
>> > Best Regards,
>> > Sebastian
>
>
>
> Renesas Electronics Europe Ltd, Dukes Meadow, Millboard Road, Bourne End, 
> Buckinghamshire, SL8 5FH, UK. Registered in England & Wales under Registered 
> No. 04586709.

[PATCH, i386]: Backport a small retpoline improvement to gcc-7 branch

2018-01-18 Thread Uros Bizjak

Hello!

I'd like to backport a small improvement to retpoline functionality to
gcc-7 branch.

2018-01-17  Uros Bizjak  

* config/i386/i386.c (indirect_thunk_name): Declare regno
as unsigned int.  Compare regno with INVALID_REGNUM.
(output_indirect_thunk): Ditto.
(output_indirect_thunk_function): Ditto.
(ix86_code_end): Declare regno as unsigned int.  Use INVALID_REGNUM
in the call to output_indirect_thunk_function.

OK for branch?

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 256797)
+++ config/i386/i386.c  (working copy)
@@ -12051,16 +12051,16 @@ static int indirect_thunks_bnd_used;
 /* Fills in the label name that should be used for the indirect thunk.  */
 
 static void
-indirect_thunk_name (char name[32], int regno, bool need_bnd_p,
-bool ret_p)
+indirect_thunk_name (char name[32], unsigned int regno,
+bool need_bnd_p, bool ret_p)
 {
-  if (regno >= 0 && ret_p)
+  if (regno != INVALID_REGNUM && ret_p)
 gcc_unreachable ();
 
   if (USE_HIDDEN_LINKONCE)
 {
   const char *bnd = need_bnd_p ? "_bnd" : "";
-  if (regno >= 0)
+  if (regno != INVALID_REGNUM)
{
  const char *reg_prefix;
  if (LEGACY_INT_REGNO_P (regno))
@@ -12078,7 +12078,7 @@ static void
 }
   else
 {
-  if (regno >= 0)
+  if (regno != INVALID_REGNUM)
{
  if (need_bnd_p)
ASM_GENERATE_INTERNAL_LABEL (name, "LITBR", regno);
@@ -12130,7 +12130,7 @@ static void
  */
 
 static void
-output_indirect_thunk (bool need_bnd_p, int regno)
+output_indirect_thunk (bool need_bnd_p, unsigned int regno)
 {
   char indirectlabel1[32];
   char indirectlabel2[32];
@@ -12160,7 +12160,7 @@ static void
 
   ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
 
-  if (regno >= 0)
+  if (regno != INVALID_REGNUM)
 {
   /* MOV.  */
   rtx xops[2];
@@ -12184,12 +12184,12 @@ static void
 }
 
 /* Output a funtion with a call and return thunk for indirect branch.
-   If BND_P is true, the BND prefix is needed.   If REGNO != -1,  the
-   function address is in REGNO.  Otherwise, the function address is
+   If BND_P is true, the BND prefix is needed.  If REGNO != UNVALID_REGNUM,
+   the function address is in REGNO.  Otherwise, the function address is
on the top of stack.  */
 
 static void
-output_indirect_thunk_function (bool need_bnd_p, int regno)
+output_indirect_thunk_function (bool need_bnd_p, unsigned int regno)
 {
   char name[32];
   tree decl;
@@ -12238,7 +12238,7 @@ static void
ASM_OUTPUT_LABEL (asm_out_file, name);
   }
 
-  if (regno < 0)
+  if (regno == INVALID_REGNUM)
 {
   /* Create alias for __x86.return_thunk/__x86.return_thunk_bnd.  */
   char alias[32];
@@ -12312,16 +12312,16 @@ static void
 ix86_code_end (void)
 {
   rtx xops[2];
-  int regno;
+  unsigned int regno;
 
   if (indirect_thunk_needed)
-output_indirect_thunk_function (false, -1);
+output_indirect_thunk_function (false, INVALID_REGNUM);
   if (indirect_thunk_bnd_needed)
-output_indirect_thunk_function (true, -1);
+output_indirect_thunk_function (true, INVALID_REGNUM);
 
   for (regno = FIRST_REX_INT_REG; regno <= LAST_REX_INT_REG; regno++)
 {
-  int i = regno - FIRST_REX_INT_REG + LAST_INT_REG + 1;
+  unsigned int i = regno - FIRST_REX_INT_REG + LAST_INT_REG + 1;
   if ((indirect_thunks_used & (1 << i)))
output_indirect_thunk_function (false, regno);

Fix call of edge insertions hooks when cloning

2018-01-18 Thread Jan Hubicka

Hi,
this patch was attached to the PR for a while.  I have regtested it and 
comitted.

Honza

Index: ChangeLog
===
--- ChangeLog   (revision 256850)
+++ ChangeLog   (working copy)
@@ -1,5 +1,12 @@
 2018-01-18  Jan Hubicka  
 
+   PR ipa/82256
+   patch by PaX Team
+   * cgraphclones.c (cgraph_node::create_version_clone_with_body):
+   Fix call of call_cgraph_insertion_hooks.
+
+2018-01-18  Jan Hubicka  
+
PR ipa/83619
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Update edge
frequencies.
Index: cgraphclones.c
===
--- cgraphclones.c  (revision 256849)
+++ cgraphclones.c  (working copy)
@@ -1007,7 +1007,7 @@ cgraph_node::create_version_clone_with_b
   /* Update the call_expr on the edges to call the new version node. */
   update_call_expr (new_version_node);
 
-  symtab->call_cgraph_insertion_hooks (this);
+  symtab->call_cgraph_insertion_hooks (new_version_node);
   return new_version_node;
 }

Re: [PATCH] document -Wclass-memaccess suppression by casting (PR 81327)

2018-01-18 Thread Martin Sebor


On 01/15/2018 03:09 AM, Florian Weimer wrote:

* Martin Sebor:


+the virtual table.  Modifying the representation of such objects may violate

   ^vtable pointer?

The vtable itself is not corrupted, I assume.


Well, what happens is undefined, so who knows? ;)

But of course in reality, writing to an object can only overwrite
what's stored in the object, so only the vtable pointers will be
corrupted.  I doubt anyone would have misunderstood what was meant
but I've committed r256852 for the sake of being accurate.

Martin

RE: New code merge optimization?

2018-01-18 Thread Sebastian Perta

Hi,

Thank you very much!
I tried with and without -fcode-hoisting the file I mentioned earlier 
gcc/testsuite/gcc.c-torture/execute/pr58574.c but it remains the same.
But I can certainly extend on this.

Sorry I didn't found out this myself! I'll do more research next time before I 
start asking questions.

Best Regards,
Sebastian



> -Original Message-
> From: Andrew Pinski [mailto:pins...@gmail.com]
> Sent: 18 January 2018 16:19
> To: Sebastian Perta 
> Cc: Martin Jambor ; gcc-patches@gcc.gnu.org;
> l...@gcc.gnu.org
> Subject: Re: New code merge optimization?
>
> On Thu, Jan 18, 2018 at 8:03 AM, Sebastian Perta
>  wrote:
> > Hi,
> >
> > Thank you!
> > As the description says this finds equivalent functions, I would like to 
> > find
> identical sequences inside
> > functions but at least this will provide all the up to date tools to compare
> code sequences, thank you again!
>
> There is already some identical sequence finding (code hoisting) in
> GCC 7 (and above), see
> https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00360.html
> for the patch.
>
> Thanks,
> Andrew
>
> >
> > Best Regards,
> > Sebastian
> >
> >
> >> -Original Message-
> >> From: Martin Jambor [mailto:mjam...@suse.cz]
> >> Sent: 18 January 2018 15:18
> >> To: Sebastian Perta ; gcc-
> >> patc...@gcc.gnu.org
> >> Cc: l...@gcc.gnu.org
> >> Subject: Re: New code merge optimization?
> >>
> >> Hi,
> >>
> >> On Thu, Jan 18 2018, Sebastian Perta wrote:
> >> > Hello,
> >> >
> >> > I am interested in implementing a new pass in gcc to merge identical
> >> > sequences of code in GCC to be used mainly for RL78.
> >> > The commercial RL78 compilers have such algorithms implemented and
> >> they make
> >> > quite good use of it.
> >> > Opportunities arise from the limited capabilities of RL78, for other
> targets
> >> > this might be a lot less useful.
> >> >
> >> > A while ago I found the following:
> >> > https://www.gnu.org/software/gcc/projects/cfo.html
> >> > And I ported all algorithms to gcc 4.9.2 and tried it on RL78 and RX and
> >> > this is what I found out:
> >> > For RX: no visible improvements with any of them
> >> > For RL78: some minor improvements only with -frtl-seqbastr:
> >> > Compiling all the C files from gcc/testsuite/gcc.c-torture/execute/*c
> with
> >> > "-Os" and "-Os  -frtl-seqabstr" (using the modified gcc 4.9.2)
> >> > The algorithm was effective only in 60 files(out of 1643 files, that's 
> >> > only
> >> > 0.03% of the files currently present in gcc/testsuite/gcc.c-
> torture/execute)
> >> > On those 60 files I got an average of 6.5% improvement with the best
> >> > improvement for pr58574.c (36.4%).
> >> >
> >> > What do you think: is it worthwhile porting this to the trunk or I will 
> >> > just
> >> > waste my time?
> >> > Or should I start fresh? Maybe start from here:
> >> > http://llvm.org/docs/MergeFunctions.html?
> >>
> >> Martin Liška contributed identical code folding to GCC quite a few years
> >> ago now.  Look up the -fipa-icf option.  If that is not enough for you,
> >> starting from that seems more natural.
> >>
> >> Martin
> >>
> >> >
> >> > Gimple or rtl?
> >> > I suppose the preferred way will be to do this in gimple; however based
> on
> >> > what I did so far,  it's more likely to find identical sequences in rtl 
> >> > (for
> >> > RL78).
> >> >
> >> > Any thoughts? Thank you!
> >> >
> >> > Best Regards,
> >> > Sebastian
> >
> >
> >
> > Renesas Electronics Europe Ltd, Dukes Meadow, Millboard Road, Bourne
> End, Buckinghamshire, SL8 5FH, UK. Registered in England & Wales under
> Registered No. 04586709.



Renesas Electronics Europe Ltd, Dukes Meadow, Millboard Road, Bourne End, 
Buckinghamshire, SL8 5FH, UK. Registered in England & Wales under Registered 
No. 04586709.

[Committed][AArch64] Fix fp16 test failures after PR82964 fix

2018-01-18 Thread Wilco Dijkstra

This fixes test failures in gcc.target/aarch64/f16_mov_immediate_*.c
after fixing PR82964 
(https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01636.html). 
The check for a scalar floating point constant didn't include 16-bit floating 
point modes,
so use GET_MODE_CLASS instead.

This fixes the failures and has no effect otherwise.  Committed as trivial fix.

ChangeLog:
2018-01-18  Wilco Dijkstra  

gcc/
PR target/82964
* config/aarch64/aarch64.c (aarch64_legitimate_constant_p):
Use GET_MODE_CLASS for scalar floating point.
--

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
671168cc3f32ac3f2669fc79835435fdf618a56d..c6a83c881038873d8b68e36f906783be63ddde56
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11813,8 +11813,7 @@ aarch64_legitimate_constant_p (machine_mode mode, rtx x)
 {
   /* Support CSE and rematerialization of common constants.  */
   if (CONST_INT_P (x)
-  || (CONST_DOUBLE_P (x)
- && (mode == SFmode || mode == DFmode || mode == TFmode))
+  || (CONST_DOUBLE_P (x) && GET_MODE_CLASS (mode) == MODE_FLOAT)
   || GET_CODE (x) == CONST_VECTOR)
 return true;

Re: [PATCH] PR82964: Fix 128-bit immediate ICEs

2018-01-18 Thread Wilco Dijkstra

Christophe Lyon wrote:

> After this patch (r256800), I have noticed new failures on aarch64:
>    gcc.target/aarch64/f16_mov_immediate_1.c scan-assembler-times
> mov\tw[0-9]+, #?19520 3 (found 0 times)

Thanks for spotting these, the scripts appear to have missed those
(contrib/dg-cmp-results.sh skipped them eventhough they were in
contrib/compare_tests...). Anyway these are fixed now.

Wilco

Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)

2018-01-18 Thread Jakub Jelinek

On Thu, Jan 18, 2018 at 08:10:50AM -0500, John David Anglin wrote:
> On 2018-01-17 3:07 PM, Jakub Jelinek wrote:
> > PR83882 complains that PR81715 testcase fails on callee copies parameter
> > targets.  The following patch ought to fix that, but I have only
> > bootstrapped/regtested it on x86_64-linux and i686-linux + on the testcase
> > with hppa.
> > 
> > John, do you think you could test this on hppa without the callee copies
> > default change?
> On hppa2.0w-hp-hpux11.11, pr81715.C now passes.
> 
> However, there are some new fails in the gcc and g++ suites:

That is really weird, at least in the cross-compiler on throw-1.C I don't
see with -mno-caller-copies the new code triggering at all.
Tried both hppa-linux and hppa-hpux crosses.
Furthermore, throw-1.C has:
/* { dg-skip-if "Stack alignment is too small" { hppa*-*-hpux* } } */
so I wonder how it comes you see them run on hpux at all.
I don't have hpux or hppa-linux fenv.h, so can't check pr52451.c.
What kind of error would be on pr83487-*?

If any of those are really caused by my patch, I'd be interest to see
e.g. *.gimple and *.optimized dumps with and without the patch
(rather than reverting the whole patch, it should be enough to hand edit
in function.c the
&& flag_stack_reuse != SR_NONE
part and change it to && 0 && flag_stack_reuse != SR_NONE or similar.

> FAIL: g++.dg/torture/stackalign/throw-1.C   -O0  execution test
> FAIL: g++.dg/torture/stackalign/throw-1.C   -O0 -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-1.C   -O1 -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-1.C   -O2 -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-1.C   -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-1.C   -O3 -g -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-1.C   -Os -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-2.C   -O0  execution test
> FAIL: g++.dg/torture/stackalign/throw-2.C   -O0 -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-2.C   -O1 -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-2.C   -O2 -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-2.C   -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-2.C   -O3 -g -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-2.C   -Os -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-4.C   -O0  execution test
> FAIL: g++.dg/torture/stackalign/throw-4.C   -O0 -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-4.C   -O1 -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-4.C   -O2 -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-4.C   -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-4.C   -O3 -g -fpic execution test
> FAIL: g++.dg/torture/stackalign/throw-4.C   -Os -fpic execution test
> 
> FAIL: gcc.dg/torture/pr52451.c   -O0  execution test
> FAIL: gcc.dg/torture/pr52451.c   -O1  execution test
> FAIL: gcc.dg/torture/pr52451.c   -O2  execution test
> FAIL: gcc.dg/torture/pr52451.c   -O3 -g  execution test
> FAIL: gcc.dg/torture/pr52451.c   -Os  execution test
> 
> FAIL: gcc.dg/compat/pr83487-1 c_compat_y_tst.o compile
> FAIL: gcc.dg/compat/pr83487-2 c_compat_y_tst.o compile
> 
> Dave
> 
> -- 
> John David Anglin  dave.ang...@bell.net

Jakub

Re: [PATCH], Fix PR target/pr83862: Fix PowerPC long double signbit with -mabi=ieeelongdouble

2018-01-18 Thread Segher Boessenkool

On Wed, Jan 17, 2018 at 06:40:12PM -0500, Michael Meissner wrote:
> On Wed, Jan 17, 2018 at 04:09:57PM -0600, Segher Boessenkool wrote:
> > On Tue, Jan 16, 2018 at 10:55:43PM -0500, Michael Meissner wrote:
> > > PR target/83862 pointed out a problem I put into the 128-bit floating 
> > > point
> > > type signbit optimization.  The issue is we want to avoid doing a load to 
> > > a
> > > floating point/vector register and then a direct move to do signbit, so we
> > > change the load to load the upper 64-bits of the floating point value to 
> > > get
> > > the sign bit.  Unfortunately, if the type is IEEE 128-bit and memory is
> > > addressed with an indexed address on a little endian system, it generates 
> > > an
> > > illegal address and generates an internal compiler error.
> > 
> > So all this is caused by these splitters running after reload.  Why do
> > we have to do that?  Do we?  We should be able to just change it to a
> > subreg and shift, early already?
> 
> The part that is failing is trying to optimize the case:
> 
>   x = signbit (*p)
> 
> Doing the code with just registers means that you will get:
> 
>   vr = load *p
>   gr = diret move from vr
>   shift
> 
> With the optimization you get:
> 
>   gr = 'upper' word
>   shift
> 
> If the address is:
> 
>   base + index
> 
> In little endian, the compiler tried to generate:
> 
>   (base + index) + 8
> 
> And it didn't have a temporary register to put base+index.

Yes, but you won't have this problem if you do this before RA.  Is there
any good reason to do it only after RA?

If you do it before RA you immediately get

some_di = load upper from *p
shift

just as you want (and things might even be optimised further).


Segher

Re: [PATCH], Fix PR target/pr83862: Fix PowerPC long double signbit with -mabi=ieeelongdouble

2018-01-18 Thread Michael Meissner

On Thu, Jan 18, 2018 at 12:39:05PM -0600, Segher Boessenkool wrote:
> On Wed, Jan 17, 2018 at 06:40:12PM -0500, Michael Meissner wrote:
> > On Wed, Jan 17, 2018 at 04:09:57PM -0600, Segher Boessenkool wrote:
> > > On Tue, Jan 16, 2018 at 10:55:43PM -0500, Michael Meissner wrote:
> > > > PR target/83862 pointed out a problem I put into the 128-bit floating 
> > > > point
> > > > type signbit optimization.  The issue is we want to avoid doing a load 
> > > > to a
> > > > floating point/vector register and then a direct move to do signbit, so 
> > > > we
> > > > change the load to load the upper 64-bits of the floating point value 
> > > > to get
> > > > the sign bit.  Unfortunately, if the type is IEEE 128-bit and memory is
> > > > addressed with an indexed address on a little endian system, it 
> > > > generates an
> > > > illegal address and generates an internal compiler error.
> > > 
> > > So all this is caused by these splitters running after reload.  Why do
> > > we have to do that?  Do we?  We should be able to just change it to a
> > > subreg and shift, early already?
> > 
> > The part that is failing is trying to optimize the case:
> > 
> > x = signbit (*p)
> > 
> > Doing the code with just registers means that you will get:
> > 
> > vr = load *p
> > gr = diret move from vr
> > shift
> > 
> > With the optimization you get:
> > 
> > gr = 'upper' word
> > shift
> > 
> > If the address is:
> > 
> > base + index
> > 
> > In little endian, the compiler tried to generate:
> > 
> > (base + index) + 8
> > 
> > And it didn't have a temporary register to put base+index.
> 
> Yes, but you won't have this problem if you do this before RA.  Is there
> any good reason to do it only after RA?

Yes you will, because it is a memory address not a register.

> If you do it before RA you immediately get
> 
>   some_di = load upper from *p
>   shift
> 
> just as you want (and things might even be optimised further).

And in my experience, splitting such loads and changing the type before RA,
often times leads to error.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH] handle local aggregate initialization in strlen (PR 83821)

2018-01-18 Thread Martin Sebor


Ping: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01131.html

This was submitted in stage 3 but if fixing xfailed assertions
in tests by enhancing optimizations is out of scope for the
current stage let me know so I can schedule this change for
GCC 9.

On 01/12/2018 02:30 PM, Martin Sebor wrote:

A failure in a test for the recently enhanced -Warray-bounds
warning exposed an unnecessarily broad restriction in the strlen
pass that prevents it from tracking the length of a member string
of locally defined and initialized struct:

  void f (void)
  {
struct { char s[8]; int i } a = { "1234", 5 };

if (strlen (a.s) != 4)   // not folded
  abort ();
   }

IIUC, the restriction was in place to account for writes into
an array changing or invalidating the length of a string stored
in its initial elements.  This would happen if the write either
changed the string's terminating nul byte, or if it reset one
of the prior non-nul bytes.

To reflect just this intent the restriction can be tightened
up to improve the pass' ability to track even the lengths of
string members of locally initialized aggregates.  Besides
leading to better code this change also clears up the test
failure.

Tested on x86_64-linux.

Martin

[PATCH, libstdc++]: Fix configure check for linux/random.h

2018-01-18 Thread Uros Bizjak

Hello!

linux/random.h uses __u32 types, which are defined through inclusion
of linux/types.h header. On some older systems (e.g. CentOS 5.11)
linux/random.h doesn't include linux/types.h by itself, and configure
declares found linux/random.h as unusable.

Attached patch fixes detection by including linux/types.h in the
configure check.

2018-01-18  Uros Bizjak  

* configure.ac (AC_CHECK_HEADERS): Add linux/types.h.  Conditionally
include linux/types.h when checking linux/random.h header.
* config.h.in: Regenerate.
* configure: Ditto.
* src/c++11/random.cc: Conditionally include linux/types.h.

Patch was bootstrapped and regression tested on x86_64-linux-gnu
{,-m32} CentOS 5.11 and Fedora 26.

OK for mainline?

Uros.
diff --git a/libstdc++-v3/config.h.in b/libstdc++-v3/config.h.in
index e31e054..5a0f067 100644
--- a/libstdc++-v3/config.h.in
+++ b/libstdc++-v3/config.h.in
@@ -270,6 +270,9 @@
 /* Define to 1 if you have the  header file. */
 #undef HAVE_LINUX_RANDOM_H
 
+/* Define to 1 if you have the  header file. */
+#undef HAVE_LINUX_TYPES_H
+
 /* Define to 1 if you have the  header file. */
 #undef HAVE_LOCALE_H
 
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index f5798d3..69845f4 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -21799,7 +21799,7 @@ for ac_header in endian.h execinfo.h float.h fp.h 
ieeefp.h inttypes.h \
 locale.h machine/endian.h machine/param.h nan.h stdint.h stdlib.h string.h \
 strings.h sys/ipc.h sys/isa_defs.h sys/machine.h sys/param.h \
 sys/resource.h sys/sem.h sys/stat.h sys/time.h sys/types.h unistd.h \
-wchar.h wctype.h linux/random.h
+wchar.h wctype.h linux/types.h
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" 
"$ac_includes_default"
@@ -21814,6 +21814,23 @@ fi
 done
 
 
+for ac_header in linux/random.h
+do :
+  ac_fn_c_check_header_compile "$LINENO" "linux/random.h" 
"ac_cv_header_linux_random_h" "#ifdef HAVE_LINUX_TYPES_H
+# include 
+#endif
+
+"
+if test "x$ac_cv_header_linux_random_h" = x""yes; then :
+  cat >>confdefs.h <<_ACEOF
+#define HAVE_LINUX_RANDOM_H 1
+_ACEOF
+
+fi
+
+done
+
+
 # Only do link tests if native. Else, hardcode.
 if $GLIBCXX_IS_NATIVE; then
 
diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index c90c509..e110f57 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -237,7 +237,13 @@ AC_CHECK_HEADERS([endian.h execinfo.h float.h fp.h 
ieeefp.h inttypes.h \
 locale.h machine/endian.h machine/param.h nan.h stdint.h stdlib.h string.h \
 strings.h sys/ipc.h sys/isa_defs.h sys/machine.h sys/param.h \
 sys/resource.h sys/sem.h sys/stat.h sys/time.h sys/types.h unistd.h \
-wchar.h wctype.h linux/random.h])
+wchar.h wctype.h linux/types.h])
+
+AC_CHECK_HEADERS([linux/random.h], [], [],
+[[#ifdef HAVE_LINUX_TYPES_H
+# include 
+#endif
+]])
 
 # Only do link tests if native. Else, hardcode.
 if $GLIBCXX_IS_NATIVE; then
diff --git a/libstdc++-v3/src/c++11/random.cc b/libstdc++-v3/src/c++11/random.cc
index 415f794..19daf6d 100644
--- a/libstdc++-v3/src/c++11/random.cc
+++ b/libstdc++-v3/src/c++11/random.cc
@@ -42,6 +42,10 @@
 # include 
 #endif
 
+#ifdef _GLIBCXX_HAVE_LINUX_TYPES_H
+# include 
+#endif
+
 #ifdef _GLIBCXX_HAVE_LINUX_RANDOM_H
 # include 
 #endif

Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)

2018-01-18 Thread John David Anglin


On 2018-01-18 12:51 PM, Jakub Jelinek wrote:

On Thu, Jan 18, 2018 at 08:10:50AM -0500, John David Anglin wrote:

On 2018-01-17 3:07 PM, Jakub Jelinek wrote:

PR83882 complains that PR81715 testcase fails on callee copies parameter
targets.  The following patch ought to fix that, but I have only
bootstrapped/regtested it on x86_64-linux and i686-linux + on the testcase
with hppa.

John, do you think you could test this on hppa without the callee copies
default change?

On hppa2.0w-hp-hpux11.11, pr81715.C now passes.

However, there are some new fails in the gcc and g++ suites:

That is really weird, at least in the cross-compiler on throw-1.C I don't
see with -mno-caller-copies the new code triggering at all.
Tried both hppa-linux and hppa-hpux crosses.
Furthermore, throw-1.C has:
/* { dg-skip-if "Stack alignment is too small" { hppa*-*-hpux* } } */
Sorry, these fails are my fault.  I tweaked the skip to just hppa*64*.  
Nominally, 32-bit
hpux has 64 byte stack alignment but maybe alignment in signal frames is 
insufficient.

so I wonder how it comes you see them run on hpux at all.
I don't have hpux or hppa-linux fenv.h, so can't check pr52451.c.

Will have to investigate.  I see they predate your patch.

What kind of error would be on pr83487-*?

Digging into log, I see:

spawn /test/gnu/gcc/objdir/gcc/xgcc -B/test/gnu/gcc/objdir/gcc/ 
-fno-diagnostics-show-caret -fdiagnostics-color=never 
-DSKIP_DECIMAL_FLOAT -c -o c_compat_y_tst.o 
/test/gnu/gcc/gcc/gcc/testsuite/gcc.dg/compat//pr83487-1_y.c
/test/gnu/gcc/gcc/gcc/testsuite/gcc.dg/compat//pr83487-1_y.c:27:1: 
warning: alignment (16) for a exceeds maximum alignment for global 
common data.  Using 8

output is:
/test/gnu/gcc/gcc/gcc/testsuite/gcc.dg/compat//pr83487-1_y.c:27:1: 
warning: alignment (16) for a exceeds maximum alignment for global 
common data.  Using 8


FAIL: gcc.dg/compat/pr83487-1 c_compat_y_tst.o compile

The typical fix is to add "-fno-common" but I'm not sure where it would 
go in this case.


So, I don't see any regressions from you change.

Thanks,
Dave

--
John David Anglin  dave.ang...@bell.net

Re: [PATCH, libstdc++]: Fix configure check for linux/random.h

2018-01-18 Thread Jonathan Wakely


On 18/01/18 20:31 +0100, Uros Bizjak wrote:

Hello!

linux/random.h uses __u32 types, which are defined through inclusion
of linux/types.h header. On some older systems (e.g. CentOS 5.11)
linux/random.h doesn't include linux/types.h by itself, and configure
declares found linux/random.h as unusable.

Attached patch fixes detection by including linux/types.h in the
configure check.

2018-01-18  Uros Bizjak  

   * configure.ac (AC_CHECK_HEADERS): Add linux/types.h.  Conditionally
   include linux/types.h when checking linux/random.h header.
   * config.h.in: Regenerate.
   * configure: Ditto.
   * src/c++11/random.cc: Conditionally include linux/types.h.

Patch was bootstrapped and regression tested on x86_64-linux-gnu
{,-m32} CentOS 5.11 and Fedora 26.

OK for mainline?


OK, thanks.

Re: Go patch committed: Enable escape analysis for runtime

2018-01-18 Thread Rainer Orth

Hi Ian,

> This patch to the Go frontend by Cherry Zhang enables escape analysis
> for the runtime package in the Go frontend.  The runtime package was
> hard-coded non-escape, and the escape analysis was not run for the
> runtime package.  This patch removes the hard-code, and lets the
> escape analysis decide.  It is not allowed for local variables and
> closures in the runtime to be heap allocated. This patch adds the
> check that make sure that they indeed do not escape.
>
> The escape analysis is always run when compiling the runtime now.
>
> Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
> to mainline.

this broke Solaris bootstrap:

/vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:182:57: error: 
n escapes to heap, not allowed in runtime
  if port_getn(portfd, &events[0], uint32(len(events)), &n, wait) < 0 {
 ^
/vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:176:11: error: 
zero escapes to heap, not allowed in runtime
   wait = &zero
   ^
/vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:174:6: error: 
.runtime.zero escapes to heap, not allowed in runtime
  var zero timespec
  ^
/vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:179:6: error: 
.runtime.events escapes to heap, not allowed in runtime
  var events [128]portevent
  ^
/vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:181:6: error: 
.runtime.n escapes to heap, not allowed in runtime
  var n uint32 = 1
  ^

Fixed as follows.  i386-pc-solaris2.11 and sparc-sun-solaris2.11 builds
have completed, make check still running.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


diff --git a/libgo/go/runtime/netpoll_solaris.go b/libgo/go/runtime/netpoll_solaris.go
--- a/libgo/go/runtime/netpoll_solaris.go
+++ b/libgo/go/runtime/netpoll_solaris.go
@@ -84,6 +84,7 @@ func port_associate(port, source int32, 
 //extern port_dissociate
 func port_dissociate(port, source int32, object uintptr) int32
 
+//go:noescape
 //extern port_getn
 func port_getn(port int32, evs *portevent, max uint32, nget *uint32, timeout *timespec) int32

C++ PATCH for c++/82461, member list-initialization and -std=c++17

2018-01-18 Thread Jason Merrill

In C++17 mode, build_special_member_call builds up an initializer for
the object and then uses it directly rather than pass it to the copy
constructor.  This pattern results in a TARGET_EXPR showing up in
potential_constant_expression, which complains about a TARGET_EXPR
indicating a temporary of non-literal type, which it isn't in this
case because it's being used to initialize a non-temporary object.
This patch recognizes that case by checking TARGET_EXPR_DIRECT_INIT_P.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 9084980615c1aa29277fee9fe33a0b52bbc0ab94
Author: Jason Merrill 
Date:   Wed Jan 17 17:24:16 2018 -0500

PR c++/82461 - constexpr list-initialized member

* constexpr.c (potential_constant_expression_1): Check
TARGET_EXPR_DIRECT_INIT_P.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 8984613aa41..9a548d29bbc 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -5726,7 +5726,8 @@ potential_constant_expression_1 (tree t, bool want_rval, 
bool strict, bool now,
   return RECUR (TREE_OPERAND (t, 1), want_rval);
 
 case TARGET_EXPR:
-  if (!literal_type_p (TREE_TYPE (t)))
+  if (!TARGET_EXPR_DIRECT_INIT_P (t)
+ && !literal_type_p (TREE_TYPE (t)))
{
  if (flags & tf_error)
{
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-list2.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-list2.C
new file mode 100644
index 000..780a64dbbc4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-list2.C
@@ -0,0 +1,20 @@
+// PR c++/82461
+// { dg-do compile { target c++11 } }
+
+class A {
+private:
+public:
+  constexpr A() {}
+  ~A() {}
+};
+
+class B {
+private:
+  A  a;
+public:
+  constexpr B() : a{} {}
+// works
+// constexpr B() : a() {}
+
+  ~B() {}
+};

Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)

2018-01-18 Thread John David Anglin


On 2018-01-18 2:44 PM, John David Anglin wrote:

I don't have hpux or hppa-linux fenv.h, so can't check pr52451.c.

Will have to investigate.  I see they predate your patch.
This is a test issue.  The long double support on hpux is a software 
emulation

and it's not supported the fenv.h routines.

Is this change okay?

Dave

--
John David Anglin  dave.ang...@bell.net

2018-01-18  John David Anglin  

* gcc.dg/torture/pr52451.c (main): Skip long double test on
hppa*-*-hpux*.

Index: gcc.dg/torture/pr52451.c
===
--- gcc.dg/torture/pr52451.c(revision 256744)
+++ gcc.dg/torture/pr52451.c(working copy)
@@ -49,7 +49,9 @@
 
   TEST (float, f);
   TEST (double, );
+#if !defined(__hppa__) || !defined(__hpux__)
   TEST (long double, l);
+#endif
   
   return 0;
 }

Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)

2018-01-18 Thread Jakub Jelinek

On Thu, Jan 18, 2018 at 03:31:38PM -0500, John David Anglin wrote:
> On 2018-01-18 2:44 PM, John David Anglin wrote:
> > > I don't have hpux or hppa-linux fenv.h, so can't check pr52451.c.
> > Will have to investigate.  I see they predate your patch.
> This is a test issue.  The long double support on hpux is a software
> emulation
> and it's not supported the fenv.h routines.

I think this explanation belongs into the testcase as a comment
to explain why you aren't testing it on hppa-hpux.

Ok with that change.

> 2018-01-18  John David Anglin  
> 
>   * gcc.dg/torture/pr52451.c (main): Skip long double test on
>   hppa*-*-hpux*.
> 
> Index: gcc.dg/torture/pr52451.c
> ===
> --- gcc.dg/torture/pr52451.c  (revision 256744)
> +++ gcc.dg/torture/pr52451.c  (working copy)
> @@ -49,7 +49,9 @@
>  
>TEST (float, f);
>TEST (double, );
> +#if !defined(__hppa__) || !defined(__hpux__)
>TEST (long double, l);
> +#endif
>
>return 0;
>  }


Jakub

[C++ Patch, obvious] Fix comment typo

2018-01-18 Thread Paolo Carlini


Hi,

I noticed the below, I'm committing the fix it as obvious.

Cheers, Paolo.

/

2018-01-18  Paolo Carlini  

* cp-tree.h: Fix comment typo (DECL_NON_TRIVIALLY_INITIALIZED_P
vs DECL_NONTRIVIALLY_INITIALIZED_P).
Index: cp-tree.h
===
--- cp-tree.h   (revision 256864)
+++ cp-tree.h   (working copy)
@@ -407,7 +407,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
   ICS_BAD_FLAG (in _CONV)
   FN_TRY_BLOCK_P (in TRY_BLOCK)
   BIND_EXPR_BODY_BLOCK (in BIND_EXPR)
-  DECL_NON_TRIVIALLY_INITIALIZED_P (in VAR_DECL)
+  DECL_NONTRIVIALLY_INITIALIZED_P (in VAR_DECL)
   CALL_EXPR_ORDERED_ARGS (in CALL_EXPR, AGGR_INIT_EXPR)
   DECLTYPE_FOR_REF_CAPTURE (in DECLTYPE_TYPE)
   CONSTRUCTOR_C99_COMPOUND_LITERAL (in CONSTRUCTOR)

Re: std::vector default default and move constructors

2018-01-18 Thread François Dumont


On 16/01/2018 01:20, Jonathan Wakely wrote:

On 15/01/18 22:32 +0100, François Dumont wrote:

On 15/01/2018 13:29, Jonathan Wakely wrote:

In fact it introduces a serious regression because of this line:


-  vector(vector&& __x) noexcept
-  : _Base(std::move(__x)) { }
+  vector(vector&&) = default;


Consider what happens if we have an allocator that is not
nothrow-move-constructible, which can happen if the allocator has a
noexcept(false) move constructor, or more likely just has no move
constructor but has a noexcept(false) copy constructor:


One of the move constructors needs to be defined with a function body,
not defaulted, so that it can also be declared noexcept


Here is an updated patch with this change. Do you also want an explicit 
noexcept on the vector(vector&&) default declaration ? Ok to commit this 
vector patch ?


I was going to propose a similar patch for vector but checking 
Standard there is not this noexcept qualification on the move constructor.


François

diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h
index acec501..a6282ec 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -85,34 +85,56 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   typedef typename __gnu_cxx::__alloc_traits<_Tp_alloc_type>::pointer
	pointer;
 
-  struct _Vector_impl
-  : public _Tp_alloc_type
+  struct _Vector_impl_data
   {
 	pointer _M_start;
 	pointer _M_finish;
 	pointer _M_end_of_storage;
 
-	_Vector_impl()
-	: _Tp_alloc_type(), _M_start(), _M_finish(), _M_end_of_storage()
-	{ }
-
-	_Vector_impl(_Tp_alloc_type const& __a) _GLIBCXX_NOEXCEPT
-	: _Tp_alloc_type(__a), _M_start(), _M_finish(), _M_end_of_storage()
+	_Vector_impl_data() _GLIBCXX_NOEXCEPT
+	: _M_start(), _M_finish(), _M_end_of_storage()
 	{ }
 
 #if __cplusplus >= 201103L
-	_Vector_impl(_Tp_alloc_type&& __a) noexcept
-	: _Tp_alloc_type(std::move(__a)),
-	  _M_start(), _M_finish(), _M_end_of_storage()
-	{ }
+	_Vector_impl_data(_Vector_impl_data&& __x) noexcept
+	: _M_start(__x._M_start), _M_finish(__x._M_finish),
+	  _M_end_of_storage(__x._M_end_of_storage)
+	{ __x._M_start = __x._M_finish = __x._M_end_of_storage = pointer(); }
 #endif
 
-	void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT
+	void
+	_M_swap_data(_Vector_impl_data& __x) _GLIBCXX_NOEXCEPT
 	{
 	  std::swap(_M_start, __x._M_start);
 	  std::swap(_M_finish, __x._M_finish);
 	  std::swap(_M_end_of_storage, __x._M_end_of_storage);
 	}
+  };
+
+  struct _Vector_impl
+	: public _Tp_alloc_type, public _Vector_impl_data
+  {
+	_Vector_impl() _GLIBCXX_NOEXCEPT_IF( noexcept(_Tp_alloc_type()) )
+	: _Tp_alloc_type()
+	{ }
+
+	_Vector_impl(_Tp_alloc_type const& __a) _GLIBCXX_NOEXCEPT
+	: _Tp_alloc_type(__a)
+	{ }
+
+#if __cplusplus >= 201103L
+	_Vector_impl(_Vector_impl&& __x) noexcept
+	: _Tp_alloc_type(std::move(__x)), _Vector_impl_data(std::move(__x))
+	{ }
+
+	_Vector_impl(_Tp_alloc_type&& __a) noexcept
+	  : _Tp_alloc_type(std::move(__a))
+	{ }
+
+	_Vector_impl(_Tp_alloc_type&& __a, _Vector_impl&& __rv) noexcept
+	: _Tp_alloc_type(std::move(__a)), _Vector_impl_data(std::move(__rv))
+	{ }
+#endif
 
 #if _GLIBCXX_SANITIZE_STD_ALLOCATOR && _GLIBCXX_SANITIZE_VECTOR
 	template
@@ -235,38 +257,42 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
   _Tp_alloc_type&
   _M_get_Tp_allocator() _GLIBCXX_NOEXCEPT
-  { return *static_cast<_Tp_alloc_type*>(&this->_M_impl); }
+  { return this->_M_impl; }
 
   const _Tp_alloc_type&
   _M_get_Tp_allocator() const _GLIBCXX_NOEXCEPT
-  { return *static_cast(&this->_M_impl); }
+  { return this->_M_impl; }
 
   allocator_type
   get_allocator() const _GLIBCXX_NOEXCEPT
   { return allocator_type(_M_get_Tp_allocator()); }
 
-  _Vector_base()
-  : _M_impl() { }
+#if __cplusplus >= 201103L
+  _Vector_base() = default;
+#else
+  _Vector_base() { }
+#endif
 
   _Vector_base(const allocator_type& __a) _GLIBCXX_NOEXCEPT
   : _M_impl(__a) { }
 
+#if !_GLIBCXX_INLINE_VERSION
   _Vector_base(size_t __n)
   : _M_impl()
   { _M_create_storage(__n); }
+#endif
 
   _Vector_base(size_t __n, const allocator_type& __a)
   : _M_impl(__a)
   { _M_create_storage(__n); }
 
 #if __cplusplus >= 201103L
+  _Vector_base(_Vector_base&&) = default;
+
+# if !_GLIBCXX_INLINE_VERSION
   _Vector_base(_Tp_alloc_type&& __a) noexcept
   : _M_impl(std::move(__a)) { }
 
-  _Vector_base(_Vector_base&& __x) noexcept
-  : _M_impl(std::move(__x._M_get_Tp_allocator()))
-  { this->_M_impl._M_swap_data(__x._M_impl); }
-
   _Vector_base(_Vector_base&& __x, const allocator_type& __a)
   : _M_impl(__a)
   {
@@ -278,6 +304,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	_M_create_storage(__n);
 	  }
   }
+# endif
+
+  _Vector_base(const allocator_type& __a, _Vector_base&& __x)
+  : _M_impl(_Tp_alloc_type(__a), std::move(__x._M_impl))
+

Re: C++ PATCH for c++/83714, ICE checking return from template

2018-01-18 Thread Jason Merrill

On Wed, Jan 17, 2018 at 4:31 PM, Jason Merrill  wrote:
> On Wed, Jan 17, 2018 at 3:46 PM, Paolo Carlini  
> wrote:
>> Hi Jason,
>>
>> On 17/01/2018 00:04, Jason Merrill wrote:
>>>
>>> Like my recent patch for 83186, we were missing a
>>> build_non_dependent_expr.
>>>
>>> Tested x86_64-pc-linux-gnu, applying to trunk.
>>
>> Lately I'm seeing (H.J. Lu too) a regression:
>>
>> FAIL: g++.dg/template/inherit4.C -std=c++11 (test for excess errors)
>> FAIL: g++.dg/template/inherit4.C -std=c++14 (test for excess errors)
>>
>> which seems related to this change of yours: if I comment out the new
>> build_non_dependent_expr call the test is accepted again. Could you please
>> have a look?
>
> Hmm, wonder why I didn't see that in my testing.  Checking a fix now.

We were failing to treat A::foo_ as an instantiation-dependent
SCOPE_REF, which it is because of the dependent base B.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 1579ec4777b69f2fd55ce1b3e425d6222f1f015a
Author: Jason Merrill 
Date:   Wed Jan 17 16:29:45 2018 -0500

Fix template/inherit4.C.

PR c++/83714
* search.c (any_dependent_bases_p): Handle null TREE_BINFO.
* pt.c (instantiation_dependent_scope_ref_p): True if
any_dependent_bases_p.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 3d7d45864c6..0296845a31b 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -23952,6 +23952,9 @@ instantiation_dependent_scope_ref_p (tree t)
 {
   if (DECL_P (TREE_OPERAND (t, 1))
   && CLASS_TYPE_P (TREE_OPERAND (t, 0))
+  /* A dependent base could make a member inaccessible in the current
+class.  */
+  && !any_dependent_bases_p ()
   && accessible_in_template_p (TREE_OPERAND (t, 0),
   TREE_OPERAND (t, 1)))
 return false;
diff --git a/gcc/cp/search.c b/gcc/cp/search.c
index 477e9aef105..920fc15468a 100644
--- a/gcc/cp/search.c
+++ b/gcc/cp/search.c
@@ -2612,6 +2612,13 @@ any_dependent_bases_p (tree type)
   if (!type || !CLASS_TYPE_P (type) || !processing_template_decl)
 return false;
 
+  /* If we haven't set TYPE_BINFO yet, we don't know anything about the bases.
+ Return false because in this situation we aren't actually looking up names
+ in the scope of the class, so it doesn't matter whether it has dependent
+ bases.  */
+  if (!TYPE_BINFO (type))
+return false;
+
   unsigned i;
   tree base_binfo;
   FOR_EACH_VEC_SAFE_ELT (BINFO_BASE_BINFOS (TYPE_BINFO (type)), i, base_binfo)
diff --git a/gcc/testsuite/g++.dg/torture/pr83619.C 
b/gcc/testsuite/g++.dg/torture/pr83619.C
index a5ca5372f18..5afd3ce0790 100644
--- a/gcc/testsuite/g++.dg/torture/pr83619.C
+++ b/gcc/testsuite/g++.dg/torture/pr83619.C
@@ -22,7 +22,7 @@ public:
 static void
 c (e *g)
 {
-  g->c ();
+  g->c (); // { dg-message "incomplete" }
 }
   };
 };

C++ PATCH to fix bogus error with constexpr and empty class (PR c++/81933)

2018-01-18 Thread Marek Polacek

The problem in this PR is that we get 
error: constexpr call flows off the end of the function
for a valid testcase, but not in C++17 mode.  That's because in C++17 we
execute:

4321   if (cxx_dialect >= cxx17 && !BINFO_VIRTUAL_P (binfo))
4322 {
4323   tree decl = build_base_field_1 (t, basetype, next_field);
4324   DECL_FIELD_OFFSET (decl) = BINFO_OFFSET (binfo);
4325   DECL_FIELD_BIT_OFFSET (decl) = bitsize_zero_node;
4326   SET_DECL_OFFSET_ALIGN (decl, BITS_PER_UNIT);
4327 }

which puts some additional fields into some classes.  This makes a difference
for split_nonconstant_init:

751   if (split_nonconstant_init_1 (dest, init))
752 init = NULL_TREE;

where split_nonconstant_init_1 is true so that means that the DECL_INITIAL will
be null.  Why is split_nonconstant_init_1 true?  Because it ends up calling
complete_ctor_at_level_p with

6114   return count_type_elements (type, true) == num_elts;

but the count_type_elements result differs between c++14/c++17 precisely
because of the fields added in build_base_field.  But we should only add those
fields when initializing aggregates with bases, which is not what's happening
in this testcase.

With DECL_INITIAL being null, we error here (result is NULL):

1702   result = *ctx->values->get (slot ? slot : res);
1703   if (result == NULL_TREE && !*non_constant_p)
1704 {
1705   if (!ctx->quiet)
1706 error ("% call flows off the end "
1707"of the function");
1708   *non_constant_p = true;
1709 }

I think the problem is that we're dealing with an empty class here, for which
we should just generate an empty ctor, as in few other spots.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-01-18  Marek Polacek  

PR c++/81933
* constexpr.c (cxx_eval_call_expression): When evaluating a call whose
result's type is an empty class, return empty constructor.

* g++.dg/cpp1y/constexpr-empty4.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 8984613aa41..0ff74b98437 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -1701,7 +1701,15 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
result = void_node;
  else
{
- result = *ctx->values->get (slot ? slot : res);
+ tree retval = slot ? slot : res;
+ /* If the class is empty, we aren't actually loading anything.  */
+ if (is_really_empty_class (TREE_TYPE (retval)))
+   {
+ result = build_constructor (TREE_TYPE (retval), NULL);
+ TREE_CONSTANT (result) = true;
+   }
+ else
+   result = *ctx->values->get (retval);
  if (result == NULL_TREE && !*non_constant_p)
{
  if (!ctx->quiet)
diff --git gcc/testsuite/g++.dg/cpp1y/constexpr-empty4.C 
gcc/testsuite/g++.dg/cpp1y/constexpr-empty4.C
index e69de29bb2d..059220f8268 100644
--- gcc/testsuite/g++.dg/cpp1y/constexpr-empty4.C
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-empty4.C
@@ -0,0 +1,51 @@
+// PR c++/81933
+// { dg-do compile { target c++14 } }
+
+namespace std {
+template  struct __decay_and_strip { typedef _Tp __type; };
+template  struct enable_if { typedef int type; };
+template  struct _Head_base {
+  constexpr _Head_base(_Head) {}
+};
+template  struct _Tuple_impl;
+template 
+struct _Tuple_impl<_Idx, _Head, _Tail...> : _Tuple_impl<1, _Tail...>, // { 
dg-warning "direct base" }
+_Head_base<_Head> {
+  typedef _Tuple_impl<1, _Tail...> _Inherited;
+  typedef _Head_base<_Head> _Base;
+  constexpr _Tuple_impl(_Head __head, _Tail... __tail)
+  : _Inherited(__tail...), _Base(__head) {}
+  _Tuple_impl(const _Tuple_impl &) = default;
+  _Tuple_impl(_Tuple_impl &&);
+};
+template 
+struct _Tuple_impl<_Idx, _Head> : _Head_base<_Head> {
+  typedef _Head_base<_Head> _Base;
+  constexpr _Tuple_impl(_Head __head) : _Base(__head) {}
+};
+template  struct _TC {
+  static constexpr bool _NotSameTuple() { return true; }
+};
+template  class tuple : _Tuple_impl<0, _Elements...> {
+  typedef _Tuple_impl<0, _Elements...> _Inherited;
+
+public:
+  template ::_NotSameTuple()>::type = false>
+  constexpr tuple(_UElements... __elements) : _Inherited(__elements...) {}
+  tuple(const tuple &) = default;
+};
+template 
+constexpr tuple::__type...>
+make_tuple(_Elements... __args) {
+  typedef tuple::__type...> 
__result_type;
+  return __result_type(__args...);
+}
+}
+struct any_udt {};
+template  constexpr auto flatten(Tuples... tuples) {
+  auto all = std::make_tuple(tuples...);
+  auto flat(all);
+  return flat;
+}
+constexpr auto fail = flatten(any_udt{}, any_udt{});

Marek

Re: [PATCH, rs6000] Add 128-bit support for vec_xl(), vec_xl_be(), vec_xst(), vec_xst_be() builtins.

2018-01-18 Thread Segher Boessenkool

Hi Carl,

On Thu, Jan 18, 2018 at 08:06:17AM -0800, Carl Love wrote:
> The following patch adds missing 128-bit support for the builtins
> vec_xl(), vec_xl_be(), vec_xst(), vec_xst_be().  It also includes a bug
> fix required for the new 128-bit arguments for the vec_xst_be() and
> vec_xl_be() builtins.  New test cases are also included.  This patch
> completes the tests of all the various load/store builtins that I have
> been working on.  This patch adds a torture test for the various load
> store tests to check that they work for -O0, -O1 and -O2 testing.  This
> was done as the work on the load/store builtins found numerous issues
> that were optimization level dependent which were fixed by the previous
> commits as well as this commit.

> Let me know if the patch looks OK or not.  Let me know if you want to
> include it in stage 4 or wait for the next release.  Thanks.

Since this is last I think we should include it now; it isn't likely
to regress anything, either.

But, some comments, and a real bug:

> 2018-01-18  Carl Love  
>   * gcc.target/powerpc/powerpc.exp: Add torture tests for
>builtins-4-runnable.c, builtins-6-runnable.c,
>builtins-5-p9-runnable.c, builtins-6-p9-runnable.c.

The indent here is wrong (stray space).

> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -15572,6 +15572,12 @@ altivec_expand_builtin (tree exp, rtx target, bool 
> *expandedp)
> unaligned-supporting store, so use a generic expander.  For
> little-endian, the exact element-reversing instruction must
> be used.  */
> +   case VSX_BUILTIN_ST_ELEMREV_V1TI:
> + {
> +   enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v1ti
> +   : CODE_FOR_vsx_st_elemrev_v1ti);
> +   return altivec_expand_stv_builtin (code, exp);
> +  }

Last line has a space too many.  Or, actually, all the rest have one
too few?

> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -1093,6 +1093,18 @@ (define_insn "vsx_ld_elemrev_v2di"
>    "lxvd2x %x0,%y1"
>    [(set_attr "type" "vecload")])
>  
> +(define_insn "vsx_ld_elemrev_v1ti"
> +  [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> +(vec_select:V1TI
> +   (match_operand:V1TI 1 "memory_operand" "Z")
> +   (parallel [(const_int 0)])))]
> +  "VECTOR_MEM_VSX_P (V1TImode) && !BYTES_BIG_ENDIAN"
> +{
> +  return "lxvd2x %x0,%y1; xxpermdi %x0,%x0,%x0,2";
> +}

Not ; but \; please.  No space after it.

We currently have exactly as many xxpermdi,2 as xxswapdi (147 each)
but the latter is more readable, please prefer that.

> +(define_insn "vsx_st_elemrev_v1ti"
> +  [(set (match_operand:V1TI 0 "memory_operand" "=Z")
> +(vec_select:V1TI
> +  (match_operand:V1TI 1 "vsx_register_operand" "wa")
> +  (parallel [(const_int 0)])))]
> +  "VECTOR_MEM_VSX_P (V2DImode) && !BYTES_BIG_ENDIAN"
> +{
> +  return "xxpermdi %x1,%x1,%x1,2; stxvd2x %x1,%y0";
> +}

This is wrong: it changes operand 1 but the RTL does not mention that.

> --- a/gcc/testsuite/gcc.target/powerpc/powerpc.exp
> +++ b/gcc/testsuite/gcc.target/powerpc/powerpc.exp
> @@ -49,4 +49,16 @@ gcc-dg-runtest [list $srcdir/$subdir/savres.c] "" $alti
>  
>  # All done.
>  torture-finish
> +
> +torture-init 
> +# Test load/store builtins at all optimizations
> +set-torture-options [list -O0 -O1 -O2]

All?  -O3 and -Os as well?  (And maybe more, but those are important).

> +gcc-dg-runtest [list $srcdir/$subdir/builtins-4-runnable.c \
> +  $srcdir/$subdir/builtins-6-runnable.c \
> +  $srcdir/$subdir/builtins-5-p9-runnable.c \
> +      $srcdir/$subdir/builtins-6-p9-runnable.c] "" 
> $DEFAULT_CFLAGS

Weird indent on this line.

Rest looks fine :-)


Segher

[C++ Patch] PR 83921 ("[7/8 Regression] GCC rejects constexpr initialization of empty aggregate")

2018-01-18 Thread Paolo Carlini


Hi,

I'm finishing testing on x86_64-linux the below - which anyway seems 
very unlikely to cause regressions because we aren't really stress 
testing the relevant checks in potential_constant_expression_1 much, if 
at all (surely stmtexpr19.C tests static).


Anyway, the issue is the following. In 239268 aka "Implement C++17 
constexpr lambda" Jason added some checks to 
potential_constant_expression_1 covering static, thread_local and 
uninitialized var declaration in constexpr function context. Then 
extended to constexpr context more generally in 249382 aka "constexpr 
and static var in statement-expression", with ext/stmtexpr19.C covering 
the static case. Now, it looks like the check for uninitialized vars in 
constexpr functions context is more correctly carried out by 
check_for_uninitialized_const_var instead, because the simple check in 
potential_constant_expression_1 as-is causes the regression pointed out 
by this bug. Thus the fix below which just restricts the check in 
potential_constant_expression_1, and the testcases, one for this bug 
proper, plus one, very similar to stmtexpr19.C, double checking that we 
are still diagnosing in the statement-expression context. I also 
verified under the debugger how for constexpr-83921.C we are actually 
running check_for_uninitialized_const_var on 'f' - which obviously passes.


Thanks, Paolo.




/cp
2018-01-18  Paolo Carlini  

PR c++/83921
* constexpr.c (potential_constant_expression_1): Do not handle
uninitialized vars in constexpr function here.

/testsuite
2018-01-18  Paolo Carlini  

PR c++/83921
* g++.dg/cpp1y/constexpr-83921.C: New.
* g++.dg/ext/stmtexpr20.C: Likewise.
Index: cp/constexpr.c
===
--- cp/constexpr.c  (revision 256865)
+++ cp/constexpr.c  (working copy)
@@ -5708,7 +5708,9 @@ potential_constant_expression_1 (tree t, bool want
  "% in % context", tmp);
  return false;
}
- else if (!DECL_NONTRIVIALLY_INITIALIZED_P (tmp))
+ else if (!DECL_NONTRIVIALLY_INITIALIZED_P (tmp)
+  /* Handled in check_for_uninitialized_const_var.  */
+  && !var_in_constexpr_fn (tmp))
{
  if (flags & tf_error)
error_at (DECL_SOURCE_LOCATION (tmp), "uninitialized "
Index: testsuite/g++.dg/cpp1y/constexpr-83921.C
===
--- testsuite/g++.dg/cpp1y/constexpr-83921.C(nonexistent)
+++ testsuite/g++.dg/cpp1y/constexpr-83921.C(working copy)
@@ -0,0 +1,7 @@
+// PR c++/83921
+// { dg-do compile { target c++14 } }
+
+struct Foo {};
+constexpr void test() {
+  Foo f;
+}
Index: testsuite/g++.dg/ext/stmtexpr20.C
===
--- testsuite/g++.dg/ext/stmtexpr20.C   (nonexistent)
+++ testsuite/g++.dg/ext/stmtexpr20.C   (working copy)
@@ -0,0 +1,15 @@
+// PR c++/83921
+// { dg-options "" }
+// { dg-do compile { target c++11 } }
+
+struct test { const int *addr; };
+
+const test* setup()
+{
+  static constexpr test atest =
+{
+  ({ int inner; &inner; }) // { dg-error "uninitialized" }
+};
+
+  return &atest;
+}

[PATCH] Fix pr83619.C (was Re: Fix ICE with profile info mismatch)

2018-01-18 Thread Jakub Jelinek

On Thu, Jan 18, 2018 at 04:59:01PM +0100, Jan Hubicka wrote:
> this patch ICE where the profile in cgraph mismatch profile in BB. This is 
> becuase
> of expansion of speculative devirtualization where we get some roundoff 
> issues.
> 
> Bootstrapped/regtested x86_64-linux, comitted.
> Honza
> 
>   PR ipa/83619
>   * g++.dg/torture/pr83619.C: New testcase.
>   * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Update edge
>   frequencies.
> --- testsuite/g++.dg/torture/pr83619.C(revision 0)
> +++ testsuite/g++.dg/torture/pr83619.C(working copy)
...

This testcase FAILs everywhere:
/.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C: In static member function 
'static void i::j<  >::c(e*)':
/.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:25:8: warning: invalid use of 
incomplete type 'class e'
/.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:8:7: note: forward declaration 
of 'class e'
FAIL: g++.dg/torture/pr83619.C   -O0  (test for excess errors)
Excess errors:
/.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:25:8: warning: invalid use of 
incomplete type 'class e'

The following patch tweaks it so that it doesn't emit the warning, yet still
ICEs before your cgraph.c change and PASSes after it.
Tested on x86_64-linux and i686-linux, ok for trunk?

2018-01-18  Jakub Jelinek  

PR ipa/83619
* g++.dg/torture/pr83619.C (e): Define before first use instead of
forward declaration.

--- gcc/testsuite/g++.dg/torture/pr83619.C.jj   2018-01-18 21:11:54.865206861 
+0100
+++ gcc/testsuite/g++.dg/torture/pr83619.C  2018-01-18 23:51:32.693482293 
+0100
@@ -5,7 +5,9 @@ class d
 public:
   virtual unsigned c ();
 };
-class e;
+class e : public d
+{
+};
 class i
 {
   void h ();
@@ -33,9 +35,6 @@ public:
   l (int);
   k *operator-> ();
 };
-class e : public d
-{
-};
 class m final : e
 {
   unsigned c ();

Jakub

[C++ PATCH] Fix ICE in potential_constant_expression_1 (PR c++/83918)

2018-01-18 Thread Jakub Jelinek

Hi!

Before location wrappers were introduced, the potential_constant_expression_1
assumption that a VIEW_CONVERT_EXPR must have non-NULL type was right,
if it is a type dependent reinterpret cast, it would be
REINTERPRET_CAST_EXPR instead.

Location wrappers around decls can have NULL type, if the decl they wrap
also has NULL type.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?  Alternatively it could if (TREE_TYPE (t) && ...

On a related note, I wonder if the
return (RECUR (from, TREE_CODE (t) != VIEW_CONVERT_EXPR));
a few lines below is correct, VCE isn't always an lvalue, only if it is used
in non-rvalue contexts.  So, shouldn't it be instead:
return (RECUR (from, (TREE_CODE (t) == VIEW_CONVERT_EXPR
  ? want_rval : rval)));
Don't have a testcase where it would matter though.

2018-01-18  Jakub Jelinek  

PR c++/83918
* constexpr.c (potential_constant_expression_1)
: Don't check for INTEGER_CST to pointer
type conversion for location wrappers.

* g++.dg/cpp1z/pr83918.C: New test.

--- gcc/cp/constexpr.c.jj   2018-01-17 22:00:13.624228171 +0100
+++ gcc/cp/constexpr.c  2018-01-18 10:38:07.357469094 +0100
@@ -5385,7 +5385,8 @@ potential_constant_expression_1 (tree t,
 may change to something more specific to type-punning (DR 1312).  */
   {
 tree from = TREE_OPERAND (t, 0);
-   if (POINTER_TYPE_P (TREE_TYPE (t))
+   if (!location_wrapper_p (t)
+   && POINTER_TYPE_P (TREE_TYPE (t))
&& TREE_CODE (from) == INTEGER_CST
&& !integer_zerop (from))
  {
--- gcc/testsuite/g++.dg/cpp1z/pr83918.C.jj 2018-01-18 10:46:04.405439296 
+0100
+++ gcc/testsuite/g++.dg/cpp1z/pr83918.C2018-01-18 10:45:42.934442290 
+0100
@@ -0,0 +1,32 @@
+// PR c++/83918
+// { dg-do compile }
+// { dg-options "-std=c++17" }
+
+constexpr unsigned
+foo (unsigned x, unsigned y)
+{
+  return x > y ? x : y;
+}
+
+template  struct A;
+template  struct B;
+template 
+struct A , B >
+{
+  enum : unsigned
+  {
+u = foo (sizeof (S), sizeof (U)),
+v = A , B >::w,
+w = foo (u, v)
+  };
+};
+
+template <>
+struct A , B <>>
+{
+  enum : unsigned { w = 0 };
+};
+
+constexpr static const auto v { A ,
+  B <9,8,7,6,5,4,3,2,1>>::w };
+static_assert (v == sizeof (int));

 
Jakub

[C++ PATCH] Avoid spurious -Wignored-qualifiers warning on artificial C++17 P0138R2 cast (PR c++/83919)

2018-01-18 Thread Jakub Jelinek

Hi!

These casts for P0138R2 isn't something the user typed in their code,
so diagnosing -Wignored-qualifiers on these looks wrong.
-Wuseless-cast has been handled similarly in the past already.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-01-18  Jakub Jelinek  

PR c++/83919
* typeck.c (convert_for_assignment): Suppress warn_ignored_qualifiers
for direct enum init.
* decl.c (reshape_init): Likewise.

* g++.dg/cpp0x/pr83919.C: New test.

--- gcc/cp/typeck.c.jj  2018-01-17 22:00:06.863228592 +0100
+++ gcc/cp/typeck.c 2018-01-18 10:58:39.976333499 +0100
@@ -8689,6 +8689,7 @@ convert_for_assignment (tree type, tree
   if (check_narrowing (ENUM_UNDERLYING_TYPE (type), elt, complain))
{
  warning_sentinel w (warn_useless_cast);
+ warning_sentinel w2 (warn_ignored_qualifiers);
  rhs = cp_build_c_cast (type, elt, complain);
}
   else
--- gcc/cp/decl.c.jj2018-01-18 00:41:40.564015137 +0100
+++ gcc/cp/decl.c   2018-01-18 11:10:10.579713182 +0100
@@ -6091,6 +6091,7 @@ reshape_init (tree type, tree init, tsub
   if (check_narrowing (ENUM_UNDERLYING_TYPE (type), elt, complain))
{
  warning_sentinel w (warn_useless_cast);
+ warning_sentinel w2 (warn_ignored_qualifiers);
  return cp_build_c_cast (type, elt, tf_warning_or_error);
}
   else
--- gcc/testsuite/g++.dg/cpp0x/pr83919.C.jj 2018-01-18 11:03:08.048295967 
+0100
+++ gcc/testsuite/g++.dg/cpp0x/pr83919.C2018-01-18 10:59:57.465322655 
+0100
@@ -0,0 +1,10 @@
+// PR c++/83919
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wignored-qualifiers" }
+
+enum class Conf;
+struct foo
+{
+  foo (const Conf& conf) : x{conf} {}  // { dg-bogus "type qualifiers ignored 
on cast result type" }
+  const Conf x;
+};

Jakub

[PATCH] Fix unwind info in x86 interrupt functions (PR debug/83728)

2018-01-18 Thread Jakub Jelinek

Hi!

Last summer i386 INCOMING_FRAME_SP_OFFSET macro has been changed, so that it
is one word for most of the functions (as previously always), but 2 words
for functions with interrupt attribute.
Unfortunately this breaks the unwind info, as can be seen on the
gcc.dg/guality/pr68037-1.c testcase.  Our infrastructure assumes we have
just one set of cfi instructions in the CIE (we can have multiple CIEs, but
only for various flags, not for different cfi instructions), so if
the first function being assembled is an interrupt function, we start
with the 2*word initial offset in the CIE we emit (when
-fno-dwarf2-cfi-asm), or just assume the CIE has that (with -fdwarf2-cfi-asm
when it is GAS that emits it) even when it doesn't.  For -fdwarf2-cfi-asm
the effect is that until first CFA adjustment the offset is off by a word,
after that correct.  In both cases all CFA offsets are off by a word in
non-interrupt functions.
Or, if the first function doesn't have interrupt attribute, CFI in
non-interrupt functions is correct, but interrupt functions are wrong.

I've looked around and it seems stormy16 is the only other target that
bases the INCOMING_FRAME_SP_OFFSET value not just on ABI changing switches,
but on properties of the current function.

For these 2 targets the following patch introduces another macro,
DEFAULT_INCOMING_FRAME_SP_OFFSET, that is meant to be the same for all
functions of the same ABI and where GAS supports .cfi_startproc it should
also match what GAS does.  For the hopefully minority of functions that
need something else (i.e. interrupt functions) we emit a CFI instruction
right at the start of the function.

Bootstrapped/regtested on x86_64-linux and i686-linux, fixes the pr68037-1.c
FAILs, ok for trunk?

2018-01-18  Jakub Jelinek  

PR debug/81570
PR debug/83728
* dwarf2cfi.c (DEFAULT_INCOMING_FRAME_SP_OFFSET): Define to
INCOMING_FRAME_SP_OFFSET if not defined.
(scan_trace): Add ENTRY argument.  If true and
DEFAULT_INCOMING_FRAME_SP_OFFSET != INCOMING_FRAME_SP_OFFSET,
emit a note to adjust the CFA offset.
(create_cfi_notes): Adjust scan_trace callers.
(create_cie_data): Use DEFAULT_INCOMING_FRAME_SP_OFFSET rather than
INCOMING_FRAME_SP_OFFSET in the CIE.
* config/i386/i386.h (DEFAULT_INCOMING_FRAME_SP_OFFSET): Define.
* config/stormy16/stormy16.h (DEFAULT_INCOMING_FRAME_SP_OFFSET):
Likewise.
* doc/tm.texi.in (DEFAULT_INCOMING_FRAME_SP_OFFSET): Document.
* doc/tm.texi: Regenerated.

--- gcc/dwarf2cfi.c.jj  2018-01-04 00:43:17.632703231 +0100
+++ gcc/dwarf2cfi.c 2018-01-18 15:57:21.165911020 +0100
@@ -52,6 +52,10 @@ along with GCC; see the file COPYING3.
 #ifndef INCOMING_RETURN_ADDR_RTX
 #define INCOMING_RETURN_ADDR_RTX  (gcc_unreachable (), NULL_RTX)
 #endif
+
+#ifndef DEFAULT_INCOMING_FRAME_SP_OFFSET
+#define DEFAULT_INCOMING_FRAME_SP_OFFSET INCOMING_FRAME_SP_OFFSET
+#endif
 
 /* A collected description of an entire row of the abstract CFI table.  */
 struct GTY(()) dw_cfi_row
@@ -2484,7 +2488,7 @@ scan_insn_after (rtx_insn *insn)
instructions therein.  */
 
 static void
-scan_trace (dw_trace_info *trace)
+scan_trace (dw_trace_info *trace, bool entry)
 {
   rtx_insn *prev, *insn = trace->head;
   dw_cfa_location this_cfa;
@@ -2503,6 +2507,17 @@ scan_trace (dw_trace_info *trace)
   this_cfa = cur_row->cfa;
   cur_cfa = &this_cfa;
 
+  /* If the current function starts with a non-standard incoming frame
+ sp offset, emit a note before the first instruction.  */
+  if (entry
+  && DEFAULT_INCOMING_FRAME_SP_OFFSET != INCOMING_FRAME_SP_OFFSET)
+{
+  add_cfi_insn = insn;
+  gcc_assert (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_DELETED);
+  this_cfa.offset = INCOMING_FRAME_SP_OFFSET;
+  def_cfa_1 (&this_cfa);
+}
+
   for (prev = insn, insn = NEXT_INSN (insn);
insn;
prev = insn, insn = NEXT_INSN (insn))
@@ -2671,12 +2686,12 @@ create_cfi_notes (void)
 
   /* Always begin at the entry trace.  */
   ti = &trace_info[0];
-  scan_trace (ti);
+  scan_trace (ti, true);
 
   while (!trace_work_list.is_empty ())
 {
   ti = trace_work_list.pop ();
-  scan_trace (ti);
+  scan_trace (ti, false);
 }
 
   queued_reg_saves.release ();
@@ -2980,7 +2995,12 @@ create_cie_data (void)
   /* On entry, the Canonical Frame Address is at SP.  */
   memset (&loc, 0, sizeof (loc));
   loc.reg = dw_stack_pointer_regnum;
-  loc.offset = INCOMING_FRAME_SP_OFFSET;
+  /* create_cie_data is called just once per TU, and when using .cfi_startproc
+ is even done by the assembler rather than the compiler.  If the target
+ has different incoming frame sp offsets depending on what kind of
+ function it is, use a single constant offset for the target and
+ if needed, adjust before the first instruction in insn stream.  */
+  loc.offset = DEFAULT_INCOMING_FRAME_SP_OFFSET;
   def_cfa_1 (&loc);
 
   if (targetm.debug_unwind_i

[PATCH] -Warray-bounds: Fix false positive in some "switch" stmts (PR tree-optimization/83510)

2018-01-18 Thread David Malcolm

PR tree-optimization/83510 reports that r255649 (for
PR tree-optimization/83312) introduced a false positive for
-Warray-bounds for array accesses within certain switch statements:
those for which value-ranges allow more than one case to be reachable,
but for which one or more of the VR-unreachable cases contain
out-of-range array accesses.

In the reproducer, after the switch in f is inlined into g, we have 3 cases
for the switch (case 9, case 10-19, and default), within a loop that
ranges from 0..9.

With both the old and new code, vr_values::simplify_switch_using_ranges clears
the EDGE_EXECUTABLE flag on the edge to the "case 10-19" block.  This
happens during the dom walk within the substitute_and_fold_engine.

With the old code, the clearing of that EDGE_EXECUTABLE flag led to the
  /* Skip blocks that were found to be unreachable.  */
code in the old implementation of vrp_prop::check_all_array_refs skipping
the "case 10-19" block.

With the new code, we have a second dom walk, and that dom_walker's ctor
sets all edges to be EDGE_EXECUTABLE, losing that information.

Then, dom_walker::before_dom_children (here, the subclass'
check_array_bounds_dom_walker::before_dom_children) can return one edge, if
there's a unique successor edge, and dom_walker::walk filters the dom walk
to just that edge.

Here we have two VR-valid edges (case 9 and default), and an VR-invalid
successor edge (case 10-19).  There's no *unique* valid successor edge,
and hence taken_edge is NULL, and the filtering in dom_walker::walk
doesn't fire.

Hence we've lost the filtering of the "case 10-19" BB, hence the false
positive.

The issue is that we have two dom walks: first within vr_values'
substitute_and_fold_dom_walker (which has skip_unreachable_blocks == false),
then another within vrp_prop::check_all_array_refs (with
skip_unreachable_blocks == true).

Each has different "knowledge" about ruling out edges due to value-ranges,
but we aren't combining that information.  The former "knows" about
out-edges at a particular control construct (e.g. at a switch), the latter
"knows" about dominance, but only about unique successors (hence the
problem when two out of three switch cases are valid).

This patch combines the information by preserving the EDGE_EXECUTABLE
flags from the first dom walk, and using it in the second dom walk,
potentially rejecting additional edges.

Doing so fixes the false positive.

I attempted an alternative fix, merging the two dom walks into one, but
that led to crashes in identify_jump_threads, so I went with this, as
a less invasive fix.

Successfully bootstrapped®rtested on x86_64-pc-linux-gnu.
OK for trunk?

gcc/ChangeLog:
PR tree-optimization/83510
* domwalk.c (set_all_edges_as_executable): New function.
(dom_walker::dom_walker): Add new param "preserve_flags".  Move
setup of edge flags to set_all_edges_as_executable and guard it
with !preserve_flags.
* domwalk.h (dom_walker::dom_walker): Add new param
"preserve_flags", defaulting to false.
(set_all_edges_as_executable): New decl.
* tree-vrp.c
(check_array_bounds_dom_walker::check_array_bounds_dom_walker):
Pass "true" for new param of dom_walker's ctor.
(vrp_prop::vrp_finalize): Call set_all_edges_as_executable
if check_all_array_refs will be called.

gcc/testsuite/ChangeLog:
PR tree-optimization/83510
* gcc.c-torture/compile/pr83510.c: New test case.
---
 gcc/domwalk.c |  30 +++--
 gcc/domwalk.h |  11 ++
 gcc/testsuite/gcc.c-torture/compile/pr83510.c | 172 ++
 gcc/tree-vrp.c|  21 +++-
 4 files changed, 224 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr83510.c

diff --git a/gcc/domwalk.c b/gcc/domwalk.c
index 102a293..988ff71 100644
--- a/gcc/domwalk.c
+++ b/gcc/domwalk.c
@@ -169,12 +169,29 @@ sort_bbs_postorder (basic_block *bbs, int n)
 qsort (bbs, n, sizeof *bbs, cmp_bb_postorder);
 }
 
+/* Set EDGE_EXECUTABLE on every edge within FN's CFG.  */
+
+void
+set_all_edges_as_executable (function *fn)
+{
+  basic_block bb;
+  FOR_ALL_BB_FN (bb, fn)
+{
+  edge_iterator ei;
+  edge e;
+  FOR_EACH_EDGE (e, ei, bb->succs)
+   e->flags |= EDGE_EXECUTABLE;
+}
+}
+
 /* Constructor for a dom walker.
 
If SKIP_UNREACHBLE_BLOCKS is true, then we need to set
-   EDGE_EXECUTABLE on every edge in the CFG. */
+   EDGE_EXECUTABLE on every edge in the CFG, unless
+   PRESERVE_FLAGS is true. */
 dom_walker::dom_walker (cdi_direction direction,
bool skip_unreachable_blocks,
+   bool preserve_flags,
int *bb_index_to_rpo)
   : m_dom_direction (direction),
 m_skip_unreachable_blocks (skip_unreachable_blocks),
@@ -200,14 +217,9 @@ dom_walker::dom_walker (cdi_direction direction,
   if (!m_skip_u

[C++ PATCH] Fix ICE in joust with -Wconversion (PR c++/81167)

2018-01-18 Thread Jakub Jelinek

Hi!

As mentioned in the PR, we ICE on this testcase because
w->fn is a conversion operator, w->convs[0]->type is a reference
to a class type, but because that conversion is ck_ref_bind,
source_type looks through it and finds ck_identity with
the class type.  Then we because w->fn is not a constructor
do source = TREE_TYPE (source);, assuming we got a pointer
type like on the other 5 testcases in check-c++-all with -Wconversion
that cover this code, so source is NULL and we die in calling
warning with bogus arguments.

The following patch fixes it by only using TREE_TYPE on pointer/reference
types.  I must say I don't understand this fully, but conversion
operators should be used on class types, so that is what we are looking for
with source, right?  No idea about the ! DECL_CONSTRUCTOR_P (w->fn)
check though and nothing in the testsuite covers that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-01-18  Jakub Jelinek  

PR c++/81167
* call.c (joust): Only use TREE_TYPE (source) if source is
a POINTER_TYPE_P.

* g++.dg/cpp0x/pr81167.C: New test.

--- gcc/cp/call.c.jj2018-01-17 22:00:13.626228171 +0100
+++ gcc/cp/call.c   2018-01-18 21:01:51.596136303 +0100
@@ -10082,7 +10082,7 @@ joust (struct z_candidate *cand1, struct
   else if (warn)
{
  tree source = source_type (w->convs[0]);
- if (! DECL_CONSTRUCTOR_P (w->fn))
+ if (! DECL_CONSTRUCTOR_P (w->fn) && POINTER_TYPE_P (source))
source = TREE_TYPE (source);
  if (warning (OPT_Wconversion, "choosing %qD over %qD", w->fn, l->fn)
  && warning (OPT_Wconversion, "  for conversion from %qH to %qI",
--- gcc/testsuite/g++.dg/cpp0x/pr81167.C.jj 2018-01-18 21:07:48.723189500 
+0100
+++ gcc/testsuite/g++.dg/cpp0x/pr81167.C2018-01-18 21:07:18.610187374 
+0100
@@ -0,0 +1,24 @@
+// PR c++/81167
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wconversion" }
+
+struct bar;
+
+struct foo
+{
+  foo () {}
+  foo (const bar &) {}
+};
+
+struct bar
+{
+  operator foo () && { return foo (); }
+};
+
+void test ()
+{
+  foo f = bar ();
+// { dg-warning "choosing 'bar::operator foo\\(\\) &&' over 'foo::foo\\(const 
bar&\\)'" "" { target *-*-* } .-1 }
+// { dg-warning "for conversion from 'bar' to 'foo'" "" { target *-*-* } .-2 }
+// { dg-message "because conversion sequence for the argument is better" "" { 
target *-*-* } .-3 }
+}

Jakub

Re: [PATCH] Fix pr83619.C (was Re: Fix ICE with profile info mismatch)

2018-01-18 Thread Jeff Law

On 01/18/2018 03:57 PM, Jakub Jelinek wrote:
> On Thu, Jan 18, 2018 at 04:59:01PM +0100, Jan Hubicka wrote:
>> this patch ICE where the profile in cgraph mismatch profile in BB. This is 
>> becuase
>> of expansion of speculative devirtualization where we get some roundoff 
>> issues.
>>
>> Bootstrapped/regtested x86_64-linux, comitted.
>> Honza
>>
>>  PR ipa/83619
>>  * g++.dg/torture/pr83619.C: New testcase.
>>  * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Update edge
>>  frequencies.
>> --- testsuite/g++.dg/torture/pr83619.C   (revision 0)
>> +++ testsuite/g++.dg/torture/pr83619.C   (working copy)
> ...
> 
> This testcase FAILs everywhere:
> /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C: In static member function 
> 'static void i::j<  >::c(e*)':
> /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:25:8: warning: invalid use of 
> incomplete type 'class e'
> /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:8:7: note: forward 
> declaration of 'class e'
> FAIL: g++.dg/torture/pr83619.C   -O0  (test for excess errors)
> Excess errors:
> /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:25:8: warning: invalid use of 
> incomplete type 'class e'
> 
> The following patch tweaks it so that it doesn't emit the warning, yet still
> ICEs before your cgraph.c change and PASSes after it.
> Tested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2018-01-18  Jakub Jelinek  
> 
>   PR ipa/83619
>   * g++.dg/torture/pr83619.C (e): Define before first use instead of
>   forward declaration.
OK.

Note I think someone has filed a regression against this failure.  SO
you might want to check the recent P3s to find and close it out.

jeff

Re: [PATCH,PTX] Add support for CUDA 9

2018-01-18 Thread Cesar Philippidis

On 12/19/2017 04:39 PM, Tom de Vries wrote:
> On 12/20/2017 12:25 AM, Cesar Philippidis wrote:
>> og7-ptx-cuda9.diff
>>
>>
>> 2017-12-19  Cesar Philippidis  
>>
>> gcc/
>> * config/nvptx/nvptx.c (output_init_frag): Don't use generic address
>> spaces for function labels.
>>
>> gcc/testsuite/
>> * gcc.target/nvptx/indirect_call.c: New test.
>>
>>
>> diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
>> index dfb27ef..a7b4c09 100644
>> --- a/gcc/config/nvptx/nvptx.c
>> +++ b/gcc/config/nvptx/nvptx.c
>> @@ -1894,9 +1894,15 @@ output_init_frag (rtx sym)
>>         if (sym)
>>   {
>> -  fprintf (asm_out_file, "generic(");
>> +  bool function = SYMBOL_REF_DECL (sym)
>> +    && (TREE_CODE (SYMBOL_REF_DECL (sym)) == FUNCTION_DECL);
> 
> Please indent using parentheses like this:
> 
> bool function = (SYMBOL_REF_DECL (sym)
>  && (TREE_CODE (SYMBOL_REF_DECL (sym)) == ...));

Done.

>> +  if (!function)
>> +    fprintf (asm_out_file, "generic(");
>>     output_address (VOIDmode, sym);
>> -  fprintf (asm_out_file, val ? ") + " : ")");
>> +  if (!function)
>> +    fprintf (asm_out_file, val ? ") + " : ")");
>> +  else if (val)
>> +    fprintf (asm_out_file, " + ");
> 
> 
> Please use:
> 
>   if (!function)
> fprintf (asm_out_file, ")");
>   if (val)
> fprintf (asm_out_file, " + ");

Done.

>>   }
>>       if (!sym || val)
>> diff --git a/gcc/testsuite/gcc.target/nvptx/indirect_call.c
>> b/gcc/testsuite/gcc.target/nvptx/indirect_call.c
>> new file mode 100644
>> index 000..39992a7
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/nvptx/indirect_call.c
>> @@ -0,0 +1,19 @@
>> +/* { dg-options "-O2 -msoft-stack" } */
>> +/* { dg-do run } */
>> +
>> +int
>> +f1 (int a)
>> +{
>> +  return a + 1;
>> +}
>> +
>> +int (*f2)(int) = f1;
>> +
>> +int
>> +main ()
>> +{
>> +  if (f2 (100) != 101)
>> +    __builtin_abort();
>> +
>> +  return 0;
>> +}
>>

Here's the updated patch with the changes that you requested. There are
no new regressions in trunk. I tested it on my desktop running driver
387.34 on a Pascal GPU.

Is this OK for trunk?

Cesar
2018-01-18  Cesar Philippidis  

	gcc/
	* config/nvptx/nvptx.c (output_init_frag): Don't use generic address
	spaces for function labels.

	gcc/testsuite/
	* gcc.target/nvptx/indirect_call.c: New test.

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 86fc13f4fc0..4cb87c8ad07 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -1899,9 +1899,15 @@ output_init_frag (rtx sym)
   
   if (sym)
 {
-  fprintf (asm_out_file, "generic(");
+  bool function = (SYMBOL_REF_DECL (sym)
+		   && (TREE_CODE (SYMBOL_REF_DECL (sym)) == FUNCTION_DECL));
+  if (!function)
+	fprintf (asm_out_file, "generic(");
   output_address (VOIDmode, sym);
-  fprintf (asm_out_file, val ? ") + " : ")");
+  if (!function)
+	fprintf (asm_out_file, ")");
+  if (val)
+	fprintf (asm_out_file, " + ");
 }
 
   if (!sym || val)
diff --git a/gcc/testsuite/gcc.target/nvptx/indirect_call.c b/gcc/testsuite/gcc.target/nvptx/indirect_call.c
new file mode 100644
index 000..39992a7137b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/indirect_call.c
@@ -0,0 +1,19 @@
+/* { dg-options "-O2 -msoft-stack" } */
+/* { dg-do run } */
+
+int
+f1 (int a)
+{
+  return a + 1;
+}
+  
+int (*f2)(int) = f1;
+
+int
+main ()
+{
+  if (f2 (100) != 101)
+__builtin_abort();
+
+  return 0;
+}

Re: Go patch committed: Enable escape analysis for runtime

2018-01-18 Thread Ian Lance Taylor

On Thu, Jan 18, 2018 at 11:53 AM, Rainer Orth
 wrote:
>
>> This patch to the Go frontend by Cherry Zhang enables escape analysis
>> for the runtime package in the Go frontend.  The runtime package was
>> hard-coded non-escape, and the escape analysis was not run for the
>> runtime package.  This patch removes the hard-code, and lets the
>> escape analysis decide.  It is not allowed for local variables and
>> closures in the runtime to be heap allocated. This patch adds the
>> check that make sure that they indeed do not escape.
>>
>> The escape analysis is always run when compiling the runtime now.
>>
>> Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
>> to mainline.
>
> this broke Solaris bootstrap:
>
> /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:182:57: 
> error: n escapes to heap, not allowed in runtime
>   if port_getn(portfd, &events[0], uint32(len(events)), &n, wait) < 0 {
>  ^
> /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:176:11: 
> error: zero escapes to heap, not allowed in runtime
>wait = &zero
>^
> /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:174:6: error: 
> .runtime.zero escapes to heap, not allowed in runtime
>   var zero timespec
>   ^
> /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:179:6: error: 
> .runtime.events escapes to heap, not allowed in runtime
>   var events [128]portevent
>   ^
> /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:181:6: error: 
> .runtime.n escapes to heap, not allowed in runtime
>   var n uint32 = 1
>   ^
>
> Fixed as follows.  i386-pc-solaris2.11 and sparc-sun-solaris2.11 builds
> have completed, make check still running.

Thanks, patch committed.

Ian

libgo patch commited: Recognize AIX objects and archives

2018-01-18 Thread Ian Lance Taylor

This patch by Tony Reix adds code to the go tool to recognize AIX
objects and archives.  Bootstrapped on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 256872)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-6b4f1391e2cd637fa5ed797bb3ab43d9ede61997
+cf33aedb66b5121708f5640443c1336080a47fe5
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/cmd/go/internal/work/exec.go
===
--- libgo/go/cmd/go/internal/work/exec.go   (revision 256794)
+++ libgo/go/cmd/go/internal/work/exec.go   (working copy)
@@ -1282,6 +1282,7 @@ func (b *Builder) cover(a *Action, dst,
 
 var objectMagic = [][]byte{
{'!', '<', 'a', 'r', 'c', 'h', '>', '\n'}, // Package archive
+   {'<', 'b', 'i', 'g', 'a', 'f', '>', '\n'}, // Package AIX big archive
{'\x7F', 'E', 'L', 'F'},   // ELF
{0xFE, 0xED, 0xFA, 0xCE},  // Mach-O big-endian 32-bit
{0xFE, 0xED, 0xFA, 0xCF},  // Mach-O big-endian 64-bit
@@ -1291,6 +1292,8 @@ var objectMagic = [][]byte{
{0x00, 0x00, 0x01, 0xEB},  // Plan 9 i386
{0x00, 0x00, 0x8a, 0x97},  // Plan 9 amd64
{0x00, 0x00, 0x06, 0x47},  // Plan 9 arm
+   {0x01, 0xDF},  // XCOFF32
+   {0x01, 0xF7},  // XCOFF64
 }
 
 func isObject(s string) bool {

libgo patch committed: Add go:noescape for AIX

2018-01-18 Thread Ian Lance Taylor

This patch by Tony Reix adds some go:noescape declarations for AIX.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 256873)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-cf33aedb66b5121708f5640443c1336080a47fe5
+8195b62d353026256037fa44409c4ad95762e6b7
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/netpoll_aix.go
===
--- libgo/go/runtime/netpoll_aix.go (revision 256593)
+++ libgo/go/runtime/netpoll_aix.go (working copy)
@@ -34,12 +34,15 @@ const _PS_DELETE = 0x2
 //extern pollset_create
 func pollset_create(maxfd int32) pollset_t
 
+//go:noescape
 //extern pollset_ctl
 func pollset_ctl(ps pollset_t, pollctl_array *poll_ctl, array_length int32) 
int32
 
+//go:noescape
 //extern pollset_poll
 func pollset_poll(ps pollset_t, polldata_array *pollfd, array_length int32, 
timeout int32) int32
 
+//go:noescape
 //extern pipe
 func libc_pipe(fd *int32) int32
 
Index: libgo/go/runtime/stubs2.go
===
--- libgo/go/runtime/stubs2.go  (revision 256593)
+++ libgo/go/runtime/stubs2.go  (working copy)
@@ -10,6 +10,7 @@ package runtime
 
 import "unsafe"
 
+//go:noescape
 func read(fd int32, p unsafe.Pointer, n int32) int32
 func closefd(fd int32) int32

Re: Go patch committed: Update to Go1.10beta1

2018-01-18 Thread Ian Lance Taylor

On Thu, Jan 11, 2018 at 8:56 AM, Rainer Orth
 wrote:
>
>>> This patch updates libgo to the Go1.10beta1 release.  The final Go
>>> 1.10 release is expected around February 1, so it's not clear how the
>>> release timing is going to work with GCC 8.  In any case this updates
>>> GCC to something pretty close to the final Go 1.10 release.
>>>
>>> A few changes to the frontend were required to match changes in the
>>> runtime map code, and to handle some no-writebarrier cases in the
>>> runtime package.
>>>
>>> As usual with these updates the complete patch is too large to include
>>> in this e-mail message.  I've just included the changes to
>>> gccgo-specific code.
>>>
>>> Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
>>> to mainline.
>>
>> the patch broke Solaris bootstrap:
>>
>> mv: cannot stat 'os/signal/internal/pty.s-gox.tmp': No such file or directory
>> make[4]: *** [Makefile:3348: os/signal/internal/pty.s-gox] Error 1
>> make[4]: *** Waiting for unfinished jobs
>>
>> Fixed trivially as follows, which allowed the build to complete.  make
>> check still running...
>
> there's another build failure, only affecting Solaris 10:
>
> /vol/gcc/src/hg/trunk/local/libgo/go/os/user/cgo_lookup_unix.go:274:20: 
> error: integer constant overflow
>   sp.Pw_uid = 1<<32 - 2
> ^
> /vol/gcc/src/hg/trunk/local/libgo/go/os/user/cgo_lookup_unix.go:275:20: 
> error: integer constant overflow
>   sp.Pw_gid = 1<<32 - 3
> ^
> make[8]: *** [Makefile:3313: os/user.lo] Error 1
>
> sysinfo.go has
>
> type Passwd struct { Pw_name *int8; Pw_passwd *int8; Pw_uid int32; Pw_gid 
> int32; Pw_age *int8; Pw_comment *int8; Pw_gecos *int8; Pw_dir *int8; Pw_shell 
> *int8; }
>
> while on S11 there is
>
> type Passwd struct { Pw_name *int8; Pw_passwd *int8; Pw_uid uint32; Pw_gid 
> uint32; Pw_age *int8; Pw_comment *int8; Pw_gecos *int8; Pw_dir *int8; 
> Pw_shell *int8; }

I fixed this by clobbering the types used in Go, as follows.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu, committed to
mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 256874)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-8195b62d353026256037fa44409c4ad95762e6b7
+bce8720d4eb662f31026e9c7be6ce4d0aeb4ae3b
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/mksysinfo.sh
===
--- libgo/mksysinfo.sh  (revision 256593)
+++ libgo/mksysinfo.sh  (working copy)
@@ -685,9 +685,13 @@ if ! grep "const EAI_OVERFLOW " ${OUT} >
 fi
 
 # The passwd struct.
+# Force uid and gid from int32 to uint32 for consistency; they are
+# int32 on Solaris 10 but uint32 everywhere else including Solaris 11.
 grep '^type _passwd ' gen-sysinfo.go | \
 sed -e 's/_passwd/Passwd/' \
   -e 's/ pw_/ Pw_/g' \
+  -e 's/ Pw_uid int32/ Pw_uid uint32/' \
+  -e 's/ Pw_gid int32/ Pw_gid uint32/' \
 >> ${OUT}
 
 # The group struct.

Re: Go patch committed: Update to Go1.10beta1

2018-01-18 Thread Ian Lance Taylor

On Thu, Jan 18, 2018 at 12:02 AM, Rainer Orth
 wrote:
>
>> On Thu, Jan 11, 2018 at 1:46 AM, Rainer Orth
>>  wrote:
>>>
 On Wed, Jan 10, 2018 at 5:42 AM, Ian Lance Taylor  wrote:
>
> Whoops, there's a bug on big-endian 32-bit systems.  I'm testing
> https://golang.org/cl/87135.

 Committed as follows.
>>>
>>> thanks, that fixed quite a lot of the failures.
>>>
>>> However, many others remain, too many to report here.  I've filed PR
>>> go/83787 to capture those.
>>
>> Thanks.  I found the problem: there is a new function makechan that
>> takes a size argument of type int, and the old makechan, that took
>> int64, is now makechan64.  Since the size argument was the last one,
>> this worked fine except on 32-bit big-endian systems.  Fixed with this
>> patch.  Bootstrapped and tested on x86_64-pc-linux-gnu and
>> sparc-sun-solaris2.12.  Committed to mainline.
>
> great, thanks for fixing this.
>
> One related question: I mentioned in passing in the PR that four go test
> invocations in gotools would hang indefinitely.  I noticed that they
> weren't called with -test.timeout, unlike libgo's gotest.  However, in
> exceptional cases like this this probably can't be helped short of
> adding an external timeout mechanism.

Thanks, I missed that.  This patch adds -test.timeout arguments to at
least set a timeout where possible.  I used the same 240s value used
by libgo by default.  Bootstrapped and ran gotools testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian

2018-01-18  Ian Lance Taylor  

* Makefile.am (GOTOOLS_TEST_TIMEOUT): Define.
(check-go-tool): Pass -test.timeout with GOTOOLS_TEST_TIMEOUT.
(check-cgo-test, check-carchive-test, check-vet): Likewise.
* Makefile.in: Rebuild.
Index: Makefile.am
===
--- Makefile.am (revision 256398)
+++ Makefile.am (working copy)
@@ -161,6 +161,9 @@ uninstall-local:
 
 GOTESTFLAGS =
 
+# Number of seconds before tests time out.
+GOTOOLS_TEST_TIMEOUT = 240
+
 # Run tests using the go tool, and frob the output to look like that
 # generated by DejaGNU.  The main output of this is two files:
 # gotools.sum and gotools.log.
@@ -234,11 +237,11 @@ check-go-tool: go$(EXEEXT) $(noinst_PROG
cp $(libgodir)/objabi.go check-go-dir/src/cmd/internal/objabi/
@abs_libgodir=`cd $(libgodir) && $(PWD_COMMAND)`; \
abs_checkdir=`cd check-go-dir && $(PWD_COMMAND)`; \
-   echo "cd check-go-dir/src/cmd/go && $(ECHO_ENV) GOPATH=$${abs_checkdir} 
$(abs_builddir)/go$(EXEEXT) test -test.short -test.v" > cmd_go-testlog
+   echo "cd check-go-dir/src/cmd/go && $(ECHO_ENV) GOPATH=$${abs_checkdir} 
$(abs_builddir)/go$(EXEEXT) test -test.short 
-test.timeout=$(GOTOOLS_TEST_TIMEOUT)s -test.v" > cmd_go-testlog
$(CHECK_ENV) \
GOPATH=`cd check-go-dir && $(PWD_COMMAND)`; \
export GOPATH; \
-   (cd check-go-dir/src/cmd/go && $(abs_builddir)/go$(EXEEXT) test 
-test.short -test.v) >> cmd_go-testlog 2>&1 || echo "--- $${fl}: go test cmd/go 
(0.00s)" >> cmd_go-testlog
+   (cd check-go-dir/src/cmd/go && $(abs_builddir)/go$(EXEEXT) test 
-test.short -test.timeout=$(GOTOOLS_TEST_TIMEOUT)s -test.v) >> cmd_go-testlog 
2>&1 || echo "--- $${fl}: go test cmd/go (0.00s)" >> cmd_go-testlog
grep '^--- ' cmd_go-testlog | sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' | 
sort -k 2
 
 # check-runtime runs `go test runtime` in our environment.
@@ -269,11 +272,11 @@ check-cgo-test: go$(EXEEXT) $(noinst_PRO
$(MKDIR_P) cgo-test-dir/misc/cgo
cp -r $(libgomiscdir)/cgo/test cgo-test-dir/misc/cgo/
@abs_libgodir=`cd $(libgodir) && $(PWD_COMMAND)`; \
-   echo "cd cgo-test-dir/misc/cgo/test && $(ECHO_ENV) GOTRACEBACK=2 
$(abs_builddir)/go$(EXEEXT) test -test.short -test.v" > cgo-testlog
+   echo "cd cgo-test-dir/misc/cgo/test && $(ECHO_ENV) GOTRACEBACK=2 
$(abs_builddir)/go$(EXEEXT) test -test.short 
-test.timeout=$(GOTOOLS_TEST_TIMEOUT)s -test.v" > cgo-testlog
$(CHECK_ENV) \
GOTRACEBACK=2; \
export GOTRACEBACK; \
-   (cd cgo-test-dir/misc/cgo/test && $(abs_builddir)/go$(EXEEXT) test 
-test.short -test.v) >> cgo-testlog 2>&1 || echo "--- $${fl}: go test 
misc/cgo/test (0.00s)" >> cgo-testlog
+   (cd cgo-test-dir/misc/cgo/test && $(abs_builddir)/go$(EXEEXT) test 
-test.short -test.timeout=$(GOTOOLS_TEST_TIMEOUT)s -test.v) >> cgo-testlog 2>&1 
|| echo "--- $${fl}: go test misc/cgo/test (0.00s)" >> cgo-testlog
grep '^--- ' cgo-testlog | sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' | sort 
-k 2
 
 # check-carchive-test runs `go test misc/cgo/testcarchive/carchive_test.go`
@@ -283,11 +286,11 @@ check-carchive-test: go$(EXEEXT) $(noins
$(MKDIR_P) carchive-test-dir/misc/cgo
cp -r $(libgomiscdir)/cgo/testcarchive carchive-test-dir/misc/cgo/
@abs_libgodir=`cd $(libgodir) && $(PWD_COMMAND)`; \
-   echo "cd carchive-test-dir/misc/cgo/testcarchive && $(ECHO_ENV) 
LIBRARY_PATH=`echo $${abs_libgodir}/.libs` $(

libgo patch committed: Use rlimit64 when we use getrlimit64

2018-01-18 Thread Ian Lance Taylor

This libgo patch changes the syscall package to use the rlimit64 type,
if available, when we use getrlimit64.  This makes no difference on
most systems, because  renames the type appropriately
anyhow, but apparently it makes a difference on AIX.  Bootstrapped and
ran Go testsuite on x86_64-pc-linux-gnu, and got a confirmation that
this fixes the problem on AIX.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 256875)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-bce8720d4eb662f31026e9c7be6ce4d0aeb4ae3b
+87525458bcd5ab4beb5b95e7d58e3dfdbc1bd478
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/Makefile.am
===
--- libgo/Makefile.am   (revision 256794)
+++ libgo/Makefile.am   (working copy)
@@ -674,7 +674,7 @@ s-errno:
 
 sysinfo.go: s-sysinfo; @true
 s-sysinfo: $(srcdir)/mksysinfo.sh gen-sysinfo.go errno.i
-   GOOS=$(GOOS) $(SHELL) $(srcdir)/mksysinfo.sh
+   GOARCH=$(GOARCH) GOOS=$(GOOS) $(SHELL) $(srcdir)/mksysinfo.sh
$(SHELL) $(srcdir)/mvifdiff.sh tmp-sysinfo.go sysinfo.go
$(STAMP) $@
 
Index: libgo/mksysinfo.sh
===
--- libgo/mksysinfo.sh  (revision 256875)
+++ libgo/mksysinfo.sh  (working copy)
@@ -1120,8 +1120,16 @@ if test "$timex" != ""; then
 fi
 
 # The rlimit struct.
-grep '^type _rlimit ' gen-sysinfo.go | \
-sed -e 's/_rlimit/Rlimit/' \
+# On systems that use syscall/libcall_posix_largefile.go, use rlimit64
+# if it exists.
+rlimit="_rlimit"
+if test "${GOOS}" = "aix" || test "${GOOS}" = "linux" || (test "${GOOS}" = 
"solaris" && (test "${GOARCH}" = "386" || test "${GOARCH}" = "sparc")); then
+  if grep '^type _rlimit64 ' gen-sysinfo.go > /dev/null 2>&1; then
+rlimit="_rlimit64"
+  fi
+fi
+grep "^type ${rlimit} " gen-sysinfo.go | \
+sed -e "s/${rlimit}/Rlimit/" \
   -e 's/rlim_cur/Cur/' \
   -e 's/rlim_max/Max/' \
 >> ${OUT}
@@ -1130,7 +1138,13 @@ grep '^type _rlimit ' gen-sysinfo.go | \
 grep '^const _RLIMIT_' gen-sysinfo.go |
 sed -e 's/^\(const \)_\(RLIMIT_[^= ]*\)\(.*\)$/\1\2 = _\2/' >> ${OUT}
 grep '^const _RLIM_' gen-sysinfo.go |
+grep -v '^const _RLIM_INFINITY ' |
 sed -e 's/^\(const \)_\(RLIM_[^= ]*\)\(.*\)$/\1\2 = _\2/' >> ${OUT}
+if test "${rlimit}" = "_rlimit64" && grep '^const _RLIM64_INFINITY ' 
gen-sysinfo.go > /dev/null 2>&1; then
+  echo 'const RLIM_INFINITY = _RLIM64_INFINITY' >> ${OUT}
+elif grep '^const _RLIM_INFINITY ' gen-sysinfo-go; then
+  echo 'const RLIM_INFINITY = _RLIM_INFINITY' >> ${OUT}
+fi
 
 # The sysinfo struct.
 grep '^type _sysinfo ' gen-sysinfo.go | \

[PATCH, fortran] Support Fortran 2018 teams

2018-01-18 Thread Damian Rouson

 
All,

Attached please find what I hope is the final version of the patch that adds 
partial support for teams of images, the last remaining major Fortran 2018 
feature without current support in gfortran (or any other Fortran compiler of 
which I’m aware).  I submitted the first draft of this patch last September. 
Regression tests show no unexpected failures (i.e., the output of ‘grep -w FAIL 
gfortran.sum’ is empty). The Changelogs are below.

Among the details remaining for future commits are the following: support for 
the GET_TEAM intrinsic function is incomplete, the TEAM_NUMBER intrinsic 
function doesn’t yet accept the optional argument, and a SYNC TEAM test is 
needed.  Using any of the unsupported features generates a compile-time error 
message from gfortran or a runtime error message from OpenCoarrays. There are 
no internal compiler errors of which I’m aware.  And there remains one big, 
dissertation-worthy project in handling the use of teams in the presence of 
failed images.

Ok for trunk?

Damian


2018-01-18  Damian Rouson  
        Alessandro Fanfarillo  

        Partial support for Fortran 2018 teams features.

        * array.c (gfc_match_array_ref): add team attribute in coarray
        transfers.
        * check.c (gfc_check_get_team, gfc_check_team_number): add new 
        functions for get_team and team_number.
        * dump-parse-tree.c (show_code_node): add new statements: form team,
        change team, end team, and sync team.
        * expr.c (gfc_find_team_co): add new function.
        * gfortran.h: add new statements.
        * intrinsic.c (add_functions): add get_team and team_number functions.
        * intrinsic.h: add get_team and team_number prototypes for check,
        simplify, and resolve.
        * iresolve.c (gfc_resolve_get_team, gfc_resolve_team_number): add new 
        functions.
        * iso-fortran-env.def: add the team_type derived type.
        * match.c (gfc_match_if, gfc_match_form_team, gfc_match_team_number)
        (gfc_match_end_team, gfc_match_sync_team, gfc_match_change_team):
        add change team, end team, form team, sync team match and functions.
        * match.h: add new prototypes for change team, end team, form team,
        and sync team.
        * parse.c (decode_statement): add cases for change team, end team,
        form team, and sync team.
        * resolve.c: add cases for exec form team, change team, end team, and 
        sync team.
        * simplify.c (gfc_simplify_get_team): add new function for get team.
        * st.c (gfc_free_statement): add cases exec for change team, end team,
        form team, sync team.
        * trans-decl.c (gfor_fndecl_caf_form_team)
        (gfor_fndecl_caf_change_team, gfor_fndecl_caf_end_team)
        (gfor_fndecl_caf_sync_team, gfor_fndecl_caf_get_team)
        (gfor_fndecl_caf_team_number): add functions and definitions.
        * trans-intrinsic.c (conv_caf_send, conv_intrinsic_team_number): add 
        new function and team_type argument support.
        * trans-stmt.c (gfc_trans_form_team, gfc_trans_change_team)
        (gfc_trans_end_team, gfc_trans_sync_team): add new functions.
        * trans-stmt.h: add new prototypes.
        * trans-types.c (gfc_get_derived_type): check condition for team_type.
        * trans.c (trans_code): new exec cases for form team, change team, end 
        team, and sync team.
        * trans.h: add new prototypes.

2018-01-18  Damian Rouson  
        Alessandro Fanfarillo  

        Fortran 2018 teams tests.

        * gfortran.dg/team_number_1.f90: new test for team_number.
        * gfortran.dg/team_change_1.f90: new test for change team.
        * gfortran.dg/team_end_1.f90: new test for end team.
        * gfortran.dg/team_form_1.f90: new test for form team.





teams-patch.diff
Description: Binary data

Re: [PATCH v3] Ability to remap file names in FILE, etc (PR other/70268)

2018-01-18 Thread Boris Kolpackov

Ximin Luo  writes:

> Higher-level build scripts sometimes like to save CFLAGS etc into the build 
> output, making the overall build output unreproducible even if GCC is playing 
> nicely. Rather than add logic to strip -f{file,debug,macro,...}-prefix-map, 
> into all possible higher-level programs that might want to save CFLAGS, it is
> simpler if GCC could read it from a user-specified environment variable.

This does feel like we are trying to fix the issue in the wrong place.
Also, won't such broken packages normally store all options (including
-I) rather than just CFLAGS? And if the answer is no, then perhaps you
could put -ffile-prefix-map into CPPFLAGS instead of CFLAGS? Kind of
even feels right seeing that it affects the preprocessor.

Boris

84 matches

Mail list logo