Re: Go patch committed: Update to Go1.10beta1
Hi Ian, > On Thu, Jan 11, 2018 at 1:46 AM, Rainer Orth > wrote: >> >>> On Wed, Jan 10, 2018 at 5:42 AM, Ian Lance Taylor wrote: Whoops, there's a bug on big-endian 32-bit systems. I'm testing https://golang.org/cl/87135. >>> >>> Committed as follows. >> >> thanks, that fixed quite a lot of the failures. >> >> However, many others remain, too many to report here. I've filed PR >> go/83787 to capture those. > > Thanks. I found the problem: there is a new function makechan that > takes a size argument of type int, and the old makechan, that took > int64, is now makechan64. Since the size argument was the last one, > this worked fine except on 32-bit big-endian systems. Fixed with this > patch. Bootstrapped and tested on x86_64-pc-linux-gnu and > sparc-sun-solaris2.12. Committed to mainline. great, thanks for fixing this. One related question: I mentioned in passing in the PR that four go test invocations in gotools would hang indefinitely. I noticed that they weren't called with -test.timeout, unlike libgo's gotest. However, in exceptional cases like this this probably can't be helped short of adding an external timeout mechanism. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)
On Wed, 17 Jan 2018, Jakub Jelinek wrote: > Hi! > > PR83882 complains that PR81715 testcase fails on callee copies parameter > targets. The following patch ought to fix that, but I have only > bootstrapped/regtested it on x86_64-linux and i686-linux + on the testcase > with hppa. Looks reasonable. > John, do you think you could test this on hppa without the callee copies > default change? > > Or should we not care anymore if there aren't any similar targets left? How's that communicated to the middle-end anyways? Richard. > 2018-01-17 Jakub Jelinek > > PR sanitizer/81715 > PR testsuite/83882 > * function.h (gimplify_parameters): Add gimple_seq * argument. > * function.c: Include gimple.h and options.h. > (gimplify_parameters): Add cleanup argument, add CLOBBER stmts > for the added local temporaries if needed. > * gimplify.c (gimplify_body): Adjust gimplify_parameters caller, > if there are any parameter cleanups, wrap whole body into a > try/finally with the cleanups. > > --- gcc/function.h.jj 2018-01-03 10:19:53.858533740 +0100 > +++ gcc/function.h2018-01-16 14:31:21.409972177 +0100 > @@ -607,7 +607,7 @@ extern bool initial_value_entry (int i, > extern void instantiate_decl_rtl (rtx x); > extern int aggregate_value_p (const_tree, const_tree); > extern bool use_register_for_decl (const_tree); > -extern gimple_seq gimplify_parameters (void); > +extern gimple_seq gimplify_parameters (gimple_seq *); > extern void locate_and_pad_parm (machine_mode, tree, int, int, int, >tree, struct args_size *, >struct locate_and_pad_arg_data *); > --- gcc/function.c.jj 2018-01-12 11:35:48.901222595 +0100 > +++ gcc/function.c2018-01-16 15:13:22.165665047 +0100 > @@ -79,6 +79,8 @@ along with GCC; see the file COPYING3. > #include "tree-ssa.h" > #include "stringpool.h" > #include "attribs.h" > +#include "gimple.h" > +#include "options.h" > > /* So we can assign to cfun in this file. */ > #undef cfun > @@ -3993,7 +3995,7 @@ gimplify_parm_type (tree *tp, int *walk_ > statements to add to the beginning of the function. */ > > gimple_seq > -gimplify_parameters (void) > +gimplify_parameters (gimple_seq *cleanup) > { >struct assign_parm_data_all all; >tree parm; > @@ -4058,6 +4060,16 @@ gimplify_parameters (void) > else if (TREE_CODE (type) == COMPLEX_TYPE > || TREE_CODE (type) == VECTOR_TYPE) > DECL_GIMPLE_REG_P (local) = 1; > + > + if (!is_gimple_reg (local) > + && flag_stack_reuse != SR_NONE) > + { > + tree clobber = build_constructor (type, NULL); > + gimple *clobber_stmt; > + TREE_THIS_VOLATILE (clobber) = 1; > + clobber_stmt = gimple_build_assign (local, clobber); > + gimple_seq_add_stmt (cleanup, clobber_stmt); > + } > } > else > { > --- gcc/gimplify.c.jj 2018-01-16 12:21:15.895859416 +0100 > +++ gcc/gimplify.c2018-01-16 14:41:27.643872081 +0100 > @@ -12589,7 +12589,7 @@ gbind * > gimplify_body (tree fndecl, bool do_parms) > { >location_t saved_location = input_location; > - gimple_seq parm_stmts, seq; > + gimple_seq parm_stmts, parm_cleanup = NULL, seq; >gimple *outer_stmt; >gbind *outer_bind; >struct cgraph_node *cgn; > @@ -12628,7 +12628,7 @@ gimplify_body (tree fndecl, bool do_parm > >/* Resolve callee-copies. This has to be done before processing > the body so that DECL_VALUE_EXPR gets processed correctly. */ > - parm_stmts = do_parms ? gimplify_parameters () : NULL; > + parm_stmts = do_parms ? gimplify_parameters (&parm_cleanup) : NULL; > >/* Gimplify the function's body. */ >seq = NULL; > @@ -12657,6 +12657,13 @@ gimplify_body (tree fndecl, bool do_parm >tree parm; > >gimplify_seq_add_seq (&parm_stmts, gimple_bind_body (outer_bind)); > + if (parm_cleanup) > + { > + gtry *g = gimple_build_try (parm_stmts, parm_cleanup, > + GIMPLE_TRY_FINALLY); > + parm_stmts = NULL; > + gimple_seq_add_stmt (&parm_stmts, g); > + } >gimple_bind_set_body (outer_bind, parm_stmts); > >for (parm = DECL_ARGUMENTS (current_function_decl); > > Jakub > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [PATCH] Avoid creating overflows in match.pd (P + A) - (P + B) POINTER_DIFF_EXPR optimization (PR c/61240)
On Wed, 17 Jan 2018, Jakub Jelinek wrote: > Hi! > > POINTER_DIFF_EXPR returns a signed integer, but sadly POINTER_PLUS_EXPR > second arguments are unsigned integers, so if we are adding negative numbers > to pointers, those are very large numbers and we get TREE_OVERFLOW which the > C FE then during c_fully_fold diagnoses. Bah... > We want to treat the numbers as signed integers for this purpose, using > VCE seems easiest way to get rid of the unwanted overflows. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > Or do you prefer manual fold_converts in the (with { }) block + test > for INTEGER_CST && TREE_OVERFLOW + drop_tree_overflow instead? I think the patch is ok as-is. I really hope to make the POINTER_PLUS_EXPR offset operand signed in next stage1. Thanks, Richard. > 2018-01-17 Jakub Jelinek > > PR c/61240 > * match.pd ((P + A) - P, P - (P + A), (P + A) - (P + B)): For > pointer_diff optimizations use view_convert instead of convert. > > * gcc.dg/pr61240.c: New test. > > --- gcc/match.pd.jj 2018-01-15 10:02:04.0 +0100 > +++ gcc/match.pd 2018-01-17 17:10:54.855061485 +0100 > @@ -1832,7 +1832,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > /* The second argument of pointer_plus must be interpreted as signed, and > thus sign-extended if necessary. */ > (with { tree stype = signed_type_for (TREE_TYPE (@1)); } > - (convert (convert:stype @1 > + /* Use view_convert instead of convert here, as POINTER_PLUS_EXPR > + second arg is unsigned even when we need to consider it as signed, > + we don't want to diagnose overflow here. */ > + (convert (view_convert:stype @1 > >/* (T)P - (T)(P + A) -> -(T) A */ >(simplify > @@ -1876,7 +1879,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > /* The second argument of pointer_plus must be interpreted as signed, and > thus sign-extended if necessary. */ > (with { tree stype = signed_type_for (TREE_TYPE (@1)); } > - (negate (convert (convert:stype @1) > + /* Use view_convert instead of convert here, as POINTER_PLUS_EXPR > + second arg is unsigned even when we need to consider it as signed, > + we don't want to diagnose overflow here. */ > + (negate (convert (view_convert:stype @1) > >/* (T)(P + A) - (T)(P + B) -> (T)A - (T)B */ >(simplify > @@ -1927,7 +1933,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > /* The second argument of pointer_plus must be interpreted as signed, and > thus sign-extended if necessary. */ > (with { tree stype = signed_type_for (TREE_TYPE (@1)); } > - (minus (convert (convert:stype @1)) (convert (convert:stype @2))) > + /* Use view_convert instead of convert here, as POINTER_PLUS_EXPR > + second arg is unsigned even when we need to consider it as signed, > + we don't want to diagnose overflow here. */ > + (minus (convert (view_convert:stype @1)) > + (convert (view_convert:stype @2))) > > > /* Simplifications of MIN_EXPR, MAX_EXPR, fmin() and fmax(). */ > --- gcc/testsuite/gcc.dg/pr61240.c.jj 2018-01-17 17:25:45.821030898 +0100 > +++ gcc/testsuite/gcc.dg/pr61240.c2018-01-17 17:26:24.118029550 +0100 > @@ -0,0 +1,20 @@ > +/* PR c/61240 */ > +/* { dg-do compile } */ > + > +typedef __PTRDIFF_TYPE__ ptrdiff_t; > + > +ptrdiff_t > +foo (ptrdiff_t a[4]) > +{ > + int i[4]; > + int *p = i + 2; > + static ptrdiff_t b = p - (p - 1); /* { dg-bogus "integer overflow in > expression" } */ > + static ptrdiff_t c = (p - 1) - p; /* { dg-bogus "integer overflow in > expression" } */ > + static ptrdiff_t d = (p - 2) - (p - 1);/* { dg-bogus "integer overflow in > expression" } */ > + static ptrdiff_t e = (p - 1) - (p - 2);/* { dg-bogus "integer overflow in > expression" } */ > + a[0] = p - (p - 1);/* { dg-bogus "integer overflow > in expression" } */ > + a[1] = (p - 1) - p;/* { dg-bogus "integer overflow > in expression" } */ > + a[2] = (p - 2) - (p - 1); /* { dg-bogus "integer overflow in > expression" } */ > + a[3] = (p - 1) - (p - 2); /* { dg-bogus "integer overflow in > expression" } */ > + return b + c + d + e; > +} > > Jakub > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre
On Thu, 2018-01-18 at 08:52 +0100, Uros Bizjak wrote: > This puts an extra burden on the developer, which has to use correct > thunk name in their code. Sure, this can be solved trivially with > #ifdef __x86_64__, so the issue is minor, but I thought it has to be > mentioned before the name is set in stone. Except the developer can mostly use the inline thunk and not have to worry their pretty little heads about it. And the kernel developers who *have* chosen to implement their own thunk (and who requested the thunk-extern variant in the first place) have explicitly asked for the symbol name to be as it is. I spent a happy few hours on Sunday night looking at this. Seriously, in neither the thunk implementation nor the call sites (in explicit asm) was it easier. It was horrid — just look at the patch I sent out with the footnote "let's not do this". In the thunk I still had the actual register name with the 'e' or the 'r' in a macro argument anyway, because I had to deal with the value from that register. So the thunk macro ended up taking *two* arguments because its name now didn't match what it had to call the register. And in the call sites I was putting the target into a register again using either 'eax' or 'rax' form, then having to branch to the thunk using the different '_ax' form. It just hurt. smime.p7s Description: S/MIME cryptographic signature
Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)
On Thu, Jan 18, 2018 at 09:10:03AM +0100, Richard Biener wrote: > On Wed, 17 Jan 2018, Jakub Jelinek wrote: > > > Hi! > > > > PR83882 complains that PR81715 testcase fails on callee copies parameter > > targets. The following patch ought to fix that, but I have only > > bootstrapped/regtested it on x86_64-linux and i686-linux + on the testcase > > with hppa. > > Looks reasonable. > > > John, do you think you could test this on hppa without the callee copies > > default change? > > > > Or should we not care anymore if there aren't any similar targets left? > > How's that communicated to the middle-end anyways? You mean the callee copies stuff? gimplify_parameters creates new temporaries, assigns the parameter to them, set DECL_VALUE_EXPR on the parameters to the temporary and at the end of function clears DECL_VALUE_EXPR and with this patch adds a CLOBBER in a cleanup as well. Jakub
Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)
On Thu, 18 Jan 2018, Jakub Jelinek wrote: > On Thu, Jan 18, 2018 at 09:10:03AM +0100, Richard Biener wrote: > > On Wed, 17 Jan 2018, Jakub Jelinek wrote: > > > > > Hi! > > > > > > PR83882 complains that PR81715 testcase fails on callee copies parameter > > > targets. The following patch ought to fix that, but I have only > > > bootstrapped/regtested it on x86_64-linux and i686-linux + on the testcase > > > with hppa. > > > > Looks reasonable. > > > > > John, do you think you could test this on hppa without the callee copies > > > default change? > > > > > > Or should we not care anymore if there aren't any similar targets left? > > > > How's that communicated to the middle-end anyways? > > You mean the callee copies stuff? gimplify_parameters creates new > temporaries, assigns the parameter to them, set DECL_VALUE_EXPR on the > parameters to the temporary and at the end of function clears > DECL_VALUE_EXPR and with this patch adds a CLOBBER in a cleanup as well. No, I meant whether the target wants this or not. Richard.
Re: [PATCH] i386: Use const reference of struct ix86_frame to avoid copy
On Wed, Jan 17, 2018 at 5:00 PM, H.J. Lu wrote: > We can use const reference of struct ix86_frame to avoid making a local > copy of ix86_frame. ix86_expand_epilogue makes a local copy of struct > ix86_frame and uses the reg_save_offset field as a local variable. This > patch uses a separate local variable for reg_save_offset. > > Tested on x86-64 with ada. OK for trunk? OK. Thanks, Uros. > H.J. > -- > PR target/83905 > * config/i386/i386.c (ix86_expand_prologue): Use cost reference > of struct ix86_frame. > (ix86_expand_epilogue): Likewise. Add a local variable for > the reg_save_offset field in struct ix86_frame. > --- > gcc/config/i386/i386.c | 24 > 1 file changed, 12 insertions(+), 12 deletions(-) > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index a301e18ed70..340eca42449 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -13385,7 +13385,6 @@ ix86_expand_prologue (void) > { >struct machine_function *m = cfun->machine; >rtx insn, t; > - struct ix86_frame frame; >HOST_WIDE_INT allocate; >bool int_registers_saved; >bool sse_registers_saved; > @@ -13413,7 +13412,7 @@ ix86_expand_prologue (void) >m->fs.sp_valid = true; >m->fs.sp_realigned = false; > > - frame = m->frame; > + const struct ix86_frame &frame = cfun->machine->frame; > >if (!TARGET_64BIT && ix86_function_ms_hook_prologue > (current_function_decl)) > { > @@ -14291,7 +14290,6 @@ ix86_expand_epilogue (int style) > { >struct machine_function *m = cfun->machine; >struct machine_frame_state frame_state_save = m->fs; > - struct ix86_frame frame; >bool restore_regs_via_mov; >bool using_drap; >bool restore_stub_is_tail = false; > @@ -14304,7 +14302,7 @@ ix86_expand_epilogue (int style) > } > >ix86_finalize_stack_frame_flags (); > - frame = m->frame; > + const struct ix86_frame &frame = cfun->machine->frame; > >m->fs.sp_realigned = stack_realign_fp; >m->fs.sp_valid = stack_realign_fp > @@ -14348,11 +14346,13 @@ ix86_expand_epilogue (int style) > + UNITS_PER_WORD); > } > > + HOST_WIDE_INT reg_save_offset = frame.reg_save_offset; > + >/* Special care must be taken for the normal return case of a function > using eh_return: the eax and edx registers are marked as saved, but > not restored along this path. Adjust the save location to match. */ >if (crtl->calls_eh_return && style != 2) > -frame.reg_save_offset -= 2 * UNITS_PER_WORD; > +reg_save_offset -= 2 * UNITS_PER_WORD; > >/* EH_RETURN requires the use of moves to function properly. */ >if (crtl->calls_eh_return) > @@ -14368,11 +14368,11 @@ ix86_expand_epilogue (int style) >else if (TARGET_EPILOGUE_USING_MOVE >&& cfun->machine->use_fast_prologue_epilogue >&& (frame.nregs > 1 > - || m->fs.sp_offset != frame.reg_save_offset)) > + || m->fs.sp_offset != reg_save_offset)) > restore_regs_via_mov = true; >else if (frame_pointer_needed >&& !frame.nregs > - && m->fs.sp_offset != frame.reg_save_offset) > + && m->fs.sp_offset != reg_save_offset) > restore_regs_via_mov = true; >else if (frame_pointer_needed >&& TARGET_USE_LEAVE > @@ -14440,7 +14440,7 @@ ix86_expand_epilogue (int style) >rtx t; > >if (frame.nregs) > - ix86_emit_restore_regs_using_mov (frame.reg_save_offset, style == 2); > + ix86_emit_restore_regs_using_mov (reg_save_offset, style == 2); > >/* eh_return epilogues need %ecx added to the stack pointer. */ >if (style == 2) > @@ -14535,19 +14535,19 @@ ix86_expand_epilogue (int style) > in epilogues. */ >if (!m->fs.sp_valid || m->fs.sp_realigned > || (TARGET_SEH > - && (m->fs.sp_offset - frame.reg_save_offset > + && (m->fs.sp_offset - reg_save_offset > >= SEH_MAX_FRAME_SIZE))) > { > pro_epilogue_adjust_stack (stack_pointer_rtx, > hard_frame_pointer_rtx, > GEN_INT (m->fs.fp_offset > - - frame.reg_save_offset), > + - reg_save_offset), > style, false); > } > - else if (m->fs.sp_offset != frame.reg_save_offset) > + else if (m->fs.sp_offset != reg_save_offset) > { > pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx, > GEN_INT (m->fs.sp_offset > - - frame.reg_save_offset), > + - reg_save_offset), > style, > m->fs.cfa_reg == stack_pointer_rtx); > } > -- > 2.14.3 > >
Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)
On Thu, Jan 18, 2018 at 09:18:21AM +0100, Richard Biener wrote: > On Thu, 18 Jan 2018, Jakub Jelinek wrote: > > > On Thu, Jan 18, 2018 at 09:10:03AM +0100, Richard Biener wrote: > > > On Wed, 17 Jan 2018, Jakub Jelinek wrote: > > > > > > > Hi! > > > > > > > > PR83882 complains that PR81715 testcase fails on callee copies parameter > > > > targets. The following patch ought to fix that, but I have only > > > > bootstrapped/regtested it on x86_64-linux and i686-linux + on the > > > > testcase > > > > with hppa. > > > > > > Looks reasonable. > > > > > > > John, do you think you could test this on hppa without the callee copies > > > > default change? > > > > > > > > Or should we not care anymore if there aren't any similar targets left? > > > > > > How's that communicated to the middle-end anyways? > > > > You mean the callee copies stuff? gimplify_parameters creates new > > temporaries, assigns the parameter to them, set DECL_VALUE_EXPR on the > > parameters to the temporary and at the end of function clears > > DECL_VALUE_EXPR and with this patch adds a CLOBBER in a cleanup as well. > > No, I meant whether the target wants this or not. Ah, that. Through targetm.calls.callee_copies hook. Jakub
Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)
On Thu, 18 Jan 2018, Jakub Jelinek wrote: > On Thu, Jan 18, 2018 at 09:18:21AM +0100, Richard Biener wrote: > > On Thu, 18 Jan 2018, Jakub Jelinek wrote: > > > > > On Thu, Jan 18, 2018 at 09:10:03AM +0100, Richard Biener wrote: > > > > On Wed, 17 Jan 2018, Jakub Jelinek wrote: > > > > > > > > > Hi! > > > > > > > > > > PR83882 complains that PR81715 testcase fails on callee copies > > > > > parameter > > > > > targets. The following patch ought to fix that, but I have only > > > > > bootstrapped/regtested it on x86_64-linux and i686-linux + on the > > > > > testcase > > > > > with hppa. > > > > > > > > Looks reasonable. > > > > > > > > > John, do you think you could test this on hppa without the callee > > > > > copies > > > > > default change? > > > > > > > > > > Or should we not care anymore if there aren't any similar targets > > > > > left? > > > > > > > > How's that communicated to the middle-end anyways? > > > > > > You mean the callee copies stuff? gimplify_parameters creates new > > > temporaries, assigns the parameter to them, set DECL_VALUE_EXPR on the > > > parameters to the temporary and at the end of function clears > > > DECL_VALUE_EXPR and with this patch adds a CLOBBER in a cleanup as well. > > > > No, I meant whether the target wants this or not. > > Ah, that. Through targetm.calls.callee_copies hook. So there are quite a few more callee copies targets then, notably unconditionally epiphany, mmix, mn10300 and v850 and conditionally a few more, for example mips. Richard.
Re: [PATCH, PR82428] Add __builtin_goacc_{gang,worker,vector}_{id,size}
On 01/17/2018 06:51 PM, Jakub Jelinek wrote: On Wed, Jan 17, 2018 at 06:42:33PM +0100, Tom de Vries wrote: +static rtx +expand_builtin_goacc_parlevel_id_size (tree exp, rtx target, int ignore) +{ + tree fndecl = get_callee_fndecl (exp); + + const char *name; + switch (DECL_FUNCTION_CODE (fndecl)) +{ +case BUILT_IN_GOACC_PARLEVEL_ID: + name = "__builtin_goacc_parlevel_id"; + break; +case BUILT_IN_GOACC_PARLEVEL_SIZE: + name = "__builtin_goacc_parlevel_size"; + break; Can you avoid that many switches on DECL_FUNCTION_CODE? Like initialize in this one not just name, but also the fallback_retval and gen_fn variables and just use them later? +default: + gcc_unreachable (); +} + + if (oacc_get_fn_attrib (current_function_decl) == NULL_TREE) +{ + error ("%s only supported in openacc code", name); OpenACC ? + return const0_rtx; +} + + tree arg = CALL_EXPR_ARG (exp, 0); + if (TREE_CODE (arg) != INTEGER_CST) +{ + error ("non-constant argument 0 to %s", name); %qs instead of %s, 2 times. + return const0_rtx; +} + + int dim = TREE_INT_CST_LOW (arg); + switch (dim) +{ +case GOMP_DIM_GANG: +case GOMP_DIM_WORKER: +case GOMP_DIM_VECTOR: + break; +default: + error ("illegal argument 0 to %s", name); + return const0_rtx; +} + + if (ignore) +return target; + + if (!targetm.have_oacc_dim_size ()) +{ + rtx retval; + switch (DECL_FUNCTION_CODE (fndecl)) + { + case BUILT_IN_GOACC_PARLEVEL_ID: + retval = const0_rtx; + break; + case BUILT_IN_GOACC_PARLEVEL_SIZE: + retval = GEN_INT (1); In addition to moving these assignments to a single switch, this one can be fallback_retval = const1_rtx; Otherwise LGTM for stage1. Followed up on the comments above. Retested as before. Will commit as attached in stage 1. Thanks, - Tom Add __builtin_goacc_parlevel_{id,size} 2018-01-06 Tom de Vries PR libgomp/82428 * builtins.def (DEF_GOACC_BUILTIN_ONLY): Define. * omp-builtins.def (BUILT_IN_GOACC_PARLEVEL_ID) (BUILT_IN_GOACC_PARLEVEL_SIZE): New builtin. * builtins.c (expand_builtin_goacc_parlevel_id_size): New function. (expand_builtin): Call expand_builtin_goacc_parlevel_id_size. * doc/extend.texi (Other Builtins): Add __builtin_goacc_parlevel_id and __builtin_goacc_parlevel_size. * f95-lang.c (DEF_GOACC_BUILTIN_ONLY): Define. * c-c++-common/goacc/builtin-goacc-parlevel-id-size-2.c: New test. * c-c++-common/goacc/builtin-goacc-parlevel-id-size.c: New test. * testsuite/libgomp.oacc-c-c++-common/gang-static-2.c: Use __builtin_goacc_parlevel_{id,size}. * testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-g-2.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-v-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Same. * testsuite/libgomp.oacc-c-c++-common/routine-g-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/routine-gwv-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/routine-v-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/routine-wv-1.c: Same. * testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c: Same. * testsuite/libgomp.oacc-c-c++-common/tile-1.c: Same. --- gcc/builtins.c | 71 +++ gcc/builtins.def | 4 ++ gcc/doc/extend.texi| 10 +++ gcc/fortran/f95-lang.c | 4 ++ gcc/omp-builtins.def | 5 ++ .../goacc/builtin-goacc-parlevel-id-size-2.c | 37 ++ .../goacc/builtin-goacc-parlevel-id-size.c | 79 ++ .../libgomp.oacc-c-c++-common/gang-static-2.c | 21 +++--- .../libgomp.oacc-c-c++-common/loop-auto-1.c| 18 ++--- .../libgomp.oacc-c-c++-common/loop-dim-default.c | 14 ++-- .../testsuite/libgomp.oacc-c-c++-common/loop-g-1.c | 17 ++--- .../testsuite/libgomp.oacc-c-c++-common/loop-g-2.c | 16 ++--- .../libgomp.oacc-c-c++-common/loop-gwv-1.c | 17 +++-- .../libgomp.oacc-c-c++-common/loop-red-g-1.c | 16 ++--- .../libgomp.oacc-c
Re: [PATCH][arm] Fix gcc.target/arm/pr40887.c directives
Hi Kyrill, On 17 January 2018 at 12:20, Kyrill Tkachov wrote: > Hi all, > > This patch converts gcc.target/arm/pr40887.c to use the proper effective > target check and dg-add-options for armv5te > so that we avoid situations where we end up trying to compile the test with > a Thumb1 hard-float ABI, which makes the > compiler complain. > > This allows the test to pass gracefully for me for my compiler configured > with: > --with-cpu=cortex-a15 --with-fpu=neon-vfpv4 --with-float=hard > --with-mode=thumb > > Committing to trunk. > > Thanks, > Kyrill > > 2018-01-17 Kyrylo Tkachov > > * gcc.target/arm/pr40887.c: Add armv5te effective target checks and > directives. I think you had to swap dg-options and dg-add-options directives, so I committed the attached patch as obvious. Christophe 2018-01-18 Christophe Lyon * gcc.target/arm/pr40887.c: Fix dg-options and dg-add-options order. Index: gcc/testsuite/gcc.target/arm/pr40887.c === --- gcc/testsuite/gcc.target/arm/pr40887.c (revision 256838) +++ gcc/testsuite/gcc.target/arm/pr40887.c (revision 256839) @@ -1,8 +1,8 @@ /* { dg-do compile } */ /* { dg-skip-if "need at least armv5" { *-*-* } { "-march=armv[234]*" } { "" } } */ /* { dg-require-effective-target arm_arch_v5te_ok } */ +/* { dg-options "-O2" } */ /* { dg-add-options arm_arch_v5te } */ -/* { dg-options "-O2" } */ /* { dg-final { scan-assembler "blx" } } */ int (*indirect_func)(int x);
[PATCH][arm] PR target/65578: Fix builtin-bswap16-1.c and builtin-bswap-1.c
Hi all, The builtin-bswap-1.c and builtin-bswap16-1.c are pretty annoying at the moment. They force an explicit armv6 option that is a thumb1 target, so if you're testing a toolchain configured with something like --with-cpu=cortex-a15 --with-float=hard --with-mode=thumb you'll get those pesky errors about Thumb1 hard-float not being implemented, even though the tests don't relate to floating-point functionality at all. I *think* this is also due to the wrong order of dg-options and dg-require-effective-target directives that might end up not doing a proper effective target check. The solution in this patch is to commonise the code and create a couple of tests for each. One tests an armv6t2 target. This allows us to test an ARM or a Thumb2 target. The second one sets an armv6-m target, which is a Thumb1 target. The dg-add-options machinery for arm_arch_v6m knows how to add the right -mfloat-abi=soft option. With this patch we end up testing all of ARM, Thumb1, Thumb2 codegen whereas before we only ever tried testing Thumb1, if the multilib options happened to line up just right, and would give an ugly error otherwise. Now, if the multilib options don't allow the test it should just appear as UNSUPPORTED. Committing to trunk. Thanks, Kyrill 2018-01-18 Kyrylo Tkachov PR target/65578 * gcc.target/arm/builtin-bswap.x: New file. * gcc.target/arm/builtin-bswap-1.c: Include the above. Add checks and options for armv6t2. * gcc.target/arm/builtin-bswap-2.c: Include the above. Add checks and options for Thumb1. * gcc.target/arm/builtin-bswap16.x: New file. * gcc.target/arm/builtin-bswap16-1.c: Include the above. Add checks and options for armv6t2. * gcc.target/arm/builtin-bswap16-2.c: Include the above. Add checks and options for Thumb1. diff --git a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c index 43195bd8235b492a3c4783d88bf744ac92ee9f68..c1e7740d14d3ca4e93a71e38b12f82c19791a204 100644 --- a/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c +++ b/gcc/testsuite/gcc.target/arm/builtin-bswap-1.c @@ -1,7 +1,9 @@ /* { dg-do compile } */ -/* { dg-options "-O2" } */ -/* { dg-require-effective-target arm_arch_v6_ok } */ -/* { dg-add-options arm_arch_v6 } */ +/* { dg-require-effective-target arm_arch_v6t2_ok } */ +/* { dg-add-options arm_arch_v6t2 } */ +/* This test depends on if-conversion creating the conditional forms of + of the instructions. Add an -mtune option known to facilitate that. */ +/* { dg-additional-options "-O2 -mtune=cortex-a53" } */ /* { dg-final { scan-assembler-not "orr\[ \t\]" } } */ /* { dg-final { scan-assembler-times "revsh\\t" 1 { target { arm_nothumb } } } } */ /* { dg-final { scan-assembler-times "revshne\\t" 1 { target { arm_nothumb } } } } */ @@ -13,69 +15,4 @@ /* { dg-final { scan-assembler-times "revne\\t" 2 { target { arm_nothumb } } } } */ /* { dg-final { scan-assembler-times "rev\\t" 4 { target { ! arm_nothumb } } } } */ -/* revsh */ -short swaps16 (short x) -{ - return __builtin_bswap16 (x); -} - -extern short foos16 (short); - -/* revshne */ -short swaps16_cond (short x, int y) -{ - short z = x; - if (y) -z = __builtin_bswap16 (x); - return foos16 (z); -} - -/* rev16 */ -unsigned short swapu16 (unsigned short x) -{ - return __builtin_bswap16 (x); -} - -extern unsigned short foou16 (unsigned short); - -/* rev16ne */ -unsigned short swapu16_cond (unsigned short x, int y) -{ - unsigned short z = x; - if (y) -z = __builtin_bswap16 (x); - return foou16 (z); -} - -/* rev */ -int swaps32 (int x) { - return __builtin_bswap32 (x); -} - -extern int foos32 (int); - -/* revne */ -int swaps32_cond (int x, int y) -{ - int z = x; - if (y) -z = __builtin_bswap32 (x); - return foos32 (z); -} - -/* rev */ -unsigned int swapu32 (unsigned int x) -{ - return __builtin_bswap32 (x); -} - -extern unsigned int foou32 (unsigned int); - -/* revne */ -unsigned int swapsu2 (unsigned int x, int y) -{ - int z = x; - if (y) -z = __builtin_bswap32 (x); - return foou32 (z); -} +#include "builtin-bswap.x" diff --git a/gcc/testsuite/gcc.target/arm/builtin-bswap-2.c b/gcc/testsuite/gcc.target/arm/builtin-bswap-2.c new file mode 100644 index ..4ba79aa322dc72b521dbff7e5f5ac3f77767988c --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/builtin-bswap-2.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v6m_ok } */ +/* { dg-add-options arm_arch_v6m } */ +/* { dg-additional-options "-O2" } */ +/* { dg-final { scan-assembler-not "orr\[ \t\]" } } */ +/* { dg-final { scan-assembler-times "revsh\\t" 2 } } */ +/* { dg-final { scan-assembler-times "rev16\\t" 2 } } */ +/* { dg-final { scan-assembler-times "rev\\t" 4 } } */ + +#include "builtin-bswap.x" diff --git a/gcc/testsuite/gcc.target/arm/builtin-bswap.x b/gcc/testsuite/gcc.target/arm/builtin-bswap.x new file mode 100644 index 0
Re: [PATCH][arm] Fix gcc.target/arm/pr40887.c directives
On 18/01/18 09:15, Christophe Lyon wrote: Hi Kyrill, On 17 January 2018 at 12:20, Kyrill Tkachov wrote: Hi all, This patch converts gcc.target/arm/pr40887.c to use the proper effective target check and dg-add-options for armv5te so that we avoid situations where we end up trying to compile the test with a Thumb1 hard-float ABI, which makes the compiler complain. This allows the test to pass gracefully for me for my compiler configured with: --with-cpu=cortex-a15 --with-fpu=neon-vfpv4 --with-float=hard --with-mode=thumb Committing to trunk. Thanks, Kyrill 2018-01-17 Kyrylo Tkachov * gcc.target/arm/pr40887.c: Add armv5te effective target checks and directives. I think you had to swap dg-options and dg-add-options directives, so I committed the attached patch as obvious. Thanks Christophe, I had spotted it and was planning to fix it by using dg-additional-options, but it's the same in the end. Kyrill Christophe
Re: [PATCH][arm] Convert gcc.target/arm/stl-cond.c into an RTL test
Hi Kyrill, On 17 January 2018 at 12:27, Kyrill Tkachov wrote: > Hi all, > > This is an awkward testsuite failure. The original bug was that we were > failing to put out > the conditional code in the conditional form of the STL instruction (oops!). > So we wanted to output STLNE, but instead output STL. > The testacase relies on if-conversion to conditionalise the insn for STL. > However, ever since r251643 the expansion of a non-relaxed atomic store > always includes a compiler barrier. That blocks if-conversion in all cases. > > So there's no easy to get to a conditional STL instruction from a C program. > But we do want to test for the original bug fix that if the RTL insn for STL > is conditionalised > it should output the conditional code. > > The solution in this patch is to convert the test into an RTL test with the > COND_EXEC form > of the STL insn and scan the assembly output there. > This seems to work fine, and gives us an opportunity to create a > gcc.dg/rtl/arm directory > in the RTL tests. > > This now makes the gcc.target/arm/stl-cond.c disappear (as the test is > deleted) and > the new test in gcc.dg/rtl/arm/stl-cond.c passes. > > Committing to trunk. > Thanks, > Kyrill > > 2018-01-17 Kyrylo Tkachov > > * gcc.dg/rtl/arm/stl-cond.c: New test. > * gcc.target/arm/stl-cond.c: Delete. I've noticed that the new test is unsupported on armeb, is this intentional? Christophe
Re: [PATCH][arm] Convert gcc.target/arm/stl-cond.c into an RTL test
Hi Christophe, On 18/01/18 09:25, Christophe Lyon wrote: Hi Kyrill, On 17 January 2018 at 12:27, Kyrill Tkachov wrote: Hi all, This is an awkward testsuite failure. The original bug was that we were failing to put out the conditional code in the conditional form of the STL instruction (oops!). So we wanted to output STLNE, but instead output STL. The testacase relies on if-conversion to conditionalise the insn for STL. However, ever since r251643 the expansion of a non-relaxed atomic store always includes a compiler barrier. That blocks if-conversion in all cases. So there's no easy to get to a conditional STL instruction from a C program. But we do want to test for the original bug fix that if the RTL insn for STL is conditionalised it should output the conditional code. The solution in this patch is to convert the test into an RTL test with the COND_EXEC form of the STL insn and scan the assembly output there. This seems to work fine, and gives us an opportunity to create a gcc.dg/rtl/arm directory in the RTL tests. This now makes the gcc.target/arm/stl-cond.c disappear (as the test is deleted) and the new test in gcc.dg/rtl/arm/stl-cond.c passes. Committing to trunk. Thanks, Kyrill 2018-01-17 Kyrylo Tkachov * gcc.dg/rtl/arm/stl-cond.c: New test. * gcc.target/arm/stl-cond.c: Delete. I've noticed that the new test is unsupported on armeb, is this intentional? I think it can PASS on armeb. I guess it's just a matter of changing the target in /* { dg-do compile { target arm-*-* } } */ to arm*-*-*. Such a patch is pre-approved. Kyrill Christophe
Re: [PATCH] PR82964: Fix 128-bit immediate ICEs
Hi Wilco, On 17 January 2018 at 17:22, Wilco Dijkstra wrote: > James Greenhalgh wrote: > >> - /* Do not allow wide int constants - this requires support in movti. */ >> + /* Only allow simple 128-bit immediates. */ >>if (CONST_WIDE_INT_P (x)) >> -return false; >> +return aarch64_mov128_immediate (x); > >> I can see why this could be correct, but it is unclear why it is neccessary >> to fix the bug. What goes wrong if we leave this as "return false". > > It's not necessary, things only go wrong if you return true for a wider set of > immediates than those directly supported by the movti pattern - and that may > be a regalloc issue. > > However removing it (returning false in all cases) actually improves code > quality > due to a bug in memset expansion. Therefore I'll commit it as returning false > for now (there was no change in test results) and update it once memset is > fixed > and inlining works as expected. > > Returning true means memset(p, 32, 63) expands as: > > mov x2, 2314885530818453536 > mov x3, 2314885530818453536 > mov x6, 2314885530818453536 > mov w5, 538976288 > mov w4, 8224 > mov w1, 32 > stp x2, x3, [x0] > stp x2, x3, [x0, 16] > stp x2, x3, [x0, 32] > str x6, [x0, 48] > str w5, [x0, 56] > strhw4, [x0, 60] > strbw1, [x0, 62] > ret > > Wilco > > After this patch (r256800), I have noticed new failures on aarch64: gcc.target/aarch64/f16_mov_immediate_1.c scan-assembler-times mov\tw[0-9]+, #?19520 3 (found 0 times) gcc.target/aarch64/f16_mov_immediate_1.c scan-assembler-times movi\tv[0-9]+\\.2s, 0x4c, lsl 8 1 (found 0 times) gcc.target/aarch64/f16_mov_immediate_1.c scan-assembler-times movi\tv[0-9]+\\.2s, 0xbc, lsl 8 1 (found 0 times) gcc.target/aarch64/f16_mov_immediate_2.c scan-assembler-times fmov\th[0-9], w[0-9]+ 1 (found 0 times) gcc.target/aarch64/f16_mov_immediate_2.c scan-assembler-times mov\tw[0-9]+, 19520 1 (found 0 times) gcc.target/aarch64/f16_mov_immediate_2.c scan-assembler-times movi\tv[0-9]+\\.2s, 0x5c, lsl 8 1 (found 0 times) gcc.target/aarch64/f16_mov_immediate_2.c scan-assembler-times movi\tv[0-9]+\\.2s, 0x7c, lsl 8 1 (found 0 times) gcc.target/aarch64/f16_mov_immediate_2.c scan-assembler-times movi\tv[0-9]+\\.2s, 0x80, lsl 8 1 (found 0 times) gcc.target/aarch64/f16_mov_immediate_2.c scan-assembler-times movi\tv[0-9]+\\.4h, ?#0 1 (found 0 times) Christophe
Re: [PATCH] lto, testsuite: Fix ICE in -Wodr (PR lto/83121)
Hi David, On 15 January 2018 at 11:09, Richard Biener wrote: > On Mon, Jan 8, 2018 at 8:36 PM, David Malcolm wrote: >> On Sat, 2018-01-06 at 08:44 +0100, Richard Biener wrote: >>> On January 5, 2018 11:55:11 PM GMT+01:00, David Malcolm >> hat.com> wrote: >>> > On Fri, 2018-01-05 at 10:36 +0100, Richard Biener wrote: >>> > > On Thu, Jan 4, 2018 at 10:52 PM, David Malcolm >> > > om> >>> > > wrote: >>> > > > PR lto/83121 reports an ICE deep inside the linemap code when >>> > > > -Wodr >>> > > > reports on a type mismatch. >>> > > > >>> > > > The root cause is that the warning can access the >>> > > > DECL_SOURCE_LOCATION >>> > > > of a streamed-in decl before the lto_location_cache has been >>> > > > applied. >>> > > > >>> > > > lto_location_cache::input_location stores >>> > > > RESERVED_LOCATION_COUNT >>> > > > (==2) >>> > > > as a poison value until the cache is applied: >>> > > > 250 /* Keep value RESERVED_LOCATION_COUNT in *loc as >>> > > > linemap >>> > > > lookups will >>> > > > 251 ICE on it. */ >>> > > > >>> > > > The fix is relatively simple: apply the cache before reading >>> > > > the >>> > > > DECL_SOURCE_LOCATION. >>> > > > >>> > > > (I wonder if we should instead have a INVALID_LOCATION value to >>> > > > handle >>> > > > this case more explicitly? e.g. 0x? or reserve 2 in >>> > > > libcpp for >>> > > > that purpose, and have the non-reserved locations start at >>> > > > 3? Either >>> > > > would be more invasive, though) >>> > > > >>> > > > Triggering the ICE was fiddly: it seems to be affected by many >>> > > > things, >>> > > > including the order of files, and (I think) by filenames. My >>> > > > theory is >>> > > > that it's affected by the ordering of the tree nodes in the LTO >>> > > > stream: >>> > > > for the ICE to occur, the types in question need to be compared >>> > > > before >>> > > > some other operation flushes the lto_location_cache. This >>> > > > ordering >>> > > > is affected by the hash-based ordering in DFS in lto-streamer- >>> > > > out.c, which >>> > > > might explain why r255066 seemed to trigger the bug; the only >>> > > > relevant >>> > > > change to LTO there seemed to be: >>> > > > * lto-streamer-out.c (hash_tree): Hash TYPE_EMPTY_P and >>> > > > DECL_PADDING_P. >>> > > > If so, then the bug was presumably already present, but hidden. >>> > > > >>> > > > The patch also adds regression test coverage for the ICE, which >>> > > > is >>> > > > more >>> > > > involved - as far as I can tell, we don't have an existing way >>> > > > to >>> > > > verify >>> > > > diagnostics emitted during link-time optimization. >>> > > > >>> > > > Hence the patch adds some machinery to lib/lto.exp to support >>> > > > two >>> > > > new >>> > > > directives: dg-lto-warning and dg-lto-message, corresponding to >>> > > > dg-warning and dg-message respectively, where the diagnostics >>> > > > are >>> > > > expected to be emitted at link-time. >>> > > > >>> > > > The test case includes examples of LTO warnings and notes in >>> > > > both >>> > > > the >>> > > > primary and secondary source files >>> > > > >>> > > > Doing so required reusing the logic from DejaGnu for handling >>> > > > diagnostics. >>> > > > Unfortunately the pertinent code is a 50 line loop within a >>> > > > ~200 >>> > > > line Tcl >>> > > > function in dg.exp (dg-test), so I had to copy it from DejaGnu, >>> > > > making >>> > > > various changes as necessary (see >>> > > > lto_handle_diagnostics_for_file >>> > > > in the >>> > > > patch; for example the LTO version supports multiple source >>> > > > files, >>> > > > identifying which source file emitted a diagnostic). >>> > > > >>> > > > For non-LTO diagnostics we currently ignore surplus "note" >>> > > > diagnostics. >>> > > > This patch updates lto_prune_warns to follow this behavior >>> > > > (since >>> > > > otherwise we'd need numerous dg-lto-message directives for the >>> > > > motivating >>> > > > test case). >>> > > > >>> > > > The patch adds these PASS results to g++.sum: >>> > > > >>> > > > PASS: g++.dg/lto/pr83121 cp_lto_pr83121_0.o assemble, -O0 -flto >>> > > > PASS: g++.dg/lto/pr83121 cp_lto_pr83121_1.o assemble, -O0 -flto >>> > > > PASS: g++.dg/lto/pr83121 (test for LTO warnings, pr83121_0.C >>> > > > line >>> > > > 6) >>> > > > PASS: g++.dg/lto/pr83121 (test for LTO warnings, pr83121_0.C >>> > > > line >>> > > > 8) >>> > > > PASS: g++.dg/lto/pr83121 (test for LTO warnings, pr83121_1.C >>> > > > line >>> > > > 2) >>> > > > PASS: g++.dg/lto/pr83121 (test for LTO warnings, pr83121_1.C >>> > > > line >>> > > > 3) >>> > > > PASS: g++.dg/lto/pr83121 cp_lto_pr83121_0.o-cp_lto_pr83121_1.o >>> > > > link, -O0 -flto >>> > > > >>> > > > The output for dg-lto-message above refers to "warnings", >>> > > > rather >>> > > > than >>> > > > "messages" but that's the same as for the non-LTO case, where >>> > > > dg- >>> > > > message >>> > > > also refers to "warnings". >>> > > > >>> > > > Successfully bootstrapped®rtested on x86_64-pc-l
Re: [PR c++/83160] local ref to capture
On 01/17/2018 01:44 PM, Jason Merrill wrote: This makes sense to me. But I think we'd want also that flag set on the ck_identity inside the ck_base that direct_reference_binding creates, so setting it first rather than in an else. Ah yes, you spotted the bit I failed to mention. (I think we get away with it because class objects always have sufficient lvalueness, but why lie?) nathan -- Nathan Sidwell 2018-01-18 Nathan Sidwell PR c++/83160 * cp-tree.h (mark_use): Declare. * expr.c (mark_use): Make extern. * call.c (direct_reference_binding): Set inner conv's rvaluedness_matches_p, if it is an identity. (convert_like_real): Mark lvalue or rvalue use for identity as rvaledness_matches_p demands. PR c++/83160 * g++.dg/cpp0x/pr83160.C: New. Index: cp/call.c === --- cp/call.c (revision 256841) +++ cp/call.c (working copy) @@ -102,7 +102,8 @@ struct conversion { being bound to an lvalue expression or an rvalue reference is being bound to an rvalue expression. If KIND is ck_rvalue, true when we are treating an lvalue as an rvalue (12.8p33). If - KIND is ck_base, always false. */ + KIND is ck_base, always false. If ck_identity, we will be + binding a reference directly. */ BOOL_BITFIELD rvaluedness_matches_p: 1; BOOL_BITFIELD check_narrowing: 1; /* The type of the expression resulting from the conversion. */ @@ -1476,6 +1477,10 @@ direct_reference_binding (tree type, con t = TREE_TYPE (type); + if (conv->kind == ck_identity) +/* Mark the identity conv as to not decay to rvalue. */ +conv->rvaluedness_matches_p = true; + /* [over.ics.rank] When a parameter of reference type binds directly @@ -1501,6 +1506,7 @@ direct_reference_binding (tree type, con That way, convert_like knows not to generate a temporary. */ conv->need_temporary_p = false; } + return build_conv (ck_ref_bind, type, conv); } @@ -6800,7 +6806,9 @@ convert_like_real (conversion *convs, tr else gcc_unreachable (); } - expr = mark_rvalue_use (expr); + expr = mark_use (expr, /*rvalue_p=*/!convs->rvaluedness_matches_p, + /*read_p=*/true, UNKNOWN_LOCATION, + /*reject_builtin=*/true); if (type_unknown_p (expr)) expr = instantiate_type (totype, expr, complain); Index: cp/cp-tree.h === --- cp/cp-tree.h (revision 256841) +++ cp/cp-tree.h (working copy) @@ -6328,6 +6328,9 @@ extern tree create_try_catch_expr /* in expr.c */ extern tree cplus_expand_constant (tree); +extern tree mark_use (tree expr, bool rvalue_p, bool read_p, + location_t = UNKNOWN_LOCATION, + bool reject_builtin = true); extern tree mark_rvalue_use (tree, location_t = UNKNOWN_LOCATION, bool reject_builtin = true); Index: cp/expr.c === --- cp/expr.c (revision 256841) +++ cp/expr.c (working copy) @@ -89,7 +89,7 @@ cplus_expand_constant (tree cst) /* We've seen an actual use of EXPR. Possibly replace an outer variable reference inside with its constant value or a lambda capture. */ -static tree +tree mark_use (tree expr, bool rvalue_p, bool read_p, location_t loc /* = UNKNOWN_LOCATION */, bool reject_builtin /* = true */) Index: testsuite/g++.dg/cpp0x/pr83160.C === --- testsuite/g++.dg/cpp0x/pr83160.C (revision 0) +++ testsuite/g++.dg/cpp0x/pr83160.C (working copy) @@ -0,0 +1,33 @@ +// { dg-do run { target c++11 } } +// PR c++/83160 failed to capture as lvalue + +int main () +{ + const int a = 0; + + if (![&a] (const int *p) + { + const int &b = a; + // We should bind to the outer a + return &b == p; + } (&a)) +return 1; + + if (![&] (const int *p) + { + const int &b = a; + // We should bind to the outer a + return &b == p; + } (&a)) +return 2; + + if ([=] (const int *p) + { + const int &b = a; + // We should bind to the captured instance + return &b == p; + }(&a)) +return 3; + + return 0; +}
[COMMITTED] Add myself to MAINTAINERS (write after approval)
ChangeLog: 2018-01-18 Boris Kolpackov * MAINTAINERS (write after approval): Add myself. --- MAINTAINERS (revision 256843) +++ MAINTAINERS (working copy) @@ -454,6 +454,7 @@ Michael Koch Nicolas Koenig Kaz Kojima +Boris Kolpackov Dave Korn Julia Koval Matt Kraai
Re: [PATCH 1/3] [builtins] Generic support for __builtin_speculation_safe_load()
On Wed, Jan 17, 2018 at 3:55 PM, Richard Earnshaw wrote: > > This patch adds generic support for the new builtin > __builtin_speculation_safe_load. It provides the overloading of the > different access sizes and a default fall-back expansion for targets > that do not support a mechanism for inhibiting speculation. + if (ignore) +{ + warning_at (input_location, 0, + "result of __builtin_speculation_safe_load must be used to " + "ensure correct operation"); + target = NULL; +} This warning cannot be disabled but these kind of cases could appear via path isolation or missed optimizations since memory optimizations do not recognize such speculative loads. So - should it not be emitted way earlier instead, like from the FEs or during some early warning pass? In which case it could be an error as well, no? + set_mem_align (mem, MAX (GET_MODE_ALIGNMENT (mode), + get_pointer_alignment (arg0))); err... so there's no way to do an unaligned speculation safe load? Using just get_pointer_alignment would be safe here. Other builtins/IFNs explicitely pass down the alignment and FEs generate such alignment from their language constraints. + set_mem_alias_set (mem, get_alias_set (TREE_TYPE (TREE_TYPE (arg0; sorry, but pointer types are arbitrary in GIMPLE so this is clearly wrong. Without more information you have to use zero. Thus, __builtin_speculation_safe_load ((int *)p, ...); will do the wrong thing if the type of p is not int *. + /* Mark the memory access as volatile. We don't want the optimizers to + move it or otherwise substitue an alternative value. */ + MEM_VOLATILE_P (mem) = 1; how is moving or substituting in any way related to inhibiting speculation? Why's this done for all targets rather than only those that need any such mitigation? I btw miss implementation for x86_64 (at least) where the agreed upon mitigation is to insert 'lfence' before the load. Any reason the builtin expansion is not fully left to the targets? +/* Suppressing speculation. Users are expected to use the first (N) + variant, which will be translated internally into one of the other + types. */ +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_N, "speculation_safe_load", +BT_FN_VOID_VAR, ATTR_NULL) + +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_1, "speculation_safe_load_1", +BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL) +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_2, "speculation_safe_load_2", +BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL) any reason you are not simply using an internal function for the non-_N variants? Looks like you're closely following the atomic/sync builtins but those predate IFNs I think. They also have more constraints on alignment and use a restrictive alias set. +rtx +default_speculation_safe_load (machine_mode mode ATTRIBUTE_UNUSED, + rtx result, rtx mem, rtx lower_bound, + rtx upper_bound, rtx cmpptr, bool warn) +{ + rtx_code_label *done_label = gen_label_rtx (); + rtx_code_label *inrange_label = gen_label_rtx (); + + if (warn) +warning_at + (input_location, 0, + "this target does not support anti-speculation operations. " + "Your program will still execute correctly, but speculation " + "will not be inhibited"); so there's no way to inhibit this warning than changing all targets? Iff it is correct for a target to expand this to an unconditional load with a preceeding fence why not have the generic target hook implementation do exactly that - emit an unconditional move? Maybe I misunderstand the clear words in the hook docs, but the builtin docs also say the behavior is undefined if the builtin is executed when the condition is false - which means traps are allowed. Richard. > * builtin-types.def (BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): > New builtin type signature. > (BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise. > (BT_FN_I4_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise. > (BT_FN_I8_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise. > (BT_FN_I16_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise. > * builtins.def (BUILT_IN_SPECULATION_SAFE_LOAD_N): New builtin. > (BUILT_IN_SPECULATION_SAFE_LOAD_1): Likewise. > (BUILT_IN_SPECULATION_SAFE_LOAD_2): Likewise. > (BUILT_IN_SPECULATION_SAFE_LOAD_4): Likewise. > (BUILT_IN_SPECULATION_SAFE_LOAD_8): Likewise. > (BUILT_IN_SPECULATION_SAFE_LOAD_16): Likewise. > * target.def (speculation_safe_load): New hook. > * doc/tm.texi.in (TARGET_SPECULATION_SAFE_LOAD): Add to > documentation. > * doc/tm.texi: Regenerated. > * doc/cpp.texi: Document __HAVE_SPECULATION_SAFE_LOAD. > * doc/extend.texi: Document __builtin_speculation_safe_
Re: [PATCH 1/3] [builtins] Generic support for __builtin_speculation_safe_load()
On Thu, Jan 18, 2018 at 1:44 PM, Richard Biener wrote: > On Wed, Jan 17, 2018 at 3:55 PM, Richard Earnshaw > wrote: >> >> This patch adds generic support for the new builtin >> __builtin_speculation_safe_load. It provides the overloading of the >> different access sizes and a default fall-back expansion for targets >> that do not support a mechanism for inhibiting speculation. > > + if (ignore) > +{ > + warning_at (input_location, 0, > + "result of __builtin_speculation_safe_load must be used to " > + "ensure correct operation"); > + target = NULL; > +} > > This warning cannot be disabled but these kind of cases could appear via > path isolation or missed optimizations since memory optimizations do not > recognize such speculative loads. So - should it not be emitted way > earlier instead, like from the FEs or during some early warning pass? > > In which case it could be an error as well, no? > > + set_mem_align (mem, MAX (GET_MODE_ALIGNMENT (mode), > + get_pointer_alignment (arg0))); > > err... so there's no way to do an unaligned speculation safe load? Using > just get_pointer_alignment would be safe here. Other builtins/IFNs > explicitely > pass down the alignment and FEs generate such alignment from their > language constraints. > > + set_mem_alias_set (mem, get_alias_set (TREE_TYPE (TREE_TYPE (arg0; > > sorry, but pointer types are arbitrary in GIMPLE so this is clearly > wrong. Without > more information you have to use zero. Thus, > > __builtin_speculation_safe_load ((int *)p, ...); > > will do the wrong thing if the type of p is not int *. > > + /* Mark the memory access as volatile. We don't want the optimizers to > + move it or otherwise substitue an alternative value. */ > + MEM_VOLATILE_P (mem) = 1; > > how is moving or substituting in any way related to inhibiting speculation? > Why's this done for all targets rather than only those that need any such > mitigation? > > I btw miss implementation for x86_64 (at least) where the agreed upon > mitigation > is to insert 'lfence' before the load. > > Any reason the builtin expansion is not fully left to the targets? > > +/* Suppressing speculation. Users are expected to use the first (N) > + variant, which will be translated internally into one of the other > + types. */ > +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_N, "speculation_safe_load", > +BT_FN_VOID_VAR, ATTR_NULL) > + > +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_1, "speculation_safe_load_1", > +BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL) > +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_2, "speculation_safe_load_2", > +BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL) > > any reason you are not simply using an internal function for the > non-_N variants? > Looks like you're closely following the atomic/sync builtins but those > predate IFNs I think. > They also have more constraints on alignment and use a restrictive alias set. > > +rtx > +default_speculation_safe_load (machine_mode mode ATTRIBUTE_UNUSED, > + rtx result, rtx mem, rtx lower_bound, > + rtx upper_bound, rtx cmpptr, bool warn) > +{ > + rtx_code_label *done_label = gen_label_rtx (); > + rtx_code_label *inrange_label = gen_label_rtx (); > + > + if (warn) > +warning_at > + (input_location, 0, > + "this target does not support anti-speculation operations. " > + "Your program will still execute correctly, but speculation " > + "will not be inhibited"); > > so there's no way to inhibit this warning than changing all targets? > > Iff it is correct for a target to expand this to an unconditional load with a > preceeding fence why not have the generic target hook implementation > do exactly that - emit an unconditional move? Maybe I misunderstand > the clear words in the hook docs, but the builtin docs also say > the behavior is undefined if the builtin is executed when the condition > is false - which means traps are allowed. Oh, and I think I'd like to see the builtins folded to regular MEMs very early if the target doesn't provide TARGET_SPECULATION_SAFE_LOAD. Ideally directly from where _n is disambiguated in the FE. Richard. > Richard. > > >> * builtin-types.def (BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): >> New builtin type signature. >> (BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise. >> (BT_FN_I4_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise. >> (BT_FN_I8_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise. >> (BT_FN_I16_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise. >> * builtins.def (BUILT_IN_SPECULATION_SAFE_LOAD_N): New builtin. >> (BUILT_IN_SPECULATION_SAFE_LOAD_1): Likewise. >> (BUILT_IN_SPECULATION_SAFE_LOAD_2): Likewise. >> (BUILT_IN_SPECULATION_SAF
[PATCH] Small retpoline clean-up.
Hello. The patch is small clean-up of retpoline stuff. I know that H.J. is planning to introduce an option driving which retpoline loop filler is selected. Hopefully the suggested patch clarifies current default selection of the filler. Patch survives make check -k -j10 RUNTESTFLAGS="i386.exp". Ready to install? Thanks, Martin gcc/ChangeLog: 2018-01-18 Martin Liska * config/i386/i386.c (indirect_thunk_name): Document that also lfence is emitted. (output_indirect_thunk): Document why both instructions (pause and lfence) are generated. gcc/testsuite/ChangeLog: 2018-01-18 Martin Liska * gcc.target/i386/indirect-thunk-3.c: Remove duplicate options. * gcc.target/i386/indirect-thunk-4.c: Likewise. * gcc.target/i386/indirect-thunk-6.c: Likewise. * gcc.target/i386/indirect-thunk-bnd-4.c: Likewise. * gcc.target/i386/ret-thunk-10.c: Likewise. * gcc.target/i386/ret-thunk-11.c: Likewise. * gcc.target/i386/ret-thunk-12.c: Likewise. * gcc.target/i386/ret-thunk-15.c: Likewise. * gcc.target/i386/ret-thunk-9.c: Likewise. --- gcc/config/i386/i386.c | 5 - gcc/testsuite/gcc.target/i386/indirect-thunk-3.c | 2 +- gcc/testsuite/gcc.target/i386/indirect-thunk-4.c | 2 +- gcc/testsuite/gcc.target/i386/indirect-thunk-6.c | 2 +- gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c | 2 +- gcc/testsuite/gcc.target/i386/ret-thunk-10.c | 2 +- gcc/testsuite/gcc.target/i386/ret-thunk-11.c | 2 +- gcc/testsuite/gcc.target/i386/ret-thunk-12.c | 2 +- gcc/testsuite/gcc.target/i386/ret-thunk-15.c | 2 +- gcc/testsuite/gcc.target/i386/ret-thunk-9.c | 2 +- 10 files changed, 13 insertions(+), 10 deletions(-) diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index cb0f4ed5e6a..72d25ae4f72 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -10826,6 +10826,7 @@ indirect_thunk_name (char name[32], unsigned int regno, call L2 L1: pause + lfence jmp L1 L2: mov %REG, (%sp) @@ -10837,6 +10838,7 @@ indirect_thunk_name (char name[32], unsigned int regno, call L2 L1: pause + lfence jmp L1 L2: lea WORD_SIZE(%sp), %sp @@ -10864,7 +10866,8 @@ output_indirect_thunk (bool need_bnd_p, unsigned int regno) ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1); - /* Pause + lfence. */ + /* AMD and Intel CPUs prefer each a different instruction as loop filler. + Usage of both pause + lfence is compromise solution. */ fprintf (asm_out_file, "\tpause\n\tlfence\n"); /* Jump. */ diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c index 9e24a385387..dab7ac2ef25 100644 --- a/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -mno-indirect-branch-register -mno-indirect-branch-register -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */ +/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */ typedef void (*dispatch_t)(long offset); diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c index 127b5d94523..44cc5f52f45 100644 --- a/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -mno-indirect-branch-register -mno-indirect-branch-register -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */ +/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */ typedef void (*dispatch_t)(long offset); diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c index e4649283d10..1490f5336ca 100644 --- a/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c @@ -1,5 +1,5 @@ /* { dg-do compile { target *-*-linux* } } */ -/* { dg-options "-O2 -mno-indirect-branch-register -mno-indirect-branch-register -mno-indirect-branch-register -mfunction-return=keep -fpic -fno-plt -mindirect-branch=thunk" } */ +/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep -fpic -fno-plt -mindirect-branch=thunk" } */ extern void bar (void); diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c index 0e19830de4d..a5fc4a23351 100644 --- a/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c @@ -1,5 +1,5 @@ /* { dg-do compile { target { *-*-linux* && { ! x32 } } } } */ -/* { dg-options "-O2 -mno-indirect-branch-register -mno-in
Re: [PATCH] Small retpoline clean-up.
> Hello. > > The patch is small clean-up of retpoline stuff. I know that H.J. is planning > to introduce an option driving which retpoline loop filler is selected. > Hopefully the suggested patch clarifies current default selection of > the filler. > > Patch survives make check -k -j10 RUNTESTFLAGS="i386.exp". > > Ready to install? > Thanks, > Martin > > > gcc/ChangeLog: > > 2018-01-18 Martin Liska > > * config/i386/i386.c (indirect_thunk_name): Document that also > lfence is emitted. > (output_indirect_thunk): Document why both instructions > (pause and lfence) are generated. OK, thanks! Honza > > gcc/testsuite/ChangeLog: > > 2018-01-18 Martin Liska > > * gcc.target/i386/indirect-thunk-3.c: Remove duplicate options. > * gcc.target/i386/indirect-thunk-4.c: Likewise. > * gcc.target/i386/indirect-thunk-6.c: Likewise. > * gcc.target/i386/indirect-thunk-bnd-4.c: Likewise. > * gcc.target/i386/ret-thunk-10.c: Likewise. > * gcc.target/i386/ret-thunk-11.c: Likewise. > * gcc.target/i386/ret-thunk-12.c: Likewise. > * gcc.target/i386/ret-thunk-15.c: Likewise. > * gcc.target/i386/ret-thunk-9.c: Likewise. > --- > gcc/config/i386/i386.c | 5 - > gcc/testsuite/gcc.target/i386/indirect-thunk-3.c | 2 +- > gcc/testsuite/gcc.target/i386/indirect-thunk-4.c | 2 +- > gcc/testsuite/gcc.target/i386/indirect-thunk-6.c | 2 +- > gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c | 2 +- > gcc/testsuite/gcc.target/i386/ret-thunk-10.c | 2 +- > gcc/testsuite/gcc.target/i386/ret-thunk-11.c | 2 +- > gcc/testsuite/gcc.target/i386/ret-thunk-12.c | 2 +- > gcc/testsuite/gcc.target/i386/ret-thunk-15.c | 2 +- > gcc/testsuite/gcc.target/i386/ret-thunk-9.c | 2 +- > 10 files changed, 13 insertions(+), 10 deletions(-) > > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index cb0f4ed5e6a..72d25ae4f72 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -10826,6 +10826,7 @@ indirect_thunk_name (char name[32], unsigned int > regno, > callL2 > L1: > pause > + lfence > jmp L1 > L2: > mov %REG, (%sp) > @@ -10837,6 +10838,7 @@ indirect_thunk_name (char name[32], unsigned int > regno, > call L2 >L1: > pause > + lfence > jmp L1 >L2: > lea WORD_SIZE(%sp), %sp > @@ -10864,7 +10866,8 @@ output_indirect_thunk (bool need_bnd_p, unsigned int > regno) > >ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1); > > - /* Pause + lfence. */ > + /* AMD and Intel CPUs prefer each a different instruction as loop filler. > + Usage of both pause + lfence is compromise solution. */ >fprintf (asm_out_file, "\tpause\n\tlfence\n"); > >/* Jump. */ > diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c > b/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c > index 9e24a385387..dab7ac2ef25 100644 > --- a/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c > +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-3.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -mno-indirect-branch-register > -mno-indirect-branch-register -mno-indirect-branch-register > -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */ > +/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep > -mindirect-branch=thunk -fno-pic" } */ > > typedef void (*dispatch_t)(long offset); > > diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c > b/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c > index 127b5d94523..44cc5f52f45 100644 > --- a/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c > +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-4.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -mno-indirect-branch-register > -mno-indirect-branch-register -mno-indirect-branch-register > -mfunction-return=keep -mindirect-branch=thunk -fno-pic" } */ > +/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep > -mindirect-branch=thunk -fno-pic" } */ > > typedef void (*dispatch_t)(long offset); > > diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c > b/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c > index e4649283d10..1490f5336ca 100644 > --- a/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c > +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-6.c > @@ -1,5 +1,5 @@ > /* { dg-do compile { target *-*-linux* } } */ > -/* { dg-options "-O2 -mno-indirect-branch-register > -mno-indirect-branch-register -mno-indirect-branch-register > -mfunction-return=keep -fpic -fno-plt -mindirect-branch=thunk" } */ > +/* { dg-options "-O2 -mno-indirect-branch-register -mfunction-return=keep > -fpic -fno-plt -mindirect-branch=thunk" } */ > > extern void bar (void); > > diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-bnd-4.c > b/gcc/testsuite/gcc.target/i3
Re: [PATCH][arm] Fix gcc.target/arm/xor-and.c
On 17 January 2018 at 12:11, Kyrill Tkachov wrote: > Hi all, > > This test is naughty because it doesn't use the proper effective target > checks > and add-options mechanisms for setting a Thumb1 target, which leads to > Thumb1 hard-float errors > when testing a toolchain configured with --with-cpu=cortex-a15 > --with-fpu=neon-vfpv4 --with-float=hard --with-mode=thumb. > > This patch fixes that in the obvious way. > > Committing to trunk. > Thanks, > Kyrill > > 2018-01-17 Kyrylo Tkachov > > * gcc.target/arm/xor-and.c: Fix armv6 effective target checks > and options. There was a problem with the order of dg-options and dg-add-options, which I fixed as a obvious at r256845. Christophe 2018-01-18 Christophe Lyon * gcc.target/arm/xor-and.c: Fix dg-options and dg-add-options order. Index: gcc/testsuite/gcc.target/arm/xor-and.c === --- gcc/testsuite/gcc.target/arm/xor-and.c (revision 256844) +++ gcc/testsuite/gcc.target/arm/xor-and.c (revision 256845) @@ -1,7 +1,7 @@ /* { dg-do compile } */ /* { dg-require-effective-target arm_arch_v6_ok } */ +/* { dg-options "-O" } */ /* { dg-add-options arm_arch_v6 } */ -/* { dg-options "-O" } */ unsigned short foo (unsigned short x) {
Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)
On 2018-01-17 3:07 PM, Jakub Jelinek wrote: PR83882 complains that PR81715 testcase fails on callee copies parameter targets. The following patch ought to fix that, but I have only bootstrapped/regtested it on x86_64-linux and i686-linux + on the testcase with hppa. John, do you think you could test this on hppa without the callee copies default change? On hppa2.0w-hp-hpux11.11, pr81715.C now passes. However, there are some new fails in the gcc and g++ suites: FAIL: g++.dg/torture/stackalign/throw-1.C -O0 execution test FAIL: g++.dg/torture/stackalign/throw-1.C -O0 -fpic execution test FAIL: g++.dg/torture/stackalign/throw-1.C -O1 -fpic execution test FAIL: g++.dg/torture/stackalign/throw-1.C -O2 -fpic execution test FAIL: g++.dg/torture/stackalign/throw-1.C -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions -fpic execution test FAIL: g++.dg/torture/stackalign/throw-1.C -O3 -g -fpic execution test FAIL: g++.dg/torture/stackalign/throw-1.C -Os -fpic execution test FAIL: g++.dg/torture/stackalign/throw-2.C -O0 execution test FAIL: g++.dg/torture/stackalign/throw-2.C -O0 -fpic execution test FAIL: g++.dg/torture/stackalign/throw-2.C -O1 -fpic execution test FAIL: g++.dg/torture/stackalign/throw-2.C -O2 -fpic execution test FAIL: g++.dg/torture/stackalign/throw-2.C -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions -fpic execution test FAIL: g++.dg/torture/stackalign/throw-2.C -O3 -g -fpic execution test FAIL: g++.dg/torture/stackalign/throw-2.C -Os -fpic execution test FAIL: g++.dg/torture/stackalign/throw-4.C -O0 execution test FAIL: g++.dg/torture/stackalign/throw-4.C -O0 -fpic execution test FAIL: g++.dg/torture/stackalign/throw-4.C -O1 -fpic execution test FAIL: g++.dg/torture/stackalign/throw-4.C -O2 -fpic execution test FAIL: g++.dg/torture/stackalign/throw-4.C -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions -fpic execution test FAIL: g++.dg/torture/stackalign/throw-4.C -O3 -g -fpic execution test FAIL: g++.dg/torture/stackalign/throw-4.C -Os -fpic execution test FAIL: gcc.dg/torture/pr52451.c -O0 execution test FAIL: gcc.dg/torture/pr52451.c -O1 execution test FAIL: gcc.dg/torture/pr52451.c -O2 execution test FAIL: gcc.dg/torture/pr52451.c -O3 -g execution test FAIL: gcc.dg/torture/pr52451.c -Os execution test FAIL: gcc.dg/compat/pr83487-1 c_compat_y_tst.o compile FAIL: gcc.dg/compat/pr83487-2 c_compat_y_tst.o compile Dave -- John David Anglin dave.ang...@bell.net
Re: [PATCH,NVPTX] Fix PR83920
On 01/18/2018 12:40 AM, Cesar Philippidis wrote: In PR83920, I encountered a nvptx bug where live predicate variables were clobbered before their value was broadcasted. Hi, I've managed to reproduce the problem based on the description in the PR. Apparently, there were problems in certain version of the CUDA driver where the JIT would generate wrong code for shfl broadcasts. Correct. And there's a work around committed for the JIT problem, which you refer to in the next line (without introducing it first). The attached patch teaches nvptx_single not to apply that workaround if the predicate register is live. > Tom, does this patch look sane to you? The fact that the cond register is live at the start of the from bb does not mean that the register can't be set inside the bb. Furthermore, the live info does not make a distinction between live-for-lane-0 and line-for-warp. So, if the condition reg is not set in the bb, but set only for lane-0 in a previous bb, then we still need to initialize lanes 1-31. So, I don't think this is the way to address this bug. I'm not sure if it defeats the purpose of your original patch. In test cases mentioned above, it does. Regardless, the live predicate registers shouldn't be clobbered before they are used. There is a bug in the workaround, that's correct. I think the way to address it is using a tmp .pred reg like so: ... { .reg .u32 %x; mov.u32 %x,%tid.x; setp.ne.u32 %rnotvzero,%x,0; } { .reg .pred %rcond2; setp.eq.u32 %rcond2, 1, 0; // workaround @%rnotvzero bra Lskip; ... setp.. %rcond,op1,op2; // could be here, could be earlier mov.b1 %rcond2, %rcond; // used pseudo opcode mov.b1 for convenience Lskip: selp.u32 %rcondu32,1,0,%rcond2; shfl.idx.b32 %rcondu32,%rcondu32,0,31; setp.ne.u32 %rcond,%rcondu32,0; } ... Unfortunately, I cannot reproduce the runtime failure with gemm example in the PR, so I didn't include it in the patch. I'm managed to modify the test-case such that it reproduces the failure with trunk (see PR). So, that test-case should be included. Thanks, - Tom However, this patch does fix the failure with da-1.c in og7. This patch does not cause any regressions. Is it OK for trunk? Thanks, Cesar nvptx-jit-relax.diff diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c index 55c7e3c..698c574 100644 --- a/gcc/config/nvptx/nvptx.c +++ b/gcc/config/nvptx/nvptx.c @@ -3957,6 +3957,7 @@ bb_first_real_insn (basic_block bb) static void nvptx_single (unsigned mask, basic_block from, basic_block to) { + bitmap live = DF_LIVE_IN (from); rtx_insn *head = BB_HEAD (from); rtx_insn *tail = BB_END (to); unsigned skip_mask = mask; @@ -4126,8 +4127,9 @@ nvptx_single (unsigned mask, basic_block from, basic_block to) There is nothing in the PTX spec to suggest that this is wrong, or to explain why the extra initialization is needed. So, we classify it as a JIT bug, and the extra initialization as workaround. */ - emit_insn_before (gen_movbi (pvar, const0_rtx), - bb_first_real_insn (from)); + if (!bitmap_bit_p (live, REGNO (pvar))) + emit_insn_before (gen_movbi (pvar, const0_rtx), + bb_first_real_insn (from)); #endif emit_insn_before (nvptx_gen_vcast (pvar), tail); }
[C++ Patch] PR 81013 ("[7/8 Regression] ICE with invalid union in class hierarchy")
Hi, this error recovery regression is caused by my work for c++/70202 back in 2016. At the time I remember wondering whether in the easy cases of errors right at beginning of xref_basetypes (not those in the loop below handled via dropped_base) we really wanted to proceed or immediately bail out, as far as error recovery in concerned. This new testcase shows that if we don't bail out early we may end up crashing much later, in cgraph.c. Thus the straightforward fix below. Tested x86_64-linux. Thanks, Paolo. / /cp 2018-01-18 Paolo Carlini PR c++/81013 * decl.c (xref_basetypes): Early return upon error about derived union. /testsuite 2018-01-18 Paolo Carlini PR c++/81013 * g++.dg/inherit/union3.C: New. Index: cp/decl.c === --- cp/decl.c (revision 256840) +++ cp/decl.c (working copy) @@ -13801,7 +13801,10 @@ xref_basetypes (tree ref, tree base_list) CLASSTYPE_NON_LAYOUT_POD_P (ref) = true; if (TREE_CODE (ref) == UNION_TYPE) - error ("derived union %qT invalid", ref); + { + error ("derived union %qT invalid", ref); + return; + } } if (max_bases > 1) Index: testsuite/g++.dg/inherit/union3.C === --- testsuite/g++.dg/inherit/union3.C (nonexistent) +++ testsuite/g++.dg/inherit/union3.C (working copy) @@ -0,0 +1,16 @@ +// PR c++/81013 + +struct A +{ + virtual void foo() const; +}; + +union B : A // { dg-error "derived union 'B' invalid" } +{ + void foo() const; +}; + +void bar(const B& b) +{ + b.foo(); +}
Re: [PATCH v3] Ability to remap file names in __FILE__, etc (PR other/70268)
Hello, I see Boris' patch was accepted with some further changes. Any chance my additions for reproducible builds can be included as well? They are described below. It should still apply on top, minus the strchr -> strrchr change which was folded into Boris' accepted version. Ximin Ximin Luo: > Boris Kolpackov: >> Thanks for the review. Third revision of the patch attached. >> >> [..] > Here is a follow-up patch, meant to be applied on top of Boris' patch, to > address some issues we found based on discussions and experiments at the > Reproducible Builds project. > > Most of this patch is re-implementing the patch from [1] originally by Daniel > Kahn Gillmor. Back then it was rejected in favour of a simpler approach > because > its necessity was not realised. However, more experiments since then have > revealed that its basic approach is useful in more general scenarios: > > Higher-level build scripts sometimes like to save CFLAGS etc into the build > output, making the overall build output unreproducible even if GCC is playing > nicely. Rather than add logic to strip -f{file,debug,macro,...}-prefix-map, > into all possible higher-level programs that might want to save CFLAGS, it is > simpler if GCC could read it from a user-specified environment variable. The > fact that the name of the environment variable is still present on the > command-line, should make it obvious what is being done, and avoid confusion > that might arise if an implicit "magic" environment variable is used. > > In the patch thread (see [1] and follow-ups) there was concern that the > prefix > '$' would cause awkward escaping behaviour, so the prefix 'ENV:' was chosen. > A > concern about this conflicting with AmigaOS paths was voiced but not > explored; > of course it can be changed to something else if needed. > > (NetBSD is already carrying this patch in their system GCC [2], presumably for > the same reason mentioned above.) > > [1] https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01168.html > [2] > http://cvsweb.netbsd.org/bsdweb.cgi/src/external/gpl3/gcc/dist/gcc/final.c?rev=1.2&content-type=text/x-cvsweb-markup > > The other thing that this patch does, is to split the old and new prefixs by > the last '=' sign rather than the first. This allows arbitrary old prefixes to > be mapped, which would be more flexible than mapping to arbitrary new > prefixes. > The rust compiler is adopting this strategy too: [3]. > > [3] https://github.com/rust-lang/rust/issues/41555#issuecomment-321078174 > > ChangeLogs > -- > > gcc/ChangeLog: > > 2017-12-19 Ximin Luo > > PR other/70268 > * file-prefix-map.c (add_prefix_map): Support reading old prefix from > the > environment. Split old and new parts by the last '=' rather than the > first. > * file-prefix-map.h: Define macros for the ENV: prefix. > > gcc/testsuite/ChangeLog: > > 2017-12-19 Ximin Luo > > PR other/70268 > * c-c++-common/ffile-prefix-map-env.c: New test. > > Index: gcc-8-20171217/gcc/file-prefix-map.c > === > --- gcc-8-20171217.orig/gcc/file-prefix-map.c > +++ gcc-8-20171217/gcc/file-prefix-map.c > @@ -41,16 +41,40 @@ add_prefix_map (file_prefix_map *&maps, > { >file_prefix_map *map; >const char *p; > + char *env; > + const char *old; > + size_t oldlen; > > - p = strchr (arg, '='); > + p = strrchr (arg, '='); >if (!p) > { >error ("invalid argument %qs to %qs", arg, opt); >return; > } > + if (0 == strncmp(FILE_PREFIX_MAP_ENV_PREFIX, arg, > FILE_PREFIX_MAP_ENV_PREFIX_OFFSET)) > +{ > + const char *arg_offset = arg + FILE_PREFIX_MAP_ENV_PREFIX_OFFSET; > + env = xstrndup (arg_offset, p - arg_offset); > + old = getenv (env); > + if (!old) > + { > + warning (0, "environment variable %qs not set in argument to %s", > +env, opt); > + free (env); > + return; > + } > + oldlen = strlen (old); > + free (env); > +} > + else > +{ > + old = xstrndup (arg, p - arg); > + oldlen = p - arg; > +} > + >map = XNEW (file_prefix_map); > - map->old_prefix = xstrndup (arg, p - arg); > - map->old_len = p - arg; > + map->old_prefix = old; > + map->old_len = oldlen; >p++; >map->new_prefix = xstrdup (p); >map->new_len = strlen (p); > Index: gcc-8-20171217/gcc/file-prefix-map.h > === > --- gcc-8-20171217.orig/gcc/file-prefix-map.h > +++ gcc-8-20171217/gcc/file-prefix-map.h > @@ -18,6 +18,9 @@ > #ifndef GCC_FILE_PREFIX_MAP_H > #define GCC_FILE_PREFIX_MAP_H > > +#define FILE_PREFIX_MAP_ENV_PREFIX "ENV:" > +#define FILE_PREFIX_MAP_ENV_PREFIX_OFFSET > (sizeof(FILE_PREFIX_MAP_ENV_PREFIX) - 1) > + > void add_macro_prefix_map (const char *); > void add_debug_prefix_map (const char *); > void add_file_prefix_map (const char *); > Index: gcc
Re: [PATCH][arm] Fix gcc.target/arm/pr40887.c directives
On 18 January 2018 at 10:25, Kyrill Tkachov wrote: > > On 18/01/18 09:15, Christophe Lyon wrote: >> >> Hi Kyrill, >> >> >> On 17 January 2018 at 12:20, Kyrill Tkachov >> wrote: >>> >>> Hi all, >>> >>> This patch converts gcc.target/arm/pr40887.c to use the proper effective >>> target check and dg-add-options for armv5te >>> so that we avoid situations where we end up trying to compile the test >>> with >>> a Thumb1 hard-float ABI, which makes the >>> compiler complain. >>> >>> This allows the test to pass gracefully for me for my compiler configured >>> with: >>> --with-cpu=cortex-a15 --with-fpu=neon-vfpv4 --with-float=hard >>> --with-mode=thumb >>> >>> Committing to trunk. >>> >>> Thanks, >>> Kyrill >>> >>> 2018-01-17 Kyrylo Tkachov >>> >>> * gcc.target/arm/pr40887.c: Add armv5te effective target checks and >>> directives. >> >> I think you had to swap dg-options and dg-add-options directives, so I >> committed the attached patch as obvious. > > > Thanks Christophe, > I had spotted it and was planning to fix it by using dg-additional-options, > but it's the same in the end. > Hmmm after my commit, I see again the "unimplemented: Thumb-1 hard-float VFP ABI" error, because the arm_arch_v5te_ok effective target test pass as it uses a testcase containing no actual code. My old patch for check_effective_target_arm_arch_FUNC_ok would be helpful :-) (I'm posting it again separately) Christophe > Kyrill > >> Christophe > >
Re: [PATCH 1/3] [builtins] Generic support for __builtin_speculation_safe_load()
On 18/01/18 12:44, Richard Biener wrote: > On Wed, Jan 17, 2018 at 3:55 PM, Richard Earnshaw > wrote: >> >> This patch adds generic support for the new builtin >> __builtin_speculation_safe_load. It provides the overloading of the >> different access sizes and a default fall-back expansion for targets >> that do not support a mechanism for inhibiting speculation. > > + if (ignore) > +{ > + warning_at (input_location, 0, > + "result of __builtin_speculation_safe_load must be used to " > + "ensure correct operation"); > + target = NULL; > +} > > This warning cannot be disabled but these kind of cases could appear via > path isolation or missed optimizations since memory optimizations do not > recognize such speculative loads. So - should it not be emitted way > earlier instead, like from the FEs or during some early warning pass? If you think so, and can recommend where to put it. I'm not particularly familiar with the front-end code, being mostly a back-end maintainer. Otherwise we could probably drop this entirely, but that gives scope for users to do the Wrong Thing(tm)! > > In which case it could be an error as well, no? > > + set_mem_align (mem, MAX (GET_MODE_ALIGNMENT (mode), > + get_pointer_alignment (arg0))); > > err... so there's no way to do an unaligned speculation safe load? Using > just get_pointer_alignment would be safe here. Other builtins/IFNs > explicitely > pass down the alignment and FEs generate such alignment from their > language constraints. No, last time I looked GCC didn't support taking the address of an unaligned object and casting pointer to char to pointer to int implies you know that it is correctly aligned. > > + set_mem_alias_set (mem, get_alias_set (TREE_TYPE (TREE_TYPE (arg0; > > sorry, but pointer types are arbitrary in GIMPLE so this is clearly > wrong. Without > more information you have to use zero. Thus, > > __builtin_speculation_safe_load ((int *)p, ...); > > will do the wrong thing if the type of p is not int *. Ok, I'll fix that. > > + /* Mark the memory access as volatile. We don't want the optimizers to > + move it or otherwise substitue an alternative value. */ > + MEM_VOLATILE_P (mem) = 1; > > how is moving or substituting in any way related to inhibiting speculation? > Why's this done for all targets rather than only those that need any such > mitigation? I can move that, but it seemed right to do it here. > > I btw miss implementation for x86_64 (at least) where the agreed upon > mitigation > is to insert 'lfence' before the load. /I'm/ not planning on doing implementations for other targets. I don't know those back-ends. Sorry. It really needs the attention of a port expert. > > Any reason the builtin expansion is not fully left to the targets? Because that would be a significant impediment to deploying the builtin in generic code. The compiler would have to error out if it couldn't handle the builtin at all. By providing a generic implementation we can support everything, even if you do get a warning. > > +/* Suppressing speculation. Users are expected to use the first (N) > + variant, which will be translated internally into one of the other > + types. */ > +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_N, "speculation_safe_load", > +BT_FN_VOID_VAR, ATTR_NULL) > + > +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_1, "speculation_safe_load_1", > +BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL) > +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_2, "speculation_safe_load_2", > +BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL) > > any reason you are not simply using an internal function for the > non-_N variants? > Looks like you're closely following the atomic/sync builtins but those > predate IFNs I think. > They also have more constraints on alignment and use a restrictive alias set. Wouldn't that make it more difficult to do backports to older versions of GCC? I know some folk are interested in going back as far as gcc-4.8, and maybe even earlier. > > +rtx > +default_speculation_safe_load (machine_mode mode ATTRIBUTE_UNUSED, > + rtx result, rtx mem, rtx lower_bound, > + rtx upper_bound, rtx cmpptr, bool warn) > +{ > + rtx_code_label *done_label = gen_label_rtx (); > + rtx_code_label *inrange_label = gen_label_rtx (); > + > + if (warn) > +warning_at > + (input_location, 0, > + "this target does not support anti-speculation operations. " > + "Your program will still execute correctly, but speculation " > + "will not be inhibited"); > > so there's no way to inhibit this warning than changing all targets? No. It's trivial to bind this to a hook that just passes all the other arguments through to the generic code with the warning suppressed - we could even add
[ARM,testsuite] Make arm_arch_FUNC_ok more robust
Hi, I'm resurrecting a patch I posted in 2012-09 :) effective_target_arm_arch_FUNC_ok currently contains only #if !defined (DEF) #error FOO #endif which is not sufficient for GCC to complain in case of an unsupported flags combination. For instance, -mfloat-abi=hard -mthumb -march=armv5te leads to: sorry, unimplemented: Thumb-1 hard-float VFP ABI but only if the compiler had actual code to compile. Adding a dummy main() body in the arm_arch_FUNC_ok effective target does the trick: with this patch, this effective target test would fail, making the test unsupported rather than failing later. OK? Christophe 2018-01-18 Christophe Lyon gcc/testsuite: * lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok): Add function body to force error messages in some configurations. diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index f597316..bc7d451 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -2211,6 +2211,11 @@ foreach { armfunc armflag armdef } { v4 "-march=armv4 -marm" __ARM_ARCH_4__ #if !defined (DEF) #error FOO #endif + int + main (void) + { + return 0; + } } "FLAG" ] }
Re: [ARM,testsuite] Make arm_arch_FUNC_ok more robust
Hi Christophe, On 18/01/18 14:08, Christophe Lyon wrote: Hi, I'm resurrecting a patch I posted in 2012-09 :) That's the month I first started working on GCC :) effective_target_arm_arch_FUNC_ok currently contains only #if !defined (DEF) #error FOO #endif which is not sufficient for GCC to complain in case of an unsupported flags combination. For instance, -mfloat-abi=hard -mthumb -march=armv5te leads to: sorry, unimplemented: Thumb-1 hard-float VFP ABI but only if the compiler had actual code to compile. Adding a dummy main() body in the arm_arch_FUNC_ok effective target does the trick: with this patch, this effective target test would fail, making the test unsupported rather than failing later. OK? Ok. The arm__ok checks should really be able to tell us that we can at least get to an assembly result if we add the option in question. The quirk with the Thumb1 hard-float error is that it only appears when compiling a function, because it's part of the ABI-conformance code that picks where to put the result and arguments of a function. If we have a source-file that only checks preprocessor macros it will not trigger. Anyway, thanks for sending this, it will clean up some ugly FAILs in arm.exp. Kyrill Christophe
[PATCH][arm] Fix gcc.target/arm/pr70278.c
Hi all, This test needs an armv4t Thumb1 target but doesn't add the right effective target checks. This patch adds them so the test is skipped appropriately on Thumb2 hard-float tarets. Committing to trunk. Thanks, Kyrill 2018-01-18 Kyrylo Tkachov * gcc.target/arm/pr70278.c: Add effective target checks for armv4t and arm_thumb1_ok. diff --git a/gcc/testsuite/gcc.target/arm/pr70278.c b/gcc/testsuite/gcc.target/arm/pr70278.c index c44c07b7b59f906e27ee3046a09988afb665eed5..97034166f46ee7d19b7c6819a67c58d7ab580b1a 100644 --- a/gcc/testsuite/gcc.target/arm/pr70278.c +++ b/gcc/testsuite/gcc.target/arm/pr70278.c @@ -1,6 +1,8 @@ /* { dg-do compile } */ /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=armv4t" } } */ /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" } { "" } } */ +/* { dg-require-effective-target arm_arch_v4t_ok } */ +/* { dg-require-effective-target arm_thumb1_ok } */ /* { dg-options "-mthumb" } */ /* { dg-add-options arm_arch_v4t } */ /*
Re: [PATCH 1/3] [builtins] Generic support for __builtin_speculation_safe_load()
On Thu, Jan 18, 2018 at 3:07 PM, Richard Earnshaw (lists) wrote: > On 18/01/18 12:44, Richard Biener wrote: >> On Wed, Jan 17, 2018 at 3:55 PM, Richard Earnshaw >> wrote: >>> >>> This patch adds generic support for the new builtin >>> __builtin_speculation_safe_load. It provides the overloading of the >>> different access sizes and a default fall-back expansion for targets >>> that do not support a mechanism for inhibiting speculation. >> >> + if (ignore) >> +{ >> + warning_at (input_location, 0, >> + "result of __builtin_speculation_safe_load must be used to >> " >> + "ensure correct operation"); >> + target = NULL; >> +} >> >> This warning cannot be disabled but these kind of cases could appear via >> path isolation or missed optimizations since memory optimizations do not >> recognize such speculative loads. So - should it not be emitted way >> earlier instead, like from the FEs or during some early warning pass? > > If you think so, and can recommend where to put it. I'm not > particularly familiar with the front-end code, being mostly a back-end > maintainer. Otherwise we could probably drop this entirely, but that > gives scope for users to do the Wrong Thing(tm)! I'd say the same place you do the __builtin_X -> __builtin_X_{1,2,4,8} disambiguation. >> >> In which case it could be an error as well, no? >> >> + set_mem_align (mem, MAX (GET_MODE_ALIGNMENT (mode), >> + get_pointer_alignment (arg0))); >> >> err... so there's no way to do an unaligned speculation safe load? Using >> just get_pointer_alignment would be safe here. Other builtins/IFNs >> explicitely >> pass down the alignment and FEs generate such alignment from their >> language constraints. > > No, last time I looked GCC didn't support taking the address of an > unaligned object and casting pointer to char to pointer to int implies > you know that it is correctly aligned. Sure, just take the address of a packed struct member. __builtin_speculation_safe_load (&p->x, ...); where's the cast here? > >> >> + set_mem_alias_set (mem, get_alias_set (TREE_TYPE (TREE_TYPE (arg0; >> >> sorry, but pointer types are arbitrary in GIMPLE so this is clearly >> wrong. Without >> more information you have to use zero. Thus, >> >> __builtin_speculation_safe_load ((int *)p, ...); >> >> will do the wrong thing if the type of p is not int *. > > Ok, I'll fix that. > >> >> + /* Mark the memory access as volatile. We don't want the optimizers to >> + move it or otherwise substitue an alternative value. */ >> + MEM_VOLATILE_P (mem) = 1; >> >> how is moving or substituting in any way related to inhibiting speculation? >> Why's this done for all targets rather than only those that need any such >> mitigation? > > I can move that, but it seemed right to do it here. > >> >> I btw miss implementation for x86_64 (at least) where the agreed upon >> mitigation >> is to insert 'lfence' before the load. > > /I'm/ not planning on doing implementations for other targets. I don't > know those back-ends. Sorry. It really needs the attention of a port > expert. Ok, fair enough. But I'd not like to backport and release an incomplete feature. >> >> Any reason the builtin expansion is not fully left to the targets? > > Because that would be a significant impediment to deploying the builtin > in generic code. The compiler would have to error out if it couldn't > handle the builtin at all. By providing a generic implementation we can > support everything, even if you do get a warning. I meant you are doing part of the expansion in generic code, including expanding the MEM, setting it volatile, expanding the bounds (even if a target only emits a barrier and the load). So I'd expected expand_speculation_safe_load to just call the hook. _Maybe_ do some initial MEM expansion. >> >> +/* Suppressing speculation. Users are expected to use the first (N) >> + variant, which will be translated internally into one of the other >> + types. */ >> +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_N, "speculation_safe_load", >> +BT_FN_VOID_VAR, ATTR_NULL) >> + >> +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_1, >> "speculation_safe_load_1", >> +BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL) >> +DEF_GCC_BUILTIN (BUILT_IN_SPECULATION_SAFE_LOAD_2, >> "speculation_safe_load_2", >> +BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR, ATTR_NULL) >> >> any reason you are not simply using an internal function for the >> non-_N variants? >> Looks like you're closely following the atomic/sync builtins but those >> predate IFNs I think. >> They also have more constraints on alignment and use a restrictive alias set. > > Wouldn't that make it more difficult to do backports to older versions > of GCC? I know some folk are interested in going back as far as > gcc-4.8, and maybe even earlier. Possibly. >> >> +rtx >> +default_sp
[PATCH,AIX] Optimize parsing of include files.
Description: * This patch optimizes the time required for parsing the include files. Tests: * AIX: Build: SUCCESS - build made by means of gmake on AIX. ChangeLog: * xcoff.c: Optimize parsing of include files. Cordialement, Tony Reix Bull - ATOS IBM Coop Architect & Technical Leader Office : +33 (0) 4 76 29 72 67 1 rue de Provence - 38432 Échirolles - France www.atos.net Index: libbacktrace/xcoff.c === --- libbacktrace/xcoff.c (revision 256837) +++ libbacktrace/xcoff.c (working copy) @@ -760,6 +760,40 @@ xcoff_fileline (struct backtrace_state *state, uin return callback (data, pc, NULL, 0, NULL); } +/* Compare struct xcoff_incl for qsort. */ + +static int +xcoff_incl_compare (const void *v1, const void *v2) +{ + const struct xcoff_incl *in1 = (const struct xcoff_incl *) v1; + const struct xcoff_incl *in2 = (const struct xcoff_incl *) v2; + + if (in1->begin < in2->begin) +return -1; + else if (in1->begin > in2->begin) +return 1; + else +return 0; +} + +/* Find a lnnoptr in an include file. */ + +static int +xcoff_incl_search (const void *vkey, const void *ventry) +{ + const uintptr_t *key = (const uintptr_t *) vkey; + const struct xcoff_incl *entry = (const struct xcoff_incl *) ventry; + uintptr_t lnno; + + lnno = *key; + if (lnno < entry->begin) +return -1; + else if (lnno > entry->end) +return 1; + else +return 0; +} + /* Add a new mapping to the vector of line mappings that we are building. Returns 1 on success, 0 on failure. */ @@ -809,7 +843,6 @@ xcoff_process_linenos (struct backtrace_state *sta uintptr_t pc; uint32_t lnno; int begincl; - size_t i; aux = (const b_xcoff_auxent *) (fsym + 1); lnnoptr = aux->x_fcn.x_lnnoptr; @@ -839,15 +872,13 @@ xcoff_process_linenos (struct backtrace_state *sta /* If part of a function other than the beginning comes from an include file, the line numbers are absolute, rather than relative to the beginning of the function. */ - for (i = 0; i < vec->count; ++i) - { - incl = (struct xcoff_incl *) vec->vec.base + i; - if (incl->begin <= lnnoptr && lnnoptr <= incl->end) - break; - } + incl = (struct xcoff_incl *) bsearch (&lnnoptr, vec->vec.base, + vec->count, + sizeof (struct xcoff_incl), + xcoff_incl_search); if (begincl == -1) - begincl = (i < vec->count); - if (i < vec->count) + begincl = incl != NULL; + if (incl != NULL) { filename = incl->filename; if (begincl == 1) @@ -935,6 +966,9 @@ xcoff_initialize_fileline (struct backtrace_state i += asym->n_numaux; } + backtrace_qsort (vec.vec.base, vec.count, + sizeof (struct xcoff_incl), xcoff_incl_compare); + filename = NULL; fsym = NULL; for (i = 0; i < nsyms; ++i)
[PATCH] Fix profile_quality sanity check.
Hi. Following patch adds a new enum value so that we don't see following warning: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01211.html Apart from that I decided to not to number values of the enum as it uses default number. Is it welcome? Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Ready to be installed? Martin gcc/ChangeLog: 2018-01-18 Martin Liska * profile-count.h (enum profile_quality): Add profile_uninitialized as the first value. Do not number values as they are zero based. --- gcc/profile-count.h | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/gcc/profile-count.h b/gcc/profile-count.h index 7a43917ebbc..e899963118b 100644 --- a/gcc/profile-count.h +++ b/gcc/profile-count.h @@ -26,34 +26,36 @@ struct function; /* Quality of the profile count. Because gengtype does not support enums inside of classes, this is in global namespace. */ enum profile_quality { + /* Uninitialized value. */ + profile_uninitialized, /* Profile is based on static branch prediction heuristics and may or may not match reality. It is local to function and can not be compared inter-procedurally. Never used by probabilities (they are always local). */ - profile_guessed_local = 1, + profile_guessed_local, /* Profile was read by feedback and was 0, we used local heuristics to guess better. This is the case of functions not run in profile fedback. Never used by probabilities. */ - profile_guessed_global0 = 2, + profile_guessed_global0, /* Same as profile_guessed_global0 but global count is adjusted 0. */ - profile_guessed_global0adjusted = 3, + profile_guessed_global0adjusted, /* Profile is based on static branch prediction heuristics. It may or may not reflect the reality but it can be compared interprocedurally (for example, we inlined function w/o profile feedback into function with feedback and propagated from that). Never used by probablities. */ - profile_guessed = 4, + profile_guessed, /* Profile was determined by autofdo. */ - profile_afdo = 5, + profile_afdo, /* Profile was originally based on feedback but it was adjusted by code duplicating optimization. It may not precisely reflect the particular code path. */ - profile_adjusted = 6, + profile_adjusted, /* Profile was read from profile feedback or determined by accurate static method. */ - profile_precise = 7 + profile_precise }; /* The base value for branch probability notes and edge probabilities. */
New code merge optimization?
Hello, I am interested in implementing a new pass in gcc to merge identical sequences of code in GCC to be used mainly for RL78. The commercial RL78 compilers have such algorithms implemented and they make quite good use of it. Opportunities arise from the limited capabilities of RL78, for other targets this might be a lot less useful. A while ago I found the following: https://www.gnu.org/software/gcc/projects/cfo.html And I ported all algorithms to gcc 4.9.2 and tried it on RL78 and RX and this is what I found out: For RX: no visible improvements with any of them For RL78: some minor improvements only with -frtl-seqbastr: Compiling all the C files from gcc/testsuite/gcc.c-torture/execute/*c with "-Os" and "-Os -frtl-seqabstr" (using the modified gcc 4.9.2) The algorithm was effective only in 60 files(out of 1643 files, that's only 0.03% of the files currently present in gcc/testsuite/gcc.c-torture/execute) On those 60 files I got an average of 6.5% improvement with the best improvement for pr58574.c (36.4%). What do you think: is it worthwhile porting this to the trunk or I will just waste my time? Or should I start fresh? Maybe start from here: http://llvm.org/docs/MergeFunctions.html? Gimple or rtl? I suppose the preferred way will be to do this in gimple; however based on what I did so far, it's more likely to find identical sequences in rtl (for RL78). Any thoughts? Thank you! Best Regards, Sebastian
Re: New code merge optimization?
Hi, On Thu, Jan 18 2018, Sebastian Perta wrote: > Hello, > > I am interested in implementing a new pass in gcc to merge identical > sequences of code in GCC to be used mainly for RL78. > The commercial RL78 compilers have such algorithms implemented and they make > quite good use of it. > Opportunities arise from the limited capabilities of RL78, for other targets > this might be a lot less useful. > > A while ago I found the following: > https://www.gnu.org/software/gcc/projects/cfo.html > And I ported all algorithms to gcc 4.9.2 and tried it on RL78 and RX and > this is what I found out: > For RX: no visible improvements with any of them > For RL78: some minor improvements only with -frtl-seqbastr: > Compiling all the C files from gcc/testsuite/gcc.c-torture/execute/*c with > "-Os" and "-Os -frtl-seqabstr" (using the modified gcc 4.9.2) > The algorithm was effective only in 60 files(out of 1643 files, that's only > 0.03% of the files currently present in gcc/testsuite/gcc.c-torture/execute) > On those 60 files I got an average of 6.5% improvement with the best > improvement for pr58574.c (36.4%). > > What do you think: is it worthwhile porting this to the trunk or I will just > waste my time? > Or should I start fresh? Maybe start from here: > http://llvm.org/docs/MergeFunctions.html? Martin Liška contributed identical code folding to GCC quite a few years ago now. Look up the -fipa-icf option. If that is not enough for you, starting from that seems more natural. Martin > > Gimple or rtl? > I suppose the preferred way will be to do this in gimple; however based on > what I did so far, it's more likely to find identical sequences in rtl (for > RL78). > > Any thoughts? Thank you! > > Best Regards, > Sebastian
Re: [C++ Patch] PR 81013 ("[7/8 Regression] ICE with invalid union in class hierarchy")
OK. On Thu, Jan 18, 2018 at 8:43 AM, Paolo Carlini wrote: > Hi, > > this error recovery regression is caused by my work for c++/70202 back in > 2016. At the time I remember wondering whether in the easy cases of errors > right at beginning of xref_basetypes (not those in the loop below handled > via dropped_base) we really wanted to proceed or immediately bail out, as > far as error recovery in concerned. This new testcase shows that if we don't > bail out early we may end up crashing much later, in cgraph.c. Thus the > straightforward fix below. Tested x86_64-linux. > > Thanks, Paolo. > > / >
Re: [PATCH] Fix profile_quality sanity check.
On 01/18/2018 03:59 PM, Martin Liška wrote: Hi. Following patch adds a new enum value so that we don't see following warning: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01211.html Hi, with the patch, I still see the same warning. And not surprisingly, given that profile_precise is still 7 and m_quality is still a 3 bits wide bitfield. So, I don't quite understand why you think that the patch would address the warning. Thanks, - Tom Apart from that I decided to not to number values of the enum as it uses default number. Is it welcome? Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Ready to be installed? Martin gcc/ChangeLog: 2018-01-18 Martin Liska * profile-count.h (enum profile_quality): Add profile_uninitialized as the first value. Do not number values as they are zero based. --- gcc/profile-count.h | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) 0001-Fix-profile_quality-sanity-check.patch diff --git a/gcc/profile-count.h b/gcc/profile-count.h index 7a43917ebbc..e899963118b 100644 --- a/gcc/profile-count.h +++ b/gcc/profile-count.h @@ -26,34 +26,36 @@ struct function; /* Quality of the profile count. Because gengtype does not support enums inside of classes, this is in global namespace. */ enum profile_quality { + /* Uninitialized value. */ + profile_uninitialized, /* Profile is based on static branch prediction heuristics and may or may not match reality. It is local to function and can not be compared inter-procedurally. Never used by probabilities (they are always local). */ - profile_guessed_local = 1, + profile_guessed_local, /* Profile was read by feedback and was 0, we used local heuristics to guess better. This is the case of functions not run in profile fedback. Never used by probabilities. */ - profile_guessed_global0 = 2, + profile_guessed_global0, /* Same as profile_guessed_global0 but global count is adjusted 0. */ - profile_guessed_global0adjusted = 3, + profile_guessed_global0adjusted, /* Profile is based on static branch prediction heuristics. It may or may not reflect the reality but it can be compared interprocedurally (for example, we inlined function w/o profile feedback into function with feedback and propagated from that). Never used by probablities. */ - profile_guessed = 4, + profile_guessed, /* Profile was determined by autofdo. */ - profile_afdo = 5, + profile_afdo, /* Profile was originally based on feedback but it was adjusted by code duplicating optimization. It may not precisely reflect the particular code path. */ - profile_adjusted = 6, + profile_adjusted, /* Profile was read from profile feedback or determined by accurate static method. */ - profile_precise = 7 + profile_precise }; /* The base value for branch probability notes and edge probabilities. */
Fix ICE with profile info mismatch
Hi, this patch ICE where the profile in cgraph mismatch profile in BB. This is becuase of expansion of speculative devirtualization where we get some roundoff issues. Bootstrapped/regtested x86_64-linux, comitted. Honza PR ipa/83619 * g++.dg/torture/pr83619.C: New testcase. * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Update edge frequencies. Index: cgraph.c === --- cgraph.c(revision 256847) +++ cgraph.c(working copy) @@ -1327,6 +1327,7 @@ cgraph_edge::redirect_call_stmt_to_calle e->speculative = false; e->caller->set_call_stmt_including_clones (e->call_stmt, new_stmt, false); + e->count = gimple_bb (e->call_stmt)->count; /* Fix edges for BUILT_IN_CHKP_BNDRET calls attached to the processed call stmt. */ @@ -1346,6 +1347,7 @@ cgraph_edge::redirect_call_stmt_to_calle } e2->speculative = false; + e2->count = gimple_bb (e2->call_stmt)->count; ref->speculative = false; ref->stmt = NULL; /* Indirect edges are not both in the call site hash. Index: testsuite/g++.dg/torture/pr83619.C === --- testsuite/g++.dg/torture/pr83619.C (revision 0) +++ testsuite/g++.dg/torture/pr83619.C (working copy) @@ -0,0 +1,67 @@ +// { dg-do compile } +int a; +class d +{ +public: + virtual unsigned c (); +}; +class e; +class i +{ + void h (); + +public: + void + operator= (e *f) + { +j::c (f); +h (); + } + template struct j + { +static void +c (e *g) +{ + g->c (); +} + }; +}; +class k; +class l +{ +public: + l (int); + k *operator-> (); +}; +class e : public d +{ +}; +class m final : e +{ + unsigned c (); +}; +class k +{ +public: + virtual int o (e *) = 0; +}; +class H : d, k +{ + int o (e *); + i n; +}; +unsigned +m::c () +{ + l b = 0; + b->o (this); + return a; +} +int +H::o (e *p) +{ + n = p; + return a; +} + +
RE: New code merge optimization?
Hi, Thank you! As the description says this finds equivalent functions, I would like to find identical sequences inside functions but at least this will provide all the up to date tools to compare code sequences, thank you again! Best Regards, Sebastian > -Original Message- > From: Martin Jambor [mailto:mjam...@suse.cz] > Sent: 18 January 2018 15:18 > To: Sebastian Perta ; gcc- > patc...@gcc.gnu.org > Cc: l...@gcc.gnu.org > Subject: Re: New code merge optimization? > > Hi, > > On Thu, Jan 18 2018, Sebastian Perta wrote: > > Hello, > > > > I am interested in implementing a new pass in gcc to merge identical > > sequences of code in GCC to be used mainly for RL78. > > The commercial RL78 compilers have such algorithms implemented and > they make > > quite good use of it. > > Opportunities arise from the limited capabilities of RL78, for other targets > > this might be a lot less useful. > > > > A while ago I found the following: > > https://www.gnu.org/software/gcc/projects/cfo.html > > And I ported all algorithms to gcc 4.9.2 and tried it on RL78 and RX and > > this is what I found out: > > For RX: no visible improvements with any of them > > For RL78: some minor improvements only with -frtl-seqbastr: > > Compiling all the C files from gcc/testsuite/gcc.c-torture/execute/*c with > > "-Os" and "-Os -frtl-seqabstr" (using the modified gcc 4.9.2) > > The algorithm was effective only in 60 files(out of 1643 files, that's only > > 0.03% of the files currently present in gcc/testsuite/gcc.c-torture/execute) > > On those 60 files I got an average of 6.5% improvement with the best > > improvement for pr58574.c (36.4%). > > > > What do you think: is it worthwhile porting this to the trunk or I will just > > waste my time? > > Or should I start fresh? Maybe start from here: > > http://llvm.org/docs/MergeFunctions.html? > > Martin Liška contributed identical code folding to GCC quite a few years > ago now. Look up the -fipa-icf option. If that is not enough for you, > starting from that seems more natural. > > Martin > > > > > Gimple or rtl? > > I suppose the preferred way will be to do this in gimple; however based on > > what I did so far, it's more likely to find identical sequences in rtl (for > > RL78). > > > > Any thoughts? Thank you! > > > > Best Regards, > > Sebastian Renesas Electronics Europe Ltd, Dukes Meadow, Millboard Road, Bourne End, Buckinghamshire, SL8 5FH, UK. Registered in England & Wales under Registered No. 04586709.
[PATCH, rs6000] Add 128-bit support for vec_xl(), vec_xl_be(), vec_xst(), vec_xst_be() builtins.
GCC maintainers: The following patch adds missing 128-bit support for the builtins vec_xl(), vec_xl_be(), vec_xst(), vec_xst_be(). It also includes a bug fix required for the new 128-bit arguments for the vec_xst_be() and vec_xl_be() builtins. New test cases are also included. This patch completes the tests of all the various load/store builtins that I have been working on. This patch adds a torture test for the various load store tests to check that they work for -O0, -O1 and -O2 testing. This was done as the work on the load/store builtins found numerous issues that were optimization level dependent which were fixed by the previous commits as well as this commit. The patch has been tested on powerpc64le-unknown-linux-gnu (Power 8 LE) powerpc64le-unknown-linux-gnu (Power 8 BE) powerpc64le-unknown-linux-gnu (Power 9 LE) Let me know if the patch looks OK or not. Let me know if you want to include it in stage 4 or wait for the next release. Thanks. Carl Love gcc/ChangeLog: 2018-01-18 Carl Love * config/rs6000/rs6000-builtin.def (ST_ELEMREV_V1TI, LD_ELEMREV_V1TI, LVX_V1TI): Add macro expansion. * config/rs6000/rs6000-c.c (altivec_builtin_types): Add argument definitions for VSX_BUILTIN_VEC_XST_BE, VSX_BUILTIN_VEC_ST, VSX_BUILTIN_VEC_XL, LD_ELEMREV_V1TI builtins. * config/rs6000/rs6000-p8swap.c (insn_is_swappable_p); Change check to determine if the instruction is a byte reversing entry. Fix typo in comment. * config/rs6000/rs6000.c (altivec_expand_builtin): Add case entry for VSX_BUILTIN_ST_ELEMREV_V1TI and VSX_BUILTIN_LD_ELEMREV_V1TI. Add def_builtin calls for new builtins. * config/rs6000/vsx.md (vsx_st_elemrev_v1ti, vsx_ld_elemrev_v1ti): Add define_insn expansion. gcc/testsuite/ChangeLog: 2018-01-18 Carl Love * gcc.target/powerpc/powerpc.exp: Add torture tests for builtins-4-runnable.c, builtins-6-runnable.c, builtins-5-p9-runnable.c, builtins-6-p9-runnable.c. * gcc.target/powerpc/builtins-6-runnable.c: New test file. * gcc.target/powerpc/builtins-4-runnable.c: Add additional tests for signed/unsigned 128-bit and long long int loads. --- gcc/config/rs6000/rs6000-builtin.def |3 + gcc/config/rs6000/rs6000-c.c | 39 + gcc/config/rs6000/rs6000-p8swap.c |5 +- gcc/config/rs6000/rs6000.c | 20 + gcc/config/rs6000/vsx.md | 43 +- .../gcc.target/powerpc/builtins-4-runnable.c | 494 +- .../gcc.target/powerpc/builtins-6-runnable.c | 1001 gcc/testsuite/gcc.target/powerpc/powerpc.exp | 12 + 8 files changed, 1582 insertions(+), 35 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-6-runnable.c diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index b17036c5a..757fd6d50 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -1234,6 +1234,7 @@ BU_ALTIVEC_X (LVXL_V8HI, "lvxl_v8hi",MEM) BU_ALTIVEC_X (LVXL_V16QI, "lvxl_v16qi", MEM) BU_ALTIVEC_X (LVX, "lvx", MEM) BU_ALTIVEC_X (LVX_V2DF,"lvx_v2df", MEM) +BU_ALTIVEC_X (LVX_V1TI,"lvx_v1ti", MEM) BU_ALTIVEC_X (LVX_V2DI,"lvx_v2di", MEM) BU_ALTIVEC_X (LVX_V4SF,"lvx_v4sf", MEM) BU_ALTIVEC_X (LVX_V4SI,"lvx_v4si", MEM) @@ -1783,12 +1784,14 @@ BU_VSX_X (STXVW4X_V4SF, "stxvw4x_v4sf", MEM) BU_VSX_X (STXVW4X_V4SI, "stxvw4x_v4si", MEM) BU_VSX_X (STXVW4X_V8HI, "stxvw4x_v8hi", MEM) BU_VSX_X (STXVW4X_V16QI, "stxvw4x_v16qi", MEM) +BU_VSX_X (LD_ELEMREV_V1TI,"ld_elemrev_v1ti", MEM) BU_VSX_X (LD_ELEMREV_V2DF,"ld_elemrev_v2df", MEM) BU_VSX_X (LD_ELEMREV_V2DI,"ld_elemrev_v2di", MEM) BU_VSX_X (LD_ELEMREV_V4SF,"ld_elemrev_v4sf", MEM) BU_VSX_X (LD_ELEMREV_V4SI,"ld_elemrev_v4si", MEM) BU_VSX_X (LD_ELEMREV_V8HI,"ld_elemrev_v8hi", MEM) BU_VSX_X (LD_ELEMREV_V16QI, "ld_elemrev_v16qi", MEM) +BU_VSX_X (ST_ELEMREV_V1TI,"st_elemrev_v1ti", MEM) BU_VSX_X (ST_ELEMREV_V2DF,"st_elemrev_v2df", MEM) BU_VSX_X (ST_ELEMREV_V2DI,"st_elemrev_v2di", MEM) BU_VSX_X (ST_ELEMREV_V4SF,"st_elemrev_v4sf", MEM) diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c index 123e46aa1..a0f790d39 100644 --- a/gcc/config/rs6000/rs6000-c.c +++ b/gcc/config/rs6000/rs6000-c.c @@ -3149,16 +3149,27 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF, 0 }, { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LXVD2X_V2DF, RS6000_BTI_V2DF, RS6000_BT
Re: [PATCH][arm] Convert gcc.target/arm/stl-cond.c into an RTL test
On 18 January 2018 at 10:28, Kyrill Tkachov wrote: > Hi Christophe, > > > On 18/01/18 09:25, Christophe Lyon wrote: >> >> Hi Kyrill, >> >> >> On 17 January 2018 at 12:27, Kyrill Tkachov >> wrote: >>> >>> Hi all, >>> >>> This is an awkward testsuite failure. The original bug was that we were >>> failing to put out >>> the conditional code in the conditional form of the STL instruction >>> (oops!). >>> So we wanted to output STLNE, but instead output STL. >>> The testacase relies on if-conversion to conditionalise the insn for STL. >>> However, ever since r251643 the expansion of a non-relaxed atomic store >>> always includes a compiler barrier. That blocks if-conversion in all >>> cases. >>> >>> So there's no easy to get to a conditional STL instruction from a C >>> program. >>> But we do want to test for the original bug fix that if the RTL insn for >>> STL >>> is conditionalised >>> it should output the conditional code. >>> >>> The solution in this patch is to convert the test into an RTL test with >>> the >>> COND_EXEC form >>> of the STL insn and scan the assembly output there. >>> This seems to work fine, and gives us an opportunity to create a >>> gcc.dg/rtl/arm directory >>> in the RTL tests. >>> >>> This now makes the gcc.target/arm/stl-cond.c disappear (as the test is >>> deleted) and >>> the new test in gcc.dg/rtl/arm/stl-cond.c passes. >>> >>> Committing to trunk. >>> Thanks, >>> Kyrill >>> >>> 2018-01-17 Kyrylo Tkachov >>> >>> * gcc.dg/rtl/arm/stl-cond.c: New test. >>> * gcc.target/arm/stl-cond.c: Delete. >> >> I've noticed that the new test is unsupported on armeb, is this >> intentional? > > > I think it can PASS on armeb. I guess it's just a matter of changing the > target in > /* { dg-do compile { target arm-*-* } } */ > to arm*-*-*. > > Such a patch is pre-approved. > Kyrill > OK, done as r256851 after confirming that is does pass on armeb. Christophe >> Christophe > >
Re: New code merge optimization?
On Thu, Jan 18, 2018 at 8:03 AM, Sebastian Perta wrote: > Hi, > > Thank you! > As the description says this finds equivalent functions, I would like to find > identical sequences inside > functions but at least this will provide all the up to date tools to compare > code sequences, thank you again! There is already some identical sequence finding (code hoisting) in GCC 7 (and above), see https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00360.html for the patch. Thanks, Andrew > > Best Regards, > Sebastian > > >> -Original Message- >> From: Martin Jambor [mailto:mjam...@suse.cz] >> Sent: 18 January 2018 15:18 >> To: Sebastian Perta ; gcc- >> patc...@gcc.gnu.org >> Cc: l...@gcc.gnu.org >> Subject: Re: New code merge optimization? >> >> Hi, >> >> On Thu, Jan 18 2018, Sebastian Perta wrote: >> > Hello, >> > >> > I am interested in implementing a new pass in gcc to merge identical >> > sequences of code in GCC to be used mainly for RL78. >> > The commercial RL78 compilers have such algorithms implemented and >> they make >> > quite good use of it. >> > Opportunities arise from the limited capabilities of RL78, for other >> > targets >> > this might be a lot less useful. >> > >> > A while ago I found the following: >> > https://www.gnu.org/software/gcc/projects/cfo.html >> > And I ported all algorithms to gcc 4.9.2 and tried it on RL78 and RX and >> > this is what I found out: >> > For RX: no visible improvements with any of them >> > For RL78: some minor improvements only with -frtl-seqbastr: >> > Compiling all the C files from gcc/testsuite/gcc.c-torture/execute/*c with >> > "-Os" and "-Os -frtl-seqabstr" (using the modified gcc 4.9.2) >> > The algorithm was effective only in 60 files(out of 1643 files, that's only >> > 0.03% of the files currently present in >> > gcc/testsuite/gcc.c-torture/execute) >> > On those 60 files I got an average of 6.5% improvement with the best >> > improvement for pr58574.c (36.4%). >> > >> > What do you think: is it worthwhile porting this to the trunk or I will >> > just >> > waste my time? >> > Or should I start fresh? Maybe start from here: >> > http://llvm.org/docs/MergeFunctions.html? >> >> Martin Liška contributed identical code folding to GCC quite a few years >> ago now. Look up the -fipa-icf option. If that is not enough for you, >> starting from that seems more natural. >> >> Martin >> >> > >> > Gimple or rtl? >> > I suppose the preferred way will be to do this in gimple; however based on >> > what I did so far, it's more likely to find identical sequences in rtl >> > (for >> > RL78). >> > >> > Any thoughts? Thank you! >> > >> > Best Regards, >> > Sebastian > > > > Renesas Electronics Europe Ltd, Dukes Meadow, Millboard Road, Bourne End, > Buckinghamshire, SL8 5FH, UK. Registered in England & Wales under Registered > No. 04586709.
[PATCH, i386]: Backport a small retpoline improvement to gcc-7 branch
Hello! I'd like to backport a small improvement to retpoline functionality to gcc-7 branch. 2018-01-17 Uros Bizjak * config/i386/i386.c (indirect_thunk_name): Declare regno as unsigned int. Compare regno with INVALID_REGNUM. (output_indirect_thunk): Ditto. (output_indirect_thunk_function): Ditto. (ix86_code_end): Declare regno as unsigned int. Use INVALID_REGNUM in the call to output_indirect_thunk_function. OK for branch? Uros. Index: config/i386/i386.c === --- config/i386/i386.c (revision 256797) +++ config/i386/i386.c (working copy) @@ -12051,16 +12051,16 @@ static int indirect_thunks_bnd_used; /* Fills in the label name that should be used for the indirect thunk. */ static void -indirect_thunk_name (char name[32], int regno, bool need_bnd_p, -bool ret_p) +indirect_thunk_name (char name[32], unsigned int regno, +bool need_bnd_p, bool ret_p) { - if (regno >= 0 && ret_p) + if (regno != INVALID_REGNUM && ret_p) gcc_unreachable (); if (USE_HIDDEN_LINKONCE) { const char *bnd = need_bnd_p ? "_bnd" : ""; - if (regno >= 0) + if (regno != INVALID_REGNUM) { const char *reg_prefix; if (LEGACY_INT_REGNO_P (regno)) @@ -12078,7 +12078,7 @@ static void } else { - if (regno >= 0) + if (regno != INVALID_REGNUM) { if (need_bnd_p) ASM_GENERATE_INTERNAL_LABEL (name, "LITBR", regno); @@ -12130,7 +12130,7 @@ static void */ static void -output_indirect_thunk (bool need_bnd_p, int regno) +output_indirect_thunk (bool need_bnd_p, unsigned int regno) { char indirectlabel1[32]; char indirectlabel2[32]; @@ -12160,7 +12160,7 @@ static void ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2); - if (regno >= 0) + if (regno != INVALID_REGNUM) { /* MOV. */ rtx xops[2]; @@ -12184,12 +12184,12 @@ static void } /* Output a funtion with a call and return thunk for indirect branch. - If BND_P is true, the BND prefix is needed. If REGNO != -1, the - function address is in REGNO. Otherwise, the function address is + If BND_P is true, the BND prefix is needed. If REGNO != UNVALID_REGNUM, + the function address is in REGNO. Otherwise, the function address is on the top of stack. */ static void -output_indirect_thunk_function (bool need_bnd_p, int regno) +output_indirect_thunk_function (bool need_bnd_p, unsigned int regno) { char name[32]; tree decl; @@ -12238,7 +12238,7 @@ static void ASM_OUTPUT_LABEL (asm_out_file, name); } - if (regno < 0) + if (regno == INVALID_REGNUM) { /* Create alias for __x86.return_thunk/__x86.return_thunk_bnd. */ char alias[32]; @@ -12312,16 +12312,16 @@ static void ix86_code_end (void) { rtx xops[2]; - int regno; + unsigned int regno; if (indirect_thunk_needed) -output_indirect_thunk_function (false, -1); +output_indirect_thunk_function (false, INVALID_REGNUM); if (indirect_thunk_bnd_needed) -output_indirect_thunk_function (true, -1); +output_indirect_thunk_function (true, INVALID_REGNUM); for (regno = FIRST_REX_INT_REG; regno <= LAST_REX_INT_REG; regno++) { - int i = regno - FIRST_REX_INT_REG + LAST_INT_REG + 1; + unsigned int i = regno - FIRST_REX_INT_REG + LAST_INT_REG + 1; if ((indirect_thunks_used & (1 << i))) output_indirect_thunk_function (false, regno);
Fix call of edge insertions hooks when cloning
Hi, this patch was attached to the PR for a while. I have regtested it and comitted. Honza Index: ChangeLog === --- ChangeLog (revision 256850) +++ ChangeLog (working copy) @@ -1,5 +1,12 @@ 2018-01-18 Jan Hubicka + PR ipa/82256 + patch by PaX Team + * cgraphclones.c (cgraph_node::create_version_clone_with_body): + Fix call of call_cgraph_insertion_hooks. + +2018-01-18 Jan Hubicka + PR ipa/83619 * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Update edge frequencies. Index: cgraphclones.c === --- cgraphclones.c (revision 256849) +++ cgraphclones.c (working copy) @@ -1007,7 +1007,7 @@ cgraph_node::create_version_clone_with_b /* Update the call_expr on the edges to call the new version node. */ update_call_expr (new_version_node); - symtab->call_cgraph_insertion_hooks (this); + symtab->call_cgraph_insertion_hooks (new_version_node); return new_version_node; }
Re: [PATCH] document -Wclass-memaccess suppression by casting (PR 81327)
On 01/15/2018 03:09 AM, Florian Weimer wrote: * Martin Sebor: +the virtual table. Modifying the representation of such objects may violate ^vtable pointer? The vtable itself is not corrupted, I assume. Well, what happens is undefined, so who knows? ;) But of course in reality, writing to an object can only overwrite what's stored in the object, so only the vtable pointers will be corrupted. I doubt anyone would have misunderstood what was meant but I've committed r256852 for the sake of being accurate. Martin
RE: New code merge optimization?
Hi, Thank you very much! I tried with and without -fcode-hoisting the file I mentioned earlier gcc/testsuite/gcc.c-torture/execute/pr58574.c but it remains the same. But I can certainly extend on this. Sorry I didn't found out this myself! I'll do more research next time before I start asking questions. Best Regards, Sebastian > -Original Message- > From: Andrew Pinski [mailto:pins...@gmail.com] > Sent: 18 January 2018 16:19 > To: Sebastian Perta > Cc: Martin Jambor ; gcc-patches@gcc.gnu.org; > l...@gcc.gnu.org > Subject: Re: New code merge optimization? > > On Thu, Jan 18, 2018 at 8:03 AM, Sebastian Perta > wrote: > > Hi, > > > > Thank you! > > As the description says this finds equivalent functions, I would like to > > find > identical sequences inside > > functions but at least this will provide all the up to date tools to compare > code sequences, thank you again! > > There is already some identical sequence finding (code hoisting) in > GCC 7 (and above), see > https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00360.html > for the patch. > > Thanks, > Andrew > > > > > Best Regards, > > Sebastian > > > > > >> -Original Message- > >> From: Martin Jambor [mailto:mjam...@suse.cz] > >> Sent: 18 January 2018 15:18 > >> To: Sebastian Perta ; gcc- > >> patc...@gcc.gnu.org > >> Cc: l...@gcc.gnu.org > >> Subject: Re: New code merge optimization? > >> > >> Hi, > >> > >> On Thu, Jan 18 2018, Sebastian Perta wrote: > >> > Hello, > >> > > >> > I am interested in implementing a new pass in gcc to merge identical > >> > sequences of code in GCC to be used mainly for RL78. > >> > The commercial RL78 compilers have such algorithms implemented and > >> they make > >> > quite good use of it. > >> > Opportunities arise from the limited capabilities of RL78, for other > targets > >> > this might be a lot less useful. > >> > > >> > A while ago I found the following: > >> > https://www.gnu.org/software/gcc/projects/cfo.html > >> > And I ported all algorithms to gcc 4.9.2 and tried it on RL78 and RX and > >> > this is what I found out: > >> > For RX: no visible improvements with any of them > >> > For RL78: some minor improvements only with -frtl-seqbastr: > >> > Compiling all the C files from gcc/testsuite/gcc.c-torture/execute/*c > with > >> > "-Os" and "-Os -frtl-seqabstr" (using the modified gcc 4.9.2) > >> > The algorithm was effective only in 60 files(out of 1643 files, that's > >> > only > >> > 0.03% of the files currently present in gcc/testsuite/gcc.c- > torture/execute) > >> > On those 60 files I got an average of 6.5% improvement with the best > >> > improvement for pr58574.c (36.4%). > >> > > >> > What do you think: is it worthwhile porting this to the trunk or I will > >> > just > >> > waste my time? > >> > Or should I start fresh? Maybe start from here: > >> > http://llvm.org/docs/MergeFunctions.html? > >> > >> Martin Liška contributed identical code folding to GCC quite a few years > >> ago now. Look up the -fipa-icf option. If that is not enough for you, > >> starting from that seems more natural. > >> > >> Martin > >> > >> > > >> > Gimple or rtl? > >> > I suppose the preferred way will be to do this in gimple; however based > on > >> > what I did so far, it's more likely to find identical sequences in rtl > >> > (for > >> > RL78). > >> > > >> > Any thoughts? Thank you! > >> > > >> > Best Regards, > >> > Sebastian > > > > > > > > Renesas Electronics Europe Ltd, Dukes Meadow, Millboard Road, Bourne > End, Buckinghamshire, SL8 5FH, UK. Registered in England & Wales under > Registered No. 04586709. Renesas Electronics Europe Ltd, Dukes Meadow, Millboard Road, Bourne End, Buckinghamshire, SL8 5FH, UK. Registered in England & Wales under Registered No. 04586709.
[Committed][AArch64] Fix fp16 test failures after PR82964 fix
This fixes test failures in gcc.target/aarch64/f16_mov_immediate_*.c after fixing PR82964 (https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01636.html). The check for a scalar floating point constant didn't include 16-bit floating point modes, so use GET_MODE_CLASS instead. This fixes the failures and has no effect otherwise. Committed as trivial fix. ChangeLog: 2018-01-18 Wilco Dijkstra gcc/ PR target/82964 * config/aarch64/aarch64.c (aarch64_legitimate_constant_p): Use GET_MODE_CLASS for scalar floating point. -- diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 671168cc3f32ac3f2669fc79835435fdf618a56d..c6a83c881038873d8b68e36f906783be63ddde56 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -11813,8 +11813,7 @@ aarch64_legitimate_constant_p (machine_mode mode, rtx x) { /* Support CSE and rematerialization of common constants. */ if (CONST_INT_P (x) - || (CONST_DOUBLE_P (x) - && (mode == SFmode || mode == DFmode || mode == TFmode)) + || (CONST_DOUBLE_P (x) && GET_MODE_CLASS (mode) == MODE_FLOAT) || GET_CODE (x) == CONST_VECTOR) return true;
Re: [PATCH] PR82964: Fix 128-bit immediate ICEs
Christophe Lyon wrote: > After this patch (r256800), I have noticed new failures on aarch64: > gcc.target/aarch64/f16_mov_immediate_1.c scan-assembler-times > mov\tw[0-9]+, #?19520 3 (found 0 times) Thanks for spotting these, the scripts appear to have missed those (contrib/dg-cmp-results.sh skipped them eventhough they were in contrib/compare_tests...). Anyway these are fixed now. Wilco
Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)
On Thu, Jan 18, 2018 at 08:10:50AM -0500, John David Anglin wrote: > On 2018-01-17 3:07 PM, Jakub Jelinek wrote: > > PR83882 complains that PR81715 testcase fails on callee copies parameter > > targets. The following patch ought to fix that, but I have only > > bootstrapped/regtested it on x86_64-linux and i686-linux + on the testcase > > with hppa. > > > > John, do you think you could test this on hppa without the callee copies > > default change? > On hppa2.0w-hp-hpux11.11, pr81715.C now passes. > > However, there are some new fails in the gcc and g++ suites: That is really weird, at least in the cross-compiler on throw-1.C I don't see with -mno-caller-copies the new code triggering at all. Tried both hppa-linux and hppa-hpux crosses. Furthermore, throw-1.C has: /* { dg-skip-if "Stack alignment is too small" { hppa*-*-hpux* } } */ so I wonder how it comes you see them run on hpux at all. I don't have hpux or hppa-linux fenv.h, so can't check pr52451.c. What kind of error would be on pr83487-*? If any of those are really caused by my patch, I'd be interest to see e.g. *.gimple and *.optimized dumps with and without the patch (rather than reverting the whole patch, it should be enough to hand edit in function.c the && flag_stack_reuse != SR_NONE part and change it to && 0 && flag_stack_reuse != SR_NONE or similar. > FAIL: g++.dg/torture/stackalign/throw-1.C -O0 execution test > FAIL: g++.dg/torture/stackalign/throw-1.C -O0 -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-1.C -O1 -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-1.C -O2 -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-1.C -O3 -fomit-frame-pointer > -funroll-loops -fpeel-loops -ftracer -finline-functions -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-1.C -O3 -g -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-1.C -Os -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-2.C -O0 execution test > FAIL: g++.dg/torture/stackalign/throw-2.C -O0 -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-2.C -O1 -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-2.C -O2 -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-2.C -O3 -fomit-frame-pointer > -funroll-loops -fpeel-loops -ftracer -finline-functions -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-2.C -O3 -g -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-2.C -Os -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-4.C -O0 execution test > FAIL: g++.dg/torture/stackalign/throw-4.C -O0 -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-4.C -O1 -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-4.C -O2 -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-4.C -O3 -fomit-frame-pointer > -funroll-loops -fpeel-loops -ftracer -finline-functions -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-4.C -O3 -g -fpic execution test > FAIL: g++.dg/torture/stackalign/throw-4.C -Os -fpic execution test > > FAIL: gcc.dg/torture/pr52451.c -O0 execution test > FAIL: gcc.dg/torture/pr52451.c -O1 execution test > FAIL: gcc.dg/torture/pr52451.c -O2 execution test > FAIL: gcc.dg/torture/pr52451.c -O3 -g execution test > FAIL: gcc.dg/torture/pr52451.c -Os execution test > > FAIL: gcc.dg/compat/pr83487-1 c_compat_y_tst.o compile > FAIL: gcc.dg/compat/pr83487-2 c_compat_y_tst.o compile > > Dave > > -- > John David Anglin dave.ang...@bell.net Jakub
Re: [PATCH], Fix PR target/pr83862: Fix PowerPC long double signbit with -mabi=ieeelongdouble
On Wed, Jan 17, 2018 at 06:40:12PM -0500, Michael Meissner wrote: > On Wed, Jan 17, 2018 at 04:09:57PM -0600, Segher Boessenkool wrote: > > On Tue, Jan 16, 2018 at 10:55:43PM -0500, Michael Meissner wrote: > > > PR target/83862 pointed out a problem I put into the 128-bit floating > > > point > > > type signbit optimization. The issue is we want to avoid doing a load to > > > a > > > floating point/vector register and then a direct move to do signbit, so we > > > change the load to load the upper 64-bits of the floating point value to > > > get > > > the sign bit. Unfortunately, if the type is IEEE 128-bit and memory is > > > addressed with an indexed address on a little endian system, it generates > > > an > > > illegal address and generates an internal compiler error. > > > > So all this is caused by these splitters running after reload. Why do > > we have to do that? Do we? We should be able to just change it to a > > subreg and shift, early already? > > The part that is failing is trying to optimize the case: > > x = signbit (*p) > > Doing the code with just registers means that you will get: > > vr = load *p > gr = diret move from vr > shift > > With the optimization you get: > > gr = 'upper' word > shift > > If the address is: > > base + index > > In little endian, the compiler tried to generate: > > (base + index) + 8 > > And it didn't have a temporary register to put base+index. Yes, but you won't have this problem if you do this before RA. Is there any good reason to do it only after RA? If you do it before RA you immediately get some_di = load upper from *p shift just as you want (and things might even be optimised further). Segher
Re: [PATCH], Fix PR target/pr83862: Fix PowerPC long double signbit with -mabi=ieeelongdouble
On Thu, Jan 18, 2018 at 12:39:05PM -0600, Segher Boessenkool wrote: > On Wed, Jan 17, 2018 at 06:40:12PM -0500, Michael Meissner wrote: > > On Wed, Jan 17, 2018 at 04:09:57PM -0600, Segher Boessenkool wrote: > > > On Tue, Jan 16, 2018 at 10:55:43PM -0500, Michael Meissner wrote: > > > > PR target/83862 pointed out a problem I put into the 128-bit floating > > > > point > > > > type signbit optimization. The issue is we want to avoid doing a load > > > > to a > > > > floating point/vector register and then a direct move to do signbit, so > > > > we > > > > change the load to load the upper 64-bits of the floating point value > > > > to get > > > > the sign bit. Unfortunately, if the type is IEEE 128-bit and memory is > > > > addressed with an indexed address on a little endian system, it > > > > generates an > > > > illegal address and generates an internal compiler error. > > > > > > So all this is caused by these splitters running after reload. Why do > > > we have to do that? Do we? We should be able to just change it to a > > > subreg and shift, early already? > > > > The part that is failing is trying to optimize the case: > > > > x = signbit (*p) > > > > Doing the code with just registers means that you will get: > > > > vr = load *p > > gr = diret move from vr > > shift > > > > With the optimization you get: > > > > gr = 'upper' word > > shift > > > > If the address is: > > > > base + index > > > > In little endian, the compiler tried to generate: > > > > (base + index) + 8 > > > > And it didn't have a temporary register to put base+index. > > Yes, but you won't have this problem if you do this before RA. Is there > any good reason to do it only after RA? Yes you will, because it is a memory address not a register. > If you do it before RA you immediately get > > some_di = load upper from *p > shift > > just as you want (and things might even be optimised further). And in my experience, splitting such loads and changing the type before RA, often times leads to error. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: [PATCH] handle local aggregate initialization in strlen (PR 83821)
Ping: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01131.html This was submitted in stage 3 but if fixing xfailed assertions in tests by enhancing optimizations is out of scope for the current stage let me know so I can schedule this change for GCC 9. On 01/12/2018 02:30 PM, Martin Sebor wrote: A failure in a test for the recently enhanced -Warray-bounds warning exposed an unnecessarily broad restriction in the strlen pass that prevents it from tracking the length of a member string of locally defined and initialized struct: void f (void) { struct { char s[8]; int i } a = { "1234", 5 }; if (strlen (a.s) != 4) // not folded abort (); } IIUC, the restriction was in place to account for writes into an array changing or invalidating the length of a string stored in its initial elements. This would happen if the write either changed the string's terminating nul byte, or if it reset one of the prior non-nul bytes. To reflect just this intent the restriction can be tightened up to improve the pass' ability to track even the lengths of string members of locally initialized aggregates. Besides leading to better code this change also clears up the test failure. Tested on x86_64-linux. Martin
[PATCH, libstdc++]: Fix configure check for linux/random.h
Hello! linux/random.h uses __u32 types, which are defined through inclusion of linux/types.h header. On some older systems (e.g. CentOS 5.11) linux/random.h doesn't include linux/types.h by itself, and configure declares found linux/random.h as unusable. Attached patch fixes detection by including linux/types.h in the configure check. 2018-01-18 Uros Bizjak * configure.ac (AC_CHECK_HEADERS): Add linux/types.h. Conditionally include linux/types.h when checking linux/random.h header. * config.h.in: Regenerate. * configure: Ditto. * src/c++11/random.cc: Conditionally include linux/types.h. Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32} CentOS 5.11 and Fedora 26. OK for mainline? Uros. diff --git a/libstdc++-v3/config.h.in b/libstdc++-v3/config.h.in index e31e054..5a0f067 100644 --- a/libstdc++-v3/config.h.in +++ b/libstdc++-v3/config.h.in @@ -270,6 +270,9 @@ /* Define to 1 if you have the header file. */ #undef HAVE_LINUX_RANDOM_H +/* Define to 1 if you have the header file. */ +#undef HAVE_LINUX_TYPES_H + /* Define to 1 if you have the header file. */ #undef HAVE_LOCALE_H diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure index f5798d3..69845f4 100755 --- a/libstdc++-v3/configure +++ b/libstdc++-v3/configure @@ -21799,7 +21799,7 @@ for ac_header in endian.h execinfo.h float.h fp.h ieeefp.h inttypes.h \ locale.h machine/endian.h machine/param.h nan.h stdint.h stdlib.h string.h \ strings.h sys/ipc.h sys/isa_defs.h sys/machine.h sys/param.h \ sys/resource.h sys/sem.h sys/stat.h sys/time.h sys/types.h unistd.h \ -wchar.h wctype.h linux/random.h +wchar.h wctype.h linux/types.h do : as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh` ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default" @@ -21814,6 +21814,23 @@ fi done +for ac_header in linux/random.h +do : + ac_fn_c_check_header_compile "$LINENO" "linux/random.h" "ac_cv_header_linux_random_h" "#ifdef HAVE_LINUX_TYPES_H +# include +#endif + +" +if test "x$ac_cv_header_linux_random_h" = x""yes; then : + cat >>confdefs.h <<_ACEOF +#define HAVE_LINUX_RANDOM_H 1 +_ACEOF + +fi + +done + + # Only do link tests if native. Else, hardcode. if $GLIBCXX_IS_NATIVE; then diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac index c90c509..e110f57 100644 --- a/libstdc++-v3/configure.ac +++ b/libstdc++-v3/configure.ac @@ -237,7 +237,13 @@ AC_CHECK_HEADERS([endian.h execinfo.h float.h fp.h ieeefp.h inttypes.h \ locale.h machine/endian.h machine/param.h nan.h stdint.h stdlib.h string.h \ strings.h sys/ipc.h sys/isa_defs.h sys/machine.h sys/param.h \ sys/resource.h sys/sem.h sys/stat.h sys/time.h sys/types.h unistd.h \ -wchar.h wctype.h linux/random.h]) +wchar.h wctype.h linux/types.h]) + +AC_CHECK_HEADERS([linux/random.h], [], [], +[[#ifdef HAVE_LINUX_TYPES_H +# include +#endif +]]) # Only do link tests if native. Else, hardcode. if $GLIBCXX_IS_NATIVE; then diff --git a/libstdc++-v3/src/c++11/random.cc b/libstdc++-v3/src/c++11/random.cc index 415f794..19daf6d 100644 --- a/libstdc++-v3/src/c++11/random.cc +++ b/libstdc++-v3/src/c++11/random.cc @@ -42,6 +42,10 @@ # include #endif +#ifdef _GLIBCXX_HAVE_LINUX_TYPES_H +# include +#endif + #ifdef _GLIBCXX_HAVE_LINUX_RANDOM_H # include #endif
Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)
On 2018-01-18 12:51 PM, Jakub Jelinek wrote: On Thu, Jan 18, 2018 at 08:10:50AM -0500, John David Anglin wrote: On 2018-01-17 3:07 PM, Jakub Jelinek wrote: PR83882 complains that PR81715 testcase fails on callee copies parameter targets. The following patch ought to fix that, but I have only bootstrapped/regtested it on x86_64-linux and i686-linux + on the testcase with hppa. John, do you think you could test this on hppa without the callee copies default change? On hppa2.0w-hp-hpux11.11, pr81715.C now passes. However, there are some new fails in the gcc and g++ suites: That is really weird, at least in the cross-compiler on throw-1.C I don't see with -mno-caller-copies the new code triggering at all. Tried both hppa-linux and hppa-hpux crosses. Furthermore, throw-1.C has: /* { dg-skip-if "Stack alignment is too small" { hppa*-*-hpux* } } */ Sorry, these fails are my fault. I tweaked the skip to just hppa*64*. Nominally, 32-bit hpux has 64 byte stack alignment but maybe alignment in signal frames is insufficient. so I wonder how it comes you see them run on hpux at all. I don't have hpux or hppa-linux fenv.h, so can't check pr52451.c. Will have to investigate. I see they predate your patch. What kind of error would be on pr83487-*? Digging into log, I see: spawn /test/gnu/gcc/objdir/gcc/xgcc -B/test/gnu/gcc/objdir/gcc/ -fno-diagnostics-show-caret -fdiagnostics-color=never -DSKIP_DECIMAL_FLOAT -c -o c_compat_y_tst.o /test/gnu/gcc/gcc/gcc/testsuite/gcc.dg/compat//pr83487-1_y.c /test/gnu/gcc/gcc/gcc/testsuite/gcc.dg/compat//pr83487-1_y.c:27:1: warning: alignment (16) for a exceeds maximum alignment for global common data. Using 8 output is: /test/gnu/gcc/gcc/gcc/testsuite/gcc.dg/compat//pr83487-1_y.c:27:1: warning: alignment (16) for a exceeds maximum alignment for global common data. Using 8 FAIL: gcc.dg/compat/pr83487-1 c_compat_y_tst.o compile The typical fix is to add "-fno-common" but I'm not sure where it would go in this case. So, I don't see any regressions from you change. Thanks, Dave -- John David Anglin dave.ang...@bell.net
Re: [PATCH, libstdc++]: Fix configure check for linux/random.h
On 18/01/18 20:31 +0100, Uros Bizjak wrote: Hello! linux/random.h uses __u32 types, which are defined through inclusion of linux/types.h header. On some older systems (e.g. CentOS 5.11) linux/random.h doesn't include linux/types.h by itself, and configure declares found linux/random.h as unusable. Attached patch fixes detection by including linux/types.h in the configure check. 2018-01-18 Uros Bizjak * configure.ac (AC_CHECK_HEADERS): Add linux/types.h. Conditionally include linux/types.h when checking linux/random.h header. * config.h.in: Regenerate. * configure: Ditto. * src/c++11/random.cc: Conditionally include linux/types.h. Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32} CentOS 5.11 and Fedora 26. OK for mainline? OK, thanks.
Re: Go patch committed: Enable escape analysis for runtime
Hi Ian, > This patch to the Go frontend by Cherry Zhang enables escape analysis > for the runtime package in the Go frontend. The runtime package was > hard-coded non-escape, and the escape analysis was not run for the > runtime package. This patch removes the hard-code, and lets the > escape analysis decide. It is not allowed for local variables and > closures in the runtime to be heap allocated. This patch adds the > check that make sure that they indeed do not escape. > > The escape analysis is always run when compiling the runtime now. > > Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed > to mainline. this broke Solaris bootstrap: /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:182:57: error: n escapes to heap, not allowed in runtime if port_getn(portfd, &events[0], uint32(len(events)), &n, wait) < 0 { ^ /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:176:11: error: zero escapes to heap, not allowed in runtime wait = &zero ^ /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:174:6: error: .runtime.zero escapes to heap, not allowed in runtime var zero timespec ^ /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:179:6: error: .runtime.events escapes to heap, not allowed in runtime var events [128]portevent ^ /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:181:6: error: .runtime.n escapes to heap, not allowed in runtime var n uint32 = 1 ^ Fixed as follows. i386-pc-solaris2.11 and sparc-sun-solaris2.11 builds have completed, make check still running. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University diff --git a/libgo/go/runtime/netpoll_solaris.go b/libgo/go/runtime/netpoll_solaris.go --- a/libgo/go/runtime/netpoll_solaris.go +++ b/libgo/go/runtime/netpoll_solaris.go @@ -84,6 +84,7 @@ func port_associate(port, source int32, //extern port_dissociate func port_dissociate(port, source int32, object uintptr) int32 +//go:noescape //extern port_getn func port_getn(port int32, evs *portevent, max uint32, nget *uint32, timeout *timespec) int32
C++ PATCH for c++/82461, member list-initialization and -std=c++17
In C++17 mode, build_special_member_call builds up an initializer for the object and then uses it directly rather than pass it to the copy constructor. This pattern results in a TARGET_EXPR showing up in potential_constant_expression, which complains about a TARGET_EXPR indicating a temporary of non-literal type, which it isn't in this case because it's being used to initialize a non-temporary object. This patch recognizes that case by checking TARGET_EXPR_DIRECT_INIT_P. Tested x86_64-pc-linux-gnu, applying to trunk. commit 9084980615c1aa29277fee9fe33a0b52bbc0ab94 Author: Jason Merrill Date: Wed Jan 17 17:24:16 2018 -0500 PR c++/82461 - constexpr list-initialized member * constexpr.c (potential_constant_expression_1): Check TARGET_EXPR_DIRECT_INIT_P. diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c index 8984613aa41..9a548d29bbc 100644 --- a/gcc/cp/constexpr.c +++ b/gcc/cp/constexpr.c @@ -5726,7 +5726,8 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now, return RECUR (TREE_OPERAND (t, 1), want_rval); case TARGET_EXPR: - if (!literal_type_p (TREE_TYPE (t))) + if (!TARGET_EXPR_DIRECT_INIT_P (t) + && !literal_type_p (TREE_TYPE (t))) { if (flags & tf_error) { diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-list2.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-list2.C new file mode 100644 index 000..780a64dbbc4 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-list2.C @@ -0,0 +1,20 @@ +// PR c++/82461 +// { dg-do compile { target c++11 } } + +class A { +private: +public: + constexpr A() {} + ~A() {} +}; + +class B { +private: + A a; +public: + constexpr B() : a{} {} +// works +// constexpr B() : a() {} + + ~B() {} +};
Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)
On 2018-01-18 2:44 PM, John David Anglin wrote: I don't have hpux or hppa-linux fenv.h, so can't check pr52451.c. Will have to investigate. I see they predate your patch. This is a test issue. The long double support on hpux is a software emulation and it's not supported the fenv.h routines. Is this change okay? Dave -- John David Anglin dave.ang...@bell.net 2018-01-18 John David Anglin * gcc.dg/torture/pr52451.c (main): Skip long double test on hppa*-*-hpux*. Index: gcc.dg/torture/pr52451.c === --- gcc.dg/torture/pr52451.c(revision 256744) +++ gcc.dg/torture/pr52451.c(working copy) @@ -49,7 +49,9 @@ TEST (float, f); TEST (double, ); +#if !defined(__hppa__) || !defined(__hpux__) TEST (long double, l); +#endif return 0; }
Re: [PATCH] Add clobbers for callee copied argument temporaries (PR sanitizer/81715, PR testsuite/83882)
On Thu, Jan 18, 2018 at 03:31:38PM -0500, John David Anglin wrote: > On 2018-01-18 2:44 PM, John David Anglin wrote: > > > I don't have hpux or hppa-linux fenv.h, so can't check pr52451.c. > > Will have to investigate. I see they predate your patch. > This is a test issue. The long double support on hpux is a software > emulation > and it's not supported the fenv.h routines. I think this explanation belongs into the testcase as a comment to explain why you aren't testing it on hppa-hpux. Ok with that change. > 2018-01-18 John David Anglin > > * gcc.dg/torture/pr52451.c (main): Skip long double test on > hppa*-*-hpux*. > > Index: gcc.dg/torture/pr52451.c > === > --- gcc.dg/torture/pr52451.c (revision 256744) > +++ gcc.dg/torture/pr52451.c (working copy) > @@ -49,7 +49,9 @@ > >TEST (float, f); >TEST (double, ); > +#if !defined(__hppa__) || !defined(__hpux__) >TEST (long double, l); > +#endif > >return 0; > } Jakub
[C++ Patch, obvious] Fix comment typo
Hi, I noticed the below, I'm committing the fix it as obvious. Cheers, Paolo. / 2018-01-18 Paolo Carlini * cp-tree.h: Fix comment typo (DECL_NON_TRIVIALLY_INITIALIZED_P vs DECL_NONTRIVIALLY_INITIALIZED_P). Index: cp-tree.h === --- cp-tree.h (revision 256864) +++ cp-tree.h (working copy) @@ -407,7 +407,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX]; ICS_BAD_FLAG (in _CONV) FN_TRY_BLOCK_P (in TRY_BLOCK) BIND_EXPR_BODY_BLOCK (in BIND_EXPR) - DECL_NON_TRIVIALLY_INITIALIZED_P (in VAR_DECL) + DECL_NONTRIVIALLY_INITIALIZED_P (in VAR_DECL) CALL_EXPR_ORDERED_ARGS (in CALL_EXPR, AGGR_INIT_EXPR) DECLTYPE_FOR_REF_CAPTURE (in DECLTYPE_TYPE) CONSTRUCTOR_C99_COMPOUND_LITERAL (in CONSTRUCTOR)
Re: std::vector default default and move constructors
On 16/01/2018 01:20, Jonathan Wakely wrote: On 15/01/18 22:32 +0100, François Dumont wrote: On 15/01/2018 13:29, Jonathan Wakely wrote: In fact it introduces a serious regression because of this line: - vector(vector&& __x) noexcept - : _Base(std::move(__x)) { } + vector(vector&&) = default; Consider what happens if we have an allocator that is not nothrow-move-constructible, which can happen if the allocator has a noexcept(false) move constructor, or more likely just has no move constructor but has a noexcept(false) copy constructor: One of the move constructors needs to be defined with a function body, not defaulted, so that it can also be declared noexcept Here is an updated patch with this change. Do you also want an explicit noexcept on the vector(vector&&) default declaration ? Ok to commit this vector patch ? I was going to propose a similar patch for vector but checking Standard there is not this noexcept qualification on the move constructor. François diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h index acec501..a6282ec 100644 --- a/libstdc++-v3/include/bits/stl_vector.h +++ b/libstdc++-v3/include/bits/stl_vector.h @@ -85,34 +85,56 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER typedef typename __gnu_cxx::__alloc_traits<_Tp_alloc_type>::pointer pointer; - struct _Vector_impl - : public _Tp_alloc_type + struct _Vector_impl_data { pointer _M_start; pointer _M_finish; pointer _M_end_of_storage; - _Vector_impl() - : _Tp_alloc_type(), _M_start(), _M_finish(), _M_end_of_storage() - { } - - _Vector_impl(_Tp_alloc_type const& __a) _GLIBCXX_NOEXCEPT - : _Tp_alloc_type(__a), _M_start(), _M_finish(), _M_end_of_storage() + _Vector_impl_data() _GLIBCXX_NOEXCEPT + : _M_start(), _M_finish(), _M_end_of_storage() { } #if __cplusplus >= 201103L - _Vector_impl(_Tp_alloc_type&& __a) noexcept - : _Tp_alloc_type(std::move(__a)), - _M_start(), _M_finish(), _M_end_of_storage() - { } + _Vector_impl_data(_Vector_impl_data&& __x) noexcept + : _M_start(__x._M_start), _M_finish(__x._M_finish), + _M_end_of_storage(__x._M_end_of_storage) + { __x._M_start = __x._M_finish = __x._M_end_of_storage = pointer(); } #endif - void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT + void + _M_swap_data(_Vector_impl_data& __x) _GLIBCXX_NOEXCEPT { std::swap(_M_start, __x._M_start); std::swap(_M_finish, __x._M_finish); std::swap(_M_end_of_storage, __x._M_end_of_storage); } + }; + + struct _Vector_impl + : public _Tp_alloc_type, public _Vector_impl_data + { + _Vector_impl() _GLIBCXX_NOEXCEPT_IF( noexcept(_Tp_alloc_type()) ) + : _Tp_alloc_type() + { } + + _Vector_impl(_Tp_alloc_type const& __a) _GLIBCXX_NOEXCEPT + : _Tp_alloc_type(__a) + { } + +#if __cplusplus >= 201103L + _Vector_impl(_Vector_impl&& __x) noexcept + : _Tp_alloc_type(std::move(__x)), _Vector_impl_data(std::move(__x)) + { } + + _Vector_impl(_Tp_alloc_type&& __a) noexcept + : _Tp_alloc_type(std::move(__a)) + { } + + _Vector_impl(_Tp_alloc_type&& __a, _Vector_impl&& __rv) noexcept + : _Tp_alloc_type(std::move(__a)), _Vector_impl_data(std::move(__rv)) + { } +#endif #if _GLIBCXX_SANITIZE_STD_ALLOCATOR && _GLIBCXX_SANITIZE_VECTOR template @@ -235,38 +257,42 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER _Tp_alloc_type& _M_get_Tp_allocator() _GLIBCXX_NOEXCEPT - { return *static_cast<_Tp_alloc_type*>(&this->_M_impl); } + { return this->_M_impl; } const _Tp_alloc_type& _M_get_Tp_allocator() const _GLIBCXX_NOEXCEPT - { return *static_cast(&this->_M_impl); } + { return this->_M_impl; } allocator_type get_allocator() const _GLIBCXX_NOEXCEPT { return allocator_type(_M_get_Tp_allocator()); } - _Vector_base() - : _M_impl() { } +#if __cplusplus >= 201103L + _Vector_base() = default; +#else + _Vector_base() { } +#endif _Vector_base(const allocator_type& __a) _GLIBCXX_NOEXCEPT : _M_impl(__a) { } +#if !_GLIBCXX_INLINE_VERSION _Vector_base(size_t __n) : _M_impl() { _M_create_storage(__n); } +#endif _Vector_base(size_t __n, const allocator_type& __a) : _M_impl(__a) { _M_create_storage(__n); } #if __cplusplus >= 201103L + _Vector_base(_Vector_base&&) = default; + +# if !_GLIBCXX_INLINE_VERSION _Vector_base(_Tp_alloc_type&& __a) noexcept : _M_impl(std::move(__a)) { } - _Vector_base(_Vector_base&& __x) noexcept - : _M_impl(std::move(__x._M_get_Tp_allocator())) - { this->_M_impl._M_swap_data(__x._M_impl); } - _Vector_base(_Vector_base&& __x, const allocator_type& __a) : _M_impl(__a) { @@ -278,6 +304,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER _M_create_storage(__n); } } +# endif + + _Vector_base(const allocator_type& __a, _Vector_base&& __x) + : _M_impl(_Tp_alloc_type(__a), std::move(__x._M_impl)) +
Re: C++ PATCH for c++/83714, ICE checking return from template
On Wed, Jan 17, 2018 at 4:31 PM, Jason Merrill wrote: > On Wed, Jan 17, 2018 at 3:46 PM, Paolo Carlini > wrote: >> Hi Jason, >> >> On 17/01/2018 00:04, Jason Merrill wrote: >>> >>> Like my recent patch for 83186, we were missing a >>> build_non_dependent_expr. >>> >>> Tested x86_64-pc-linux-gnu, applying to trunk. >> >> Lately I'm seeing (H.J. Lu too) a regression: >> >> FAIL: g++.dg/template/inherit4.C -std=c++11 (test for excess errors) >> FAIL: g++.dg/template/inherit4.C -std=c++14 (test for excess errors) >> >> which seems related to this change of yours: if I comment out the new >> build_non_dependent_expr call the test is accepted again. Could you please >> have a look? > > Hmm, wonder why I didn't see that in my testing. Checking a fix now. We were failing to treat A::foo_ as an instantiation-dependent SCOPE_REF, which it is because of the dependent base B. Tested x86_64-pc-linux-gnu, applying to trunk. commit 1579ec4777b69f2fd55ce1b3e425d6222f1f015a Author: Jason Merrill Date: Wed Jan 17 16:29:45 2018 -0500 Fix template/inherit4.C. PR c++/83714 * search.c (any_dependent_bases_p): Handle null TREE_BINFO. * pt.c (instantiation_dependent_scope_ref_p): True if any_dependent_bases_p. diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 3d7d45864c6..0296845a31b 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -23952,6 +23952,9 @@ instantiation_dependent_scope_ref_p (tree t) { if (DECL_P (TREE_OPERAND (t, 1)) && CLASS_TYPE_P (TREE_OPERAND (t, 0)) + /* A dependent base could make a member inaccessible in the current +class. */ + && !any_dependent_bases_p () && accessible_in_template_p (TREE_OPERAND (t, 0), TREE_OPERAND (t, 1))) return false; diff --git a/gcc/cp/search.c b/gcc/cp/search.c index 477e9aef105..920fc15468a 100644 --- a/gcc/cp/search.c +++ b/gcc/cp/search.c @@ -2612,6 +2612,13 @@ any_dependent_bases_p (tree type) if (!type || !CLASS_TYPE_P (type) || !processing_template_decl) return false; + /* If we haven't set TYPE_BINFO yet, we don't know anything about the bases. + Return false because in this situation we aren't actually looking up names + in the scope of the class, so it doesn't matter whether it has dependent + bases. */ + if (!TYPE_BINFO (type)) +return false; + unsigned i; tree base_binfo; FOR_EACH_VEC_SAFE_ELT (BINFO_BASE_BINFOS (TYPE_BINFO (type)), i, base_binfo) diff --git a/gcc/testsuite/g++.dg/torture/pr83619.C b/gcc/testsuite/g++.dg/torture/pr83619.C index a5ca5372f18..5afd3ce0790 100644 --- a/gcc/testsuite/g++.dg/torture/pr83619.C +++ b/gcc/testsuite/g++.dg/torture/pr83619.C @@ -22,7 +22,7 @@ public: static void c (e *g) { - g->c (); + g->c (); // { dg-message "incomplete" } } }; };
C++ PATCH to fix bogus error with constexpr and empty class (PR c++/81933)
The problem in this PR is that we get error: constexpr call flows off the end of the function for a valid testcase, but not in C++17 mode. That's because in C++17 we execute: 4321 if (cxx_dialect >= cxx17 && !BINFO_VIRTUAL_P (binfo)) 4322 { 4323 tree decl = build_base_field_1 (t, basetype, next_field); 4324 DECL_FIELD_OFFSET (decl) = BINFO_OFFSET (binfo); 4325 DECL_FIELD_BIT_OFFSET (decl) = bitsize_zero_node; 4326 SET_DECL_OFFSET_ALIGN (decl, BITS_PER_UNIT); 4327 } which puts some additional fields into some classes. This makes a difference for split_nonconstant_init: 751 if (split_nonconstant_init_1 (dest, init)) 752 init = NULL_TREE; where split_nonconstant_init_1 is true so that means that the DECL_INITIAL will be null. Why is split_nonconstant_init_1 true? Because it ends up calling complete_ctor_at_level_p with 6114 return count_type_elements (type, true) == num_elts; but the count_type_elements result differs between c++14/c++17 precisely because of the fields added in build_base_field. But we should only add those fields when initializing aggregates with bases, which is not what's happening in this testcase. With DECL_INITIAL being null, we error here (result is NULL): 1702 result = *ctx->values->get (slot ? slot : res); 1703 if (result == NULL_TREE && !*non_constant_p) 1704 { 1705 if (!ctx->quiet) 1706 error ("% call flows off the end " 1707"of the function"); 1708 *non_constant_p = true; 1709 } I think the problem is that we're dealing with an empty class here, for which we should just generate an empty ctor, as in few other spots. Bootstrapped/regtested on x86_64-linux, ok for trunk? 2018-01-18 Marek Polacek PR c++/81933 * constexpr.c (cxx_eval_call_expression): When evaluating a call whose result's type is an empty class, return empty constructor. * g++.dg/cpp1y/constexpr-empty4.C: New test. diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c index 8984613aa41..0ff74b98437 100644 --- gcc/cp/constexpr.c +++ gcc/cp/constexpr.c @@ -1701,7 +1701,15 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t, result = void_node; else { - result = *ctx->values->get (slot ? slot : res); + tree retval = slot ? slot : res; + /* If the class is empty, we aren't actually loading anything. */ + if (is_really_empty_class (TREE_TYPE (retval))) + { + result = build_constructor (TREE_TYPE (retval), NULL); + TREE_CONSTANT (result) = true; + } + else + result = *ctx->values->get (retval); if (result == NULL_TREE && !*non_constant_p) { if (!ctx->quiet) diff --git gcc/testsuite/g++.dg/cpp1y/constexpr-empty4.C gcc/testsuite/g++.dg/cpp1y/constexpr-empty4.C index e69de29bb2d..059220f8268 100644 --- gcc/testsuite/g++.dg/cpp1y/constexpr-empty4.C +++ gcc/testsuite/g++.dg/cpp1y/constexpr-empty4.C @@ -0,0 +1,51 @@ +// PR c++/81933 +// { dg-do compile { target c++14 } } + +namespace std { +template struct __decay_and_strip { typedef _Tp __type; }; +template struct enable_if { typedef int type; }; +template struct _Head_base { + constexpr _Head_base(_Head) {} +}; +template struct _Tuple_impl; +template +struct _Tuple_impl<_Idx, _Head, _Tail...> : _Tuple_impl<1, _Tail...>, // { dg-warning "direct base" } +_Head_base<_Head> { + typedef _Tuple_impl<1, _Tail...> _Inherited; + typedef _Head_base<_Head> _Base; + constexpr _Tuple_impl(_Head __head, _Tail... __tail) + : _Inherited(__tail...), _Base(__head) {} + _Tuple_impl(const _Tuple_impl &) = default; + _Tuple_impl(_Tuple_impl &&); +}; +template +struct _Tuple_impl<_Idx, _Head> : _Head_base<_Head> { + typedef _Head_base<_Head> _Base; + constexpr _Tuple_impl(_Head __head) : _Base(__head) {} +}; +template struct _TC { + static constexpr bool _NotSameTuple() { return true; } +}; +template class tuple : _Tuple_impl<0, _Elements...> { + typedef _Tuple_impl<0, _Elements...> _Inherited; + +public: + template ::_NotSameTuple()>::type = false> + constexpr tuple(_UElements... __elements) : _Inherited(__elements...) {} + tuple(const tuple &) = default; +}; +template +constexpr tuple::__type...> +make_tuple(_Elements... __args) { + typedef tuple::__type...> __result_type; + return __result_type(__args...); +} +} +struct any_udt {}; +template constexpr auto flatten(Tuples... tuples) { + auto all = std::make_tuple(tuples...); + auto flat(all); + return flat; +} +constexpr auto fail = flatten(any_udt{}, any_udt{}); Marek
Re: [PATCH, rs6000] Add 128-bit support for vec_xl(), vec_xl_be(), vec_xst(), vec_xst_be() builtins.
Hi Carl, On Thu, Jan 18, 2018 at 08:06:17AM -0800, Carl Love wrote: > The following patch adds missing 128-bit support for the builtins > vec_xl(), vec_xl_be(), vec_xst(), vec_xst_be(). It also includes a bug > fix required for the new 128-bit arguments for the vec_xst_be() and > vec_xl_be() builtins. New test cases are also included. This patch > completes the tests of all the various load/store builtins that I have > been working on. This patch adds a torture test for the various load > store tests to check that they work for -O0, -O1 and -O2 testing. This > was done as the work on the load/store builtins found numerous issues > that were optimization level dependent which were fixed by the previous > commits as well as this commit. > Let me know if the patch looks OK or not. Let me know if you want to > include it in stage 4 or wait for the next release. Thanks. Since this is last I think we should include it now; it isn't likely to regress anything, either. But, some comments, and a real bug: > 2018-01-18 Carl Love > * gcc.target/powerpc/powerpc.exp: Add torture tests for >builtins-4-runnable.c, builtins-6-runnable.c, >builtins-5-p9-runnable.c, builtins-6-p9-runnable.c. The indent here is wrong (stray space). > --- a/gcc/config/rs6000/rs6000.c > +++ b/gcc/config/rs6000/rs6000.c > @@ -15572,6 +15572,12 @@ altivec_expand_builtin (tree exp, rtx target, bool > *expandedp) > unaligned-supporting store, so use a generic expander. For > little-endian, the exact element-reversing instruction must > be used. */ > + case VSX_BUILTIN_ST_ELEMREV_V1TI: > + { > + enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v1ti > + : CODE_FOR_vsx_st_elemrev_v1ti); > + return altivec_expand_stv_builtin (code, exp); > + } Last line has a space too many. Or, actually, all the rest have one too few? > --- a/gcc/config/rs6000/vsx.md > +++ b/gcc/config/rs6000/vsx.md > @@ -1093,6 +1093,18 @@ (define_insn "vsx_ld_elemrev_v2di" > "lxvd2x %x0,%y1" > [(set_attr "type" "vecload")]) > > +(define_insn "vsx_ld_elemrev_v1ti" > + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa") > +(vec_select:V1TI > + (match_operand:V1TI 1 "memory_operand" "Z") > + (parallel [(const_int 0)])))] > + "VECTOR_MEM_VSX_P (V1TImode) && !BYTES_BIG_ENDIAN" > +{ > + return "lxvd2x %x0,%y1; xxpermdi %x0,%x0,%x0,2"; > +} Not ; but \; please. No space after it. We currently have exactly as many xxpermdi,2 as xxswapdi (147 each) but the latter is more readable, please prefer that. > +(define_insn "vsx_st_elemrev_v1ti" > + [(set (match_operand:V1TI 0 "memory_operand" "=Z") > +(vec_select:V1TI > + (match_operand:V1TI 1 "vsx_register_operand" "wa") > + (parallel [(const_int 0)])))] > + "VECTOR_MEM_VSX_P (V2DImode) && !BYTES_BIG_ENDIAN" > +{ > + return "xxpermdi %x1,%x1,%x1,2; stxvd2x %x1,%y0"; > +} This is wrong: it changes operand 1 but the RTL does not mention that. > --- a/gcc/testsuite/gcc.target/powerpc/powerpc.exp > +++ b/gcc/testsuite/gcc.target/powerpc/powerpc.exp > @@ -49,4 +49,16 @@ gcc-dg-runtest [list $srcdir/$subdir/savres.c] "" $alti > > # All done. > torture-finish > + > +torture-init > +# Test load/store builtins at all optimizations > +set-torture-options [list -O0 -O1 -O2] All? -O3 and -Os as well? (And maybe more, but those are important). > +gcc-dg-runtest [list $srcdir/$subdir/builtins-4-runnable.c \ > + $srcdir/$subdir/builtins-6-runnable.c \ > + $srcdir/$subdir/builtins-5-p9-runnable.c \ > + $srcdir/$subdir/builtins-6-p9-runnable.c] "" > $DEFAULT_CFLAGS Weird indent on this line. Rest looks fine :-) Segher
[C++ Patch] PR 83921 ("[7/8 Regression] GCC rejects constexpr initialization of empty aggregate")
Hi, I'm finishing testing on x86_64-linux the below - which anyway seems very unlikely to cause regressions because we aren't really stress testing the relevant checks in potential_constant_expression_1 much, if at all (surely stmtexpr19.C tests static). Anyway, the issue is the following. In 239268 aka "Implement C++17 constexpr lambda" Jason added some checks to potential_constant_expression_1 covering static, thread_local and uninitialized var declaration in constexpr function context. Then extended to constexpr context more generally in 249382 aka "constexpr and static var in statement-expression", with ext/stmtexpr19.C covering the static case. Now, it looks like the check for uninitialized vars in constexpr functions context is more correctly carried out by check_for_uninitialized_const_var instead, because the simple check in potential_constant_expression_1 as-is causes the regression pointed out by this bug. Thus the fix below which just restricts the check in potential_constant_expression_1, and the testcases, one for this bug proper, plus one, very similar to stmtexpr19.C, double checking that we are still diagnosing in the statement-expression context. I also verified under the debugger how for constexpr-83921.C we are actually running check_for_uninitialized_const_var on 'f' - which obviously passes. Thanks, Paolo. /cp 2018-01-18 Paolo Carlini PR c++/83921 * constexpr.c (potential_constant_expression_1): Do not handle uninitialized vars in constexpr function here. /testsuite 2018-01-18 Paolo Carlini PR c++/83921 * g++.dg/cpp1y/constexpr-83921.C: New. * g++.dg/ext/stmtexpr20.C: Likewise. Index: cp/constexpr.c === --- cp/constexpr.c (revision 256865) +++ cp/constexpr.c (working copy) @@ -5708,7 +5708,9 @@ potential_constant_expression_1 (tree t, bool want "% in % context", tmp); return false; } - else if (!DECL_NONTRIVIALLY_INITIALIZED_P (tmp)) + else if (!DECL_NONTRIVIALLY_INITIALIZED_P (tmp) + /* Handled in check_for_uninitialized_const_var. */ + && !var_in_constexpr_fn (tmp)) { if (flags & tf_error) error_at (DECL_SOURCE_LOCATION (tmp), "uninitialized " Index: testsuite/g++.dg/cpp1y/constexpr-83921.C === --- testsuite/g++.dg/cpp1y/constexpr-83921.C(nonexistent) +++ testsuite/g++.dg/cpp1y/constexpr-83921.C(working copy) @@ -0,0 +1,7 @@ +// PR c++/83921 +// { dg-do compile { target c++14 } } + +struct Foo {}; +constexpr void test() { + Foo f; +} Index: testsuite/g++.dg/ext/stmtexpr20.C === --- testsuite/g++.dg/ext/stmtexpr20.C (nonexistent) +++ testsuite/g++.dg/ext/stmtexpr20.C (working copy) @@ -0,0 +1,15 @@ +// PR c++/83921 +// { dg-options "" } +// { dg-do compile { target c++11 } } + +struct test { const int *addr; }; + +const test* setup() +{ + static constexpr test atest = +{ + ({ int inner; &inner; }) // { dg-error "uninitialized" } +}; + + return &atest; +}
[PATCH] Fix pr83619.C (was Re: Fix ICE with profile info mismatch)
On Thu, Jan 18, 2018 at 04:59:01PM +0100, Jan Hubicka wrote: > this patch ICE where the profile in cgraph mismatch profile in BB. This is > becuase > of expansion of speculative devirtualization where we get some roundoff > issues. > > Bootstrapped/regtested x86_64-linux, comitted. > Honza > > PR ipa/83619 > * g++.dg/torture/pr83619.C: New testcase. > * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Update edge > frequencies. > --- testsuite/g++.dg/torture/pr83619.C(revision 0) > +++ testsuite/g++.dg/torture/pr83619.C(working copy) ... This testcase FAILs everywhere: /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C: In static member function 'static void i::j< >::c(e*)': /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:25:8: warning: invalid use of incomplete type 'class e' /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:8:7: note: forward declaration of 'class e' FAIL: g++.dg/torture/pr83619.C -O0 (test for excess errors) Excess errors: /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:25:8: warning: invalid use of incomplete type 'class e' The following patch tweaks it so that it doesn't emit the warning, yet still ICEs before your cgraph.c change and PASSes after it. Tested on x86_64-linux and i686-linux, ok for trunk? 2018-01-18 Jakub Jelinek PR ipa/83619 * g++.dg/torture/pr83619.C (e): Define before first use instead of forward declaration. --- gcc/testsuite/g++.dg/torture/pr83619.C.jj 2018-01-18 21:11:54.865206861 +0100 +++ gcc/testsuite/g++.dg/torture/pr83619.C 2018-01-18 23:51:32.693482293 +0100 @@ -5,7 +5,9 @@ class d public: virtual unsigned c (); }; -class e; +class e : public d +{ +}; class i { void h (); @@ -33,9 +35,6 @@ public: l (int); k *operator-> (); }; -class e : public d -{ -}; class m final : e { unsigned c (); Jakub
[C++ PATCH] Fix ICE in potential_constant_expression_1 (PR c++/83918)
Hi! Before location wrappers were introduced, the potential_constant_expression_1 assumption that a VIEW_CONVERT_EXPR must have non-NULL type was right, if it is a type dependent reinterpret cast, it would be REINTERPRET_CAST_EXPR instead. Location wrappers around decls can have NULL type, if the decl they wrap also has NULL type. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Alternatively it could if (TREE_TYPE (t) && ... On a related note, I wonder if the return (RECUR (from, TREE_CODE (t) != VIEW_CONVERT_EXPR)); a few lines below is correct, VCE isn't always an lvalue, only if it is used in non-rvalue contexts. So, shouldn't it be instead: return (RECUR (from, (TREE_CODE (t) == VIEW_CONVERT_EXPR ? want_rval : rval))); Don't have a testcase where it would matter though. 2018-01-18 Jakub Jelinek PR c++/83918 * constexpr.c (potential_constant_expression_1) : Don't check for INTEGER_CST to pointer type conversion for location wrappers. * g++.dg/cpp1z/pr83918.C: New test. --- gcc/cp/constexpr.c.jj 2018-01-17 22:00:13.624228171 +0100 +++ gcc/cp/constexpr.c 2018-01-18 10:38:07.357469094 +0100 @@ -5385,7 +5385,8 @@ potential_constant_expression_1 (tree t, may change to something more specific to type-punning (DR 1312). */ { tree from = TREE_OPERAND (t, 0); - if (POINTER_TYPE_P (TREE_TYPE (t)) + if (!location_wrapper_p (t) + && POINTER_TYPE_P (TREE_TYPE (t)) && TREE_CODE (from) == INTEGER_CST && !integer_zerop (from)) { --- gcc/testsuite/g++.dg/cpp1z/pr83918.C.jj 2018-01-18 10:46:04.405439296 +0100 +++ gcc/testsuite/g++.dg/cpp1z/pr83918.C2018-01-18 10:45:42.934442290 +0100 @@ -0,0 +1,32 @@ +// PR c++/83918 +// { dg-do compile } +// { dg-options "-std=c++17" } + +constexpr unsigned +foo (unsigned x, unsigned y) +{ + return x > y ? x : y; +} + +template struct A; +template struct B; +template +struct A , B > +{ + enum : unsigned + { +u = foo (sizeof (S), sizeof (U)), +v = A , B >::w, +w = foo (u, v) + }; +}; + +template <> +struct A , B <>> +{ + enum : unsigned { w = 0 }; +}; + +constexpr static const auto v { A , + B <9,8,7,6,5,4,3,2,1>>::w }; +static_assert (v == sizeof (int)); Jakub
[C++ PATCH] Avoid spurious -Wignored-qualifiers warning on artificial C++17 P0138R2 cast (PR c++/83919)
Hi! These casts for P0138R2 isn't something the user typed in their code, so diagnosing -Wignored-qualifiers on these looks wrong. -Wuseless-cast has been handled similarly in the past already. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2018-01-18 Jakub Jelinek PR c++/83919 * typeck.c (convert_for_assignment): Suppress warn_ignored_qualifiers for direct enum init. * decl.c (reshape_init): Likewise. * g++.dg/cpp0x/pr83919.C: New test. --- gcc/cp/typeck.c.jj 2018-01-17 22:00:06.863228592 +0100 +++ gcc/cp/typeck.c 2018-01-18 10:58:39.976333499 +0100 @@ -8689,6 +8689,7 @@ convert_for_assignment (tree type, tree if (check_narrowing (ENUM_UNDERLYING_TYPE (type), elt, complain)) { warning_sentinel w (warn_useless_cast); + warning_sentinel w2 (warn_ignored_qualifiers); rhs = cp_build_c_cast (type, elt, complain); } else --- gcc/cp/decl.c.jj2018-01-18 00:41:40.564015137 +0100 +++ gcc/cp/decl.c 2018-01-18 11:10:10.579713182 +0100 @@ -6091,6 +6091,7 @@ reshape_init (tree type, tree init, tsub if (check_narrowing (ENUM_UNDERLYING_TYPE (type), elt, complain)) { warning_sentinel w (warn_useless_cast); + warning_sentinel w2 (warn_ignored_qualifiers); return cp_build_c_cast (type, elt, tf_warning_or_error); } else --- gcc/testsuite/g++.dg/cpp0x/pr83919.C.jj 2018-01-18 11:03:08.048295967 +0100 +++ gcc/testsuite/g++.dg/cpp0x/pr83919.C2018-01-18 10:59:57.465322655 +0100 @@ -0,0 +1,10 @@ +// PR c++/83919 +// { dg-do compile { target c++11 } } +// { dg-options "-Wignored-qualifiers" } + +enum class Conf; +struct foo +{ + foo (const Conf& conf) : x{conf} {} // { dg-bogus "type qualifiers ignored on cast result type" } + const Conf x; +}; Jakub
[PATCH] Fix unwind info in x86 interrupt functions (PR debug/83728)
Hi! Last summer i386 INCOMING_FRAME_SP_OFFSET macro has been changed, so that it is one word for most of the functions (as previously always), but 2 words for functions with interrupt attribute. Unfortunately this breaks the unwind info, as can be seen on the gcc.dg/guality/pr68037-1.c testcase. Our infrastructure assumes we have just one set of cfi instructions in the CIE (we can have multiple CIEs, but only for various flags, not for different cfi instructions), so if the first function being assembled is an interrupt function, we start with the 2*word initial offset in the CIE we emit (when -fno-dwarf2-cfi-asm), or just assume the CIE has that (with -fdwarf2-cfi-asm when it is GAS that emits it) even when it doesn't. For -fdwarf2-cfi-asm the effect is that until first CFA adjustment the offset is off by a word, after that correct. In both cases all CFA offsets are off by a word in non-interrupt functions. Or, if the first function doesn't have interrupt attribute, CFI in non-interrupt functions is correct, but interrupt functions are wrong. I've looked around and it seems stormy16 is the only other target that bases the INCOMING_FRAME_SP_OFFSET value not just on ABI changing switches, but on properties of the current function. For these 2 targets the following patch introduces another macro, DEFAULT_INCOMING_FRAME_SP_OFFSET, that is meant to be the same for all functions of the same ABI and where GAS supports .cfi_startproc it should also match what GAS does. For the hopefully minority of functions that need something else (i.e. interrupt functions) we emit a CFI instruction right at the start of the function. Bootstrapped/regtested on x86_64-linux and i686-linux, fixes the pr68037-1.c FAILs, ok for trunk? 2018-01-18 Jakub Jelinek PR debug/81570 PR debug/83728 * dwarf2cfi.c (DEFAULT_INCOMING_FRAME_SP_OFFSET): Define to INCOMING_FRAME_SP_OFFSET if not defined. (scan_trace): Add ENTRY argument. If true and DEFAULT_INCOMING_FRAME_SP_OFFSET != INCOMING_FRAME_SP_OFFSET, emit a note to adjust the CFA offset. (create_cfi_notes): Adjust scan_trace callers. (create_cie_data): Use DEFAULT_INCOMING_FRAME_SP_OFFSET rather than INCOMING_FRAME_SP_OFFSET in the CIE. * config/i386/i386.h (DEFAULT_INCOMING_FRAME_SP_OFFSET): Define. * config/stormy16/stormy16.h (DEFAULT_INCOMING_FRAME_SP_OFFSET): Likewise. * doc/tm.texi.in (DEFAULT_INCOMING_FRAME_SP_OFFSET): Document. * doc/tm.texi: Regenerated. --- gcc/dwarf2cfi.c.jj 2018-01-04 00:43:17.632703231 +0100 +++ gcc/dwarf2cfi.c 2018-01-18 15:57:21.165911020 +0100 @@ -52,6 +52,10 @@ along with GCC; see the file COPYING3. #ifndef INCOMING_RETURN_ADDR_RTX #define INCOMING_RETURN_ADDR_RTX (gcc_unreachable (), NULL_RTX) #endif + +#ifndef DEFAULT_INCOMING_FRAME_SP_OFFSET +#define DEFAULT_INCOMING_FRAME_SP_OFFSET INCOMING_FRAME_SP_OFFSET +#endif /* A collected description of an entire row of the abstract CFI table. */ struct GTY(()) dw_cfi_row @@ -2484,7 +2488,7 @@ scan_insn_after (rtx_insn *insn) instructions therein. */ static void -scan_trace (dw_trace_info *trace) +scan_trace (dw_trace_info *trace, bool entry) { rtx_insn *prev, *insn = trace->head; dw_cfa_location this_cfa; @@ -2503,6 +2507,17 @@ scan_trace (dw_trace_info *trace) this_cfa = cur_row->cfa; cur_cfa = &this_cfa; + /* If the current function starts with a non-standard incoming frame + sp offset, emit a note before the first instruction. */ + if (entry + && DEFAULT_INCOMING_FRAME_SP_OFFSET != INCOMING_FRAME_SP_OFFSET) +{ + add_cfi_insn = insn; + gcc_assert (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_DELETED); + this_cfa.offset = INCOMING_FRAME_SP_OFFSET; + def_cfa_1 (&this_cfa); +} + for (prev = insn, insn = NEXT_INSN (insn); insn; prev = insn, insn = NEXT_INSN (insn)) @@ -2671,12 +2686,12 @@ create_cfi_notes (void) /* Always begin at the entry trace. */ ti = &trace_info[0]; - scan_trace (ti); + scan_trace (ti, true); while (!trace_work_list.is_empty ()) { ti = trace_work_list.pop (); - scan_trace (ti); + scan_trace (ti, false); } queued_reg_saves.release (); @@ -2980,7 +2995,12 @@ create_cie_data (void) /* On entry, the Canonical Frame Address is at SP. */ memset (&loc, 0, sizeof (loc)); loc.reg = dw_stack_pointer_regnum; - loc.offset = INCOMING_FRAME_SP_OFFSET; + /* create_cie_data is called just once per TU, and when using .cfi_startproc + is even done by the assembler rather than the compiler. If the target + has different incoming frame sp offsets depending on what kind of + function it is, use a single constant offset for the target and + if needed, adjust before the first instruction in insn stream. */ + loc.offset = DEFAULT_INCOMING_FRAME_SP_OFFSET; def_cfa_1 (&loc); if (targetm.debug_unwind_i
[PATCH] -Warray-bounds: Fix false positive in some "switch" stmts (PR tree-optimization/83510)
PR tree-optimization/83510 reports that r255649 (for PR tree-optimization/83312) introduced a false positive for -Warray-bounds for array accesses within certain switch statements: those for which value-ranges allow more than one case to be reachable, but for which one or more of the VR-unreachable cases contain out-of-range array accesses. In the reproducer, after the switch in f is inlined into g, we have 3 cases for the switch (case 9, case 10-19, and default), within a loop that ranges from 0..9. With both the old and new code, vr_values::simplify_switch_using_ranges clears the EDGE_EXECUTABLE flag on the edge to the "case 10-19" block. This happens during the dom walk within the substitute_and_fold_engine. With the old code, the clearing of that EDGE_EXECUTABLE flag led to the /* Skip blocks that were found to be unreachable. */ code in the old implementation of vrp_prop::check_all_array_refs skipping the "case 10-19" block. With the new code, we have a second dom walk, and that dom_walker's ctor sets all edges to be EDGE_EXECUTABLE, losing that information. Then, dom_walker::before_dom_children (here, the subclass' check_array_bounds_dom_walker::before_dom_children) can return one edge, if there's a unique successor edge, and dom_walker::walk filters the dom walk to just that edge. Here we have two VR-valid edges (case 9 and default), and an VR-invalid successor edge (case 10-19). There's no *unique* valid successor edge, and hence taken_edge is NULL, and the filtering in dom_walker::walk doesn't fire. Hence we've lost the filtering of the "case 10-19" BB, hence the false positive. The issue is that we have two dom walks: first within vr_values' substitute_and_fold_dom_walker (which has skip_unreachable_blocks == false), then another within vrp_prop::check_all_array_refs (with skip_unreachable_blocks == true). Each has different "knowledge" about ruling out edges due to value-ranges, but we aren't combining that information. The former "knows" about out-edges at a particular control construct (e.g. at a switch), the latter "knows" about dominance, but only about unique successors (hence the problem when two out of three switch cases are valid). This patch combines the information by preserving the EDGE_EXECUTABLE flags from the first dom walk, and using it in the second dom walk, potentially rejecting additional edges. Doing so fixes the false positive. I attempted an alternative fix, merging the two dom walks into one, but that led to crashes in identify_jump_threads, so I went with this, as a less invasive fix. Successfully bootstrapped®rtested on x86_64-pc-linux-gnu. OK for trunk? gcc/ChangeLog: PR tree-optimization/83510 * domwalk.c (set_all_edges_as_executable): New function. (dom_walker::dom_walker): Add new param "preserve_flags". Move setup of edge flags to set_all_edges_as_executable and guard it with !preserve_flags. * domwalk.h (dom_walker::dom_walker): Add new param "preserve_flags", defaulting to false. (set_all_edges_as_executable): New decl. * tree-vrp.c (check_array_bounds_dom_walker::check_array_bounds_dom_walker): Pass "true" for new param of dom_walker's ctor. (vrp_prop::vrp_finalize): Call set_all_edges_as_executable if check_all_array_refs will be called. gcc/testsuite/ChangeLog: PR tree-optimization/83510 * gcc.c-torture/compile/pr83510.c: New test case. --- gcc/domwalk.c | 30 +++-- gcc/domwalk.h | 11 ++ gcc/testsuite/gcc.c-torture/compile/pr83510.c | 172 ++ gcc/tree-vrp.c| 21 +++- 4 files changed, 224 insertions(+), 10 deletions(-) create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr83510.c diff --git a/gcc/domwalk.c b/gcc/domwalk.c index 102a293..988ff71 100644 --- a/gcc/domwalk.c +++ b/gcc/domwalk.c @@ -169,12 +169,29 @@ sort_bbs_postorder (basic_block *bbs, int n) qsort (bbs, n, sizeof *bbs, cmp_bb_postorder); } +/* Set EDGE_EXECUTABLE on every edge within FN's CFG. */ + +void +set_all_edges_as_executable (function *fn) +{ + basic_block bb; + FOR_ALL_BB_FN (bb, fn) +{ + edge_iterator ei; + edge e; + FOR_EACH_EDGE (e, ei, bb->succs) + e->flags |= EDGE_EXECUTABLE; +} +} + /* Constructor for a dom walker. If SKIP_UNREACHBLE_BLOCKS is true, then we need to set - EDGE_EXECUTABLE on every edge in the CFG. */ + EDGE_EXECUTABLE on every edge in the CFG, unless + PRESERVE_FLAGS is true. */ dom_walker::dom_walker (cdi_direction direction, bool skip_unreachable_blocks, + bool preserve_flags, int *bb_index_to_rpo) : m_dom_direction (direction), m_skip_unreachable_blocks (skip_unreachable_blocks), @@ -200,14 +217,9 @@ dom_walker::dom_walker (cdi_direction direction, if (!m_skip_u
[C++ PATCH] Fix ICE in joust with -Wconversion (PR c++/81167)
Hi! As mentioned in the PR, we ICE on this testcase because w->fn is a conversion operator, w->convs[0]->type is a reference to a class type, but because that conversion is ck_ref_bind, source_type looks through it and finds ck_identity with the class type. Then we because w->fn is not a constructor do source = TREE_TYPE (source);, assuming we got a pointer type like on the other 5 testcases in check-c++-all with -Wconversion that cover this code, so source is NULL and we die in calling warning with bogus arguments. The following patch fixes it by only using TREE_TYPE on pointer/reference types. I must say I don't understand this fully, but conversion operators should be used on class types, so that is what we are looking for with source, right? No idea about the ! DECL_CONSTRUCTOR_P (w->fn) check though and nothing in the testsuite covers that. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2018-01-18 Jakub Jelinek PR c++/81167 * call.c (joust): Only use TREE_TYPE (source) if source is a POINTER_TYPE_P. * g++.dg/cpp0x/pr81167.C: New test. --- gcc/cp/call.c.jj2018-01-17 22:00:13.626228171 +0100 +++ gcc/cp/call.c 2018-01-18 21:01:51.596136303 +0100 @@ -10082,7 +10082,7 @@ joust (struct z_candidate *cand1, struct else if (warn) { tree source = source_type (w->convs[0]); - if (! DECL_CONSTRUCTOR_P (w->fn)) + if (! DECL_CONSTRUCTOR_P (w->fn) && POINTER_TYPE_P (source)) source = TREE_TYPE (source); if (warning (OPT_Wconversion, "choosing %qD over %qD", w->fn, l->fn) && warning (OPT_Wconversion, " for conversion from %qH to %qI", --- gcc/testsuite/g++.dg/cpp0x/pr81167.C.jj 2018-01-18 21:07:48.723189500 +0100 +++ gcc/testsuite/g++.dg/cpp0x/pr81167.C2018-01-18 21:07:18.610187374 +0100 @@ -0,0 +1,24 @@ +// PR c++/81167 +// { dg-do compile { target c++11 } } +// { dg-options "-Wconversion" } + +struct bar; + +struct foo +{ + foo () {} + foo (const bar &) {} +}; + +struct bar +{ + operator foo () && { return foo (); } +}; + +void test () +{ + foo f = bar (); +// { dg-warning "choosing 'bar::operator foo\\(\\) &&' over 'foo::foo\\(const bar&\\)'" "" { target *-*-* } .-1 } +// { dg-warning "for conversion from 'bar' to 'foo'" "" { target *-*-* } .-2 } +// { dg-message "because conversion sequence for the argument is better" "" { target *-*-* } .-3 } +} Jakub
Re: [PATCH] Fix pr83619.C (was Re: Fix ICE with profile info mismatch)
On 01/18/2018 03:57 PM, Jakub Jelinek wrote: > On Thu, Jan 18, 2018 at 04:59:01PM +0100, Jan Hubicka wrote: >> this patch ICE where the profile in cgraph mismatch profile in BB. This is >> becuase >> of expansion of speculative devirtualization where we get some roundoff >> issues. >> >> Bootstrapped/regtested x86_64-linux, comitted. >> Honza >> >> PR ipa/83619 >> * g++.dg/torture/pr83619.C: New testcase. >> * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Update edge >> frequencies. >> --- testsuite/g++.dg/torture/pr83619.C (revision 0) >> +++ testsuite/g++.dg/torture/pr83619.C (working copy) > ... > > This testcase FAILs everywhere: > /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C: In static member function > 'static void i::j< >::c(e*)': > /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:25:8: warning: invalid use of > incomplete type 'class e' > /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:8:7: note: forward > declaration of 'class e' > FAIL: g++.dg/torture/pr83619.C -O0 (test for excess errors) > Excess errors: > /.../gcc/gcc/testsuite/g++.dg/torture/pr83619.C:25:8: warning: invalid use of > incomplete type 'class e' > > The following patch tweaks it so that it doesn't emit the warning, yet still > ICEs before your cgraph.c change and PASSes after it. > Tested on x86_64-linux and i686-linux, ok for trunk? > > 2018-01-18 Jakub Jelinek > > PR ipa/83619 > * g++.dg/torture/pr83619.C (e): Define before first use instead of > forward declaration. OK. Note I think someone has filed a regression against this failure. SO you might want to check the recent P3s to find and close it out. jeff
Re: [PATCH,PTX] Add support for CUDA 9
On 12/19/2017 04:39 PM, Tom de Vries wrote: > On 12/20/2017 12:25 AM, Cesar Philippidis wrote: >> og7-ptx-cuda9.diff >> >> >> 2017-12-19 Cesar Philippidis >> >> gcc/ >> * config/nvptx/nvptx.c (output_init_frag): Don't use generic address >> spaces for function labels. >> >> gcc/testsuite/ >> * gcc.target/nvptx/indirect_call.c: New test. >> >> >> diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c >> index dfb27ef..a7b4c09 100644 >> --- a/gcc/config/nvptx/nvptx.c >> +++ b/gcc/config/nvptx/nvptx.c >> @@ -1894,9 +1894,15 @@ output_init_frag (rtx sym) >> if (sym) >> { >> - fprintf (asm_out_file, "generic("); >> + bool function = SYMBOL_REF_DECL (sym) >> + && (TREE_CODE (SYMBOL_REF_DECL (sym)) == FUNCTION_DECL); > > Please indent using parentheses like this: > > bool function = (SYMBOL_REF_DECL (sym) > && (TREE_CODE (SYMBOL_REF_DECL (sym)) == ...)); Done. >> + if (!function) >> + fprintf (asm_out_file, "generic("); >> output_address (VOIDmode, sym); >> - fprintf (asm_out_file, val ? ") + " : ")"); >> + if (!function) >> + fprintf (asm_out_file, val ? ") + " : ")"); >> + else if (val) >> + fprintf (asm_out_file, " + "); > > > Please use: > > if (!function) > fprintf (asm_out_file, ")"); > if (val) > fprintf (asm_out_file, " + "); Done. >> } >> if (!sym || val) >> diff --git a/gcc/testsuite/gcc.target/nvptx/indirect_call.c >> b/gcc/testsuite/gcc.target/nvptx/indirect_call.c >> new file mode 100644 >> index 000..39992a7 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/nvptx/indirect_call.c >> @@ -0,0 +1,19 @@ >> +/* { dg-options "-O2 -msoft-stack" } */ >> +/* { dg-do run } */ >> + >> +int >> +f1 (int a) >> +{ >> + return a + 1; >> +} >> + >> +int (*f2)(int) = f1; >> + >> +int >> +main () >> +{ >> + if (f2 (100) != 101) >> + __builtin_abort(); >> + >> + return 0; >> +} >> Here's the updated patch with the changes that you requested. There are no new regressions in trunk. I tested it on my desktop running driver 387.34 on a Pascal GPU. Is this OK for trunk? Cesar 2018-01-18 Cesar Philippidis gcc/ * config/nvptx/nvptx.c (output_init_frag): Don't use generic address spaces for function labels. gcc/testsuite/ * gcc.target/nvptx/indirect_call.c: New test. diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c index 86fc13f4fc0..4cb87c8ad07 100644 --- a/gcc/config/nvptx/nvptx.c +++ b/gcc/config/nvptx/nvptx.c @@ -1899,9 +1899,15 @@ output_init_frag (rtx sym) if (sym) { - fprintf (asm_out_file, "generic("); + bool function = (SYMBOL_REF_DECL (sym) + && (TREE_CODE (SYMBOL_REF_DECL (sym)) == FUNCTION_DECL)); + if (!function) + fprintf (asm_out_file, "generic("); output_address (VOIDmode, sym); - fprintf (asm_out_file, val ? ") + " : ")"); + if (!function) + fprintf (asm_out_file, ")"); + if (val) + fprintf (asm_out_file, " + "); } if (!sym || val) diff --git a/gcc/testsuite/gcc.target/nvptx/indirect_call.c b/gcc/testsuite/gcc.target/nvptx/indirect_call.c new file mode 100644 index 000..39992a7137b --- /dev/null +++ b/gcc/testsuite/gcc.target/nvptx/indirect_call.c @@ -0,0 +1,19 @@ +/* { dg-options "-O2 -msoft-stack" } */ +/* { dg-do run } */ + +int +f1 (int a) +{ + return a + 1; +} + +int (*f2)(int) = f1; + +int +main () +{ + if (f2 (100) != 101) +__builtin_abort(); + + return 0; +}
Re: Go patch committed: Enable escape analysis for runtime
On Thu, Jan 18, 2018 at 11:53 AM, Rainer Orth wrote: > >> This patch to the Go frontend by Cherry Zhang enables escape analysis >> for the runtime package in the Go frontend. The runtime package was >> hard-coded non-escape, and the escape analysis was not run for the >> runtime package. This patch removes the hard-code, and lets the >> escape analysis decide. It is not allowed for local variables and >> closures in the runtime to be heap allocated. This patch adds the >> check that make sure that they indeed do not escape. >> >> The escape analysis is always run when compiling the runtime now. >> >> Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed >> to mainline. > > this broke Solaris bootstrap: > > /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:182:57: > error: n escapes to heap, not allowed in runtime > if port_getn(portfd, &events[0], uint32(len(events)), &n, wait) < 0 { > ^ > /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:176:11: > error: zero escapes to heap, not allowed in runtime >wait = &zero >^ > /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:174:6: error: > .runtime.zero escapes to heap, not allowed in runtime > var zero timespec > ^ > /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:179:6: error: > .runtime.events escapes to heap, not allowed in runtime > var events [128]portevent > ^ > /vol/gcc/src/hg/trunk/local/libgo/go/runtime/netpoll_solaris.go:181:6: error: > .runtime.n escapes to heap, not allowed in runtime > var n uint32 = 1 > ^ > > Fixed as follows. i386-pc-solaris2.11 and sparc-sun-solaris2.11 builds > have completed, make check still running. Thanks, patch committed. Ian
libgo patch commited: Recognize AIX objects and archives
This patch by Tony Reix adds code to the go tool to recognize AIX objects and archives. Bootstrapped on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 256872) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -6b4f1391e2cd637fa5ed797bb3ab43d9ede61997 +cf33aedb66b5121708f5640443c1336080a47fe5 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: libgo/go/cmd/go/internal/work/exec.go === --- libgo/go/cmd/go/internal/work/exec.go (revision 256794) +++ libgo/go/cmd/go/internal/work/exec.go (working copy) @@ -1282,6 +1282,7 @@ func (b *Builder) cover(a *Action, dst, var objectMagic = [][]byte{ {'!', '<', 'a', 'r', 'c', 'h', '>', '\n'}, // Package archive + {'<', 'b', 'i', 'g', 'a', 'f', '>', '\n'}, // Package AIX big archive {'\x7F', 'E', 'L', 'F'}, // ELF {0xFE, 0xED, 0xFA, 0xCE}, // Mach-O big-endian 32-bit {0xFE, 0xED, 0xFA, 0xCF}, // Mach-O big-endian 64-bit @@ -1291,6 +1292,8 @@ var objectMagic = [][]byte{ {0x00, 0x00, 0x01, 0xEB}, // Plan 9 i386 {0x00, 0x00, 0x8a, 0x97}, // Plan 9 amd64 {0x00, 0x00, 0x06, 0x47}, // Plan 9 arm + {0x01, 0xDF}, // XCOFF32 + {0x01, 0xF7}, // XCOFF64 } func isObject(s string) bool {
libgo patch committed: Add go:noescape for AIX
This patch by Tony Reix adds some go:noescape declarations for AIX. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 256873) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -cf33aedb66b5121708f5640443c1336080a47fe5 +8195b62d353026256037fa44409c4ad95762e6b7 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: libgo/go/runtime/netpoll_aix.go === --- libgo/go/runtime/netpoll_aix.go (revision 256593) +++ libgo/go/runtime/netpoll_aix.go (working copy) @@ -34,12 +34,15 @@ const _PS_DELETE = 0x2 //extern pollset_create func pollset_create(maxfd int32) pollset_t +//go:noescape //extern pollset_ctl func pollset_ctl(ps pollset_t, pollctl_array *poll_ctl, array_length int32) int32 +//go:noescape //extern pollset_poll func pollset_poll(ps pollset_t, polldata_array *pollfd, array_length int32, timeout int32) int32 +//go:noescape //extern pipe func libc_pipe(fd *int32) int32 Index: libgo/go/runtime/stubs2.go === --- libgo/go/runtime/stubs2.go (revision 256593) +++ libgo/go/runtime/stubs2.go (working copy) @@ -10,6 +10,7 @@ package runtime import "unsafe" +//go:noescape func read(fd int32, p unsafe.Pointer, n int32) int32 func closefd(fd int32) int32
Re: Go patch committed: Update to Go1.10beta1
On Thu, Jan 11, 2018 at 8:56 AM, Rainer Orth wrote: > >>> This patch updates libgo to the Go1.10beta1 release. The final Go >>> 1.10 release is expected around February 1, so it's not clear how the >>> release timing is going to work with GCC 8. In any case this updates >>> GCC to something pretty close to the final Go 1.10 release. >>> >>> A few changes to the frontend were required to match changes in the >>> runtime map code, and to handle some no-writebarrier cases in the >>> runtime package. >>> >>> As usual with these updates the complete patch is too large to include >>> in this e-mail message. I've just included the changes to >>> gccgo-specific code. >>> >>> Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed >>> to mainline. >> >> the patch broke Solaris bootstrap: >> >> mv: cannot stat 'os/signal/internal/pty.s-gox.tmp': No such file or directory >> make[4]: *** [Makefile:3348: os/signal/internal/pty.s-gox] Error 1 >> make[4]: *** Waiting for unfinished jobs >> >> Fixed trivially as follows, which allowed the build to complete. make >> check still running... > > there's another build failure, only affecting Solaris 10: > > /vol/gcc/src/hg/trunk/local/libgo/go/os/user/cgo_lookup_unix.go:274:20: > error: integer constant overflow > sp.Pw_uid = 1<<32 - 2 > ^ > /vol/gcc/src/hg/trunk/local/libgo/go/os/user/cgo_lookup_unix.go:275:20: > error: integer constant overflow > sp.Pw_gid = 1<<32 - 3 > ^ > make[8]: *** [Makefile:3313: os/user.lo] Error 1 > > sysinfo.go has > > type Passwd struct { Pw_name *int8; Pw_passwd *int8; Pw_uid int32; Pw_gid > int32; Pw_age *int8; Pw_comment *int8; Pw_gecos *int8; Pw_dir *int8; Pw_shell > *int8; } > > while on S11 there is > > type Passwd struct { Pw_name *int8; Pw_passwd *int8; Pw_uid uint32; Pw_gid > uint32; Pw_age *int8; Pw_comment *int8; Pw_gecos *int8; Pw_dir *int8; > Pw_shell *int8; } I fixed this by clobbering the types used in Go, as follows. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu, committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 256874) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -8195b62d353026256037fa44409c4ad95762e6b7 +bce8720d4eb662f31026e9c7be6ce4d0aeb4ae3b The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: libgo/mksysinfo.sh === --- libgo/mksysinfo.sh (revision 256593) +++ libgo/mksysinfo.sh (working copy) @@ -685,9 +685,13 @@ if ! grep "const EAI_OVERFLOW " ${OUT} > fi # The passwd struct. +# Force uid and gid from int32 to uint32 for consistency; they are +# int32 on Solaris 10 but uint32 everywhere else including Solaris 11. grep '^type _passwd ' gen-sysinfo.go | \ sed -e 's/_passwd/Passwd/' \ -e 's/ pw_/ Pw_/g' \ + -e 's/ Pw_uid int32/ Pw_uid uint32/' \ + -e 's/ Pw_gid int32/ Pw_gid uint32/' \ >> ${OUT} # The group struct.
Re: Go patch committed: Update to Go1.10beta1
On Thu, Jan 18, 2018 at 12:02 AM, Rainer Orth wrote: > >> On Thu, Jan 11, 2018 at 1:46 AM, Rainer Orth >> wrote: >>> On Wed, Jan 10, 2018 at 5:42 AM, Ian Lance Taylor wrote: > > Whoops, there's a bug on big-endian 32-bit systems. I'm testing > https://golang.org/cl/87135. Committed as follows. >>> >>> thanks, that fixed quite a lot of the failures. >>> >>> However, many others remain, too many to report here. I've filed PR >>> go/83787 to capture those. >> >> Thanks. I found the problem: there is a new function makechan that >> takes a size argument of type int, and the old makechan, that took >> int64, is now makechan64. Since the size argument was the last one, >> this worked fine except on 32-bit big-endian systems. Fixed with this >> patch. Bootstrapped and tested on x86_64-pc-linux-gnu and >> sparc-sun-solaris2.12. Committed to mainline. > > great, thanks for fixing this. > > One related question: I mentioned in passing in the PR that four go test > invocations in gotools would hang indefinitely. I noticed that they > weren't called with -test.timeout, unlike libgo's gotest. However, in > exceptional cases like this this probably can't be helped short of > adding an external timeout mechanism. Thanks, I missed that. This patch adds -test.timeout arguments to at least set a timeout where possible. I used the same 240s value used by libgo by default. Bootstrapped and ran gotools testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian 2018-01-18 Ian Lance Taylor * Makefile.am (GOTOOLS_TEST_TIMEOUT): Define. (check-go-tool): Pass -test.timeout with GOTOOLS_TEST_TIMEOUT. (check-cgo-test, check-carchive-test, check-vet): Likewise. * Makefile.in: Rebuild. Index: Makefile.am === --- Makefile.am (revision 256398) +++ Makefile.am (working copy) @@ -161,6 +161,9 @@ uninstall-local: GOTESTFLAGS = +# Number of seconds before tests time out. +GOTOOLS_TEST_TIMEOUT = 240 + # Run tests using the go tool, and frob the output to look like that # generated by DejaGNU. The main output of this is two files: # gotools.sum and gotools.log. @@ -234,11 +237,11 @@ check-go-tool: go$(EXEEXT) $(noinst_PROG cp $(libgodir)/objabi.go check-go-dir/src/cmd/internal/objabi/ @abs_libgodir=`cd $(libgodir) && $(PWD_COMMAND)`; \ abs_checkdir=`cd check-go-dir && $(PWD_COMMAND)`; \ - echo "cd check-go-dir/src/cmd/go && $(ECHO_ENV) GOPATH=$${abs_checkdir} $(abs_builddir)/go$(EXEEXT) test -test.short -test.v" > cmd_go-testlog + echo "cd check-go-dir/src/cmd/go && $(ECHO_ENV) GOPATH=$${abs_checkdir} $(abs_builddir)/go$(EXEEXT) test -test.short -test.timeout=$(GOTOOLS_TEST_TIMEOUT)s -test.v" > cmd_go-testlog $(CHECK_ENV) \ GOPATH=`cd check-go-dir && $(PWD_COMMAND)`; \ export GOPATH; \ - (cd check-go-dir/src/cmd/go && $(abs_builddir)/go$(EXEEXT) test -test.short -test.v) >> cmd_go-testlog 2>&1 || echo "--- $${fl}: go test cmd/go (0.00s)" >> cmd_go-testlog + (cd check-go-dir/src/cmd/go && $(abs_builddir)/go$(EXEEXT) test -test.short -test.timeout=$(GOTOOLS_TEST_TIMEOUT)s -test.v) >> cmd_go-testlog 2>&1 || echo "--- $${fl}: go test cmd/go (0.00s)" >> cmd_go-testlog grep '^--- ' cmd_go-testlog | sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' | sort -k 2 # check-runtime runs `go test runtime` in our environment. @@ -269,11 +272,11 @@ check-cgo-test: go$(EXEEXT) $(noinst_PRO $(MKDIR_P) cgo-test-dir/misc/cgo cp -r $(libgomiscdir)/cgo/test cgo-test-dir/misc/cgo/ @abs_libgodir=`cd $(libgodir) && $(PWD_COMMAND)`; \ - echo "cd cgo-test-dir/misc/cgo/test && $(ECHO_ENV) GOTRACEBACK=2 $(abs_builddir)/go$(EXEEXT) test -test.short -test.v" > cgo-testlog + echo "cd cgo-test-dir/misc/cgo/test && $(ECHO_ENV) GOTRACEBACK=2 $(abs_builddir)/go$(EXEEXT) test -test.short -test.timeout=$(GOTOOLS_TEST_TIMEOUT)s -test.v" > cgo-testlog $(CHECK_ENV) \ GOTRACEBACK=2; \ export GOTRACEBACK; \ - (cd cgo-test-dir/misc/cgo/test && $(abs_builddir)/go$(EXEEXT) test -test.short -test.v) >> cgo-testlog 2>&1 || echo "--- $${fl}: go test misc/cgo/test (0.00s)" >> cgo-testlog + (cd cgo-test-dir/misc/cgo/test && $(abs_builddir)/go$(EXEEXT) test -test.short -test.timeout=$(GOTOOLS_TEST_TIMEOUT)s -test.v) >> cgo-testlog 2>&1 || echo "--- $${fl}: go test misc/cgo/test (0.00s)" >> cgo-testlog grep '^--- ' cgo-testlog | sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' | sort -k 2 # check-carchive-test runs `go test misc/cgo/testcarchive/carchive_test.go` @@ -283,11 +286,11 @@ check-carchive-test: go$(EXEEXT) $(noins $(MKDIR_P) carchive-test-dir/misc/cgo cp -r $(libgomiscdir)/cgo/testcarchive carchive-test-dir/misc/cgo/ @abs_libgodir=`cd $(libgodir) && $(PWD_COMMAND)`; \ - echo "cd carchive-test-dir/misc/cgo/testcarchive && $(ECHO_ENV) LIBRARY_PATH=`echo $${abs_libgodir}/.libs` $(
libgo patch committed: Use rlimit64 when we use getrlimit64
This libgo patch changes the syscall package to use the rlimit64 type, if available, when we use getrlimit64. This makes no difference on most systems, because renames the type appropriately anyhow, but apparently it makes a difference on AIX. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu, and got a confirmation that this fixes the problem on AIX. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 256875) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -bce8720d4eb662f31026e9c7be6ce4d0aeb4ae3b +87525458bcd5ab4beb5b95e7d58e3dfdbc1bd478 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: libgo/Makefile.am === --- libgo/Makefile.am (revision 256794) +++ libgo/Makefile.am (working copy) @@ -674,7 +674,7 @@ s-errno: sysinfo.go: s-sysinfo; @true s-sysinfo: $(srcdir)/mksysinfo.sh gen-sysinfo.go errno.i - GOOS=$(GOOS) $(SHELL) $(srcdir)/mksysinfo.sh + GOARCH=$(GOARCH) GOOS=$(GOOS) $(SHELL) $(srcdir)/mksysinfo.sh $(SHELL) $(srcdir)/mvifdiff.sh tmp-sysinfo.go sysinfo.go $(STAMP) $@ Index: libgo/mksysinfo.sh === --- libgo/mksysinfo.sh (revision 256875) +++ libgo/mksysinfo.sh (working copy) @@ -1120,8 +1120,16 @@ if test "$timex" != ""; then fi # The rlimit struct. -grep '^type _rlimit ' gen-sysinfo.go | \ -sed -e 's/_rlimit/Rlimit/' \ +# On systems that use syscall/libcall_posix_largefile.go, use rlimit64 +# if it exists. +rlimit="_rlimit" +if test "${GOOS}" = "aix" || test "${GOOS}" = "linux" || (test "${GOOS}" = "solaris" && (test "${GOARCH}" = "386" || test "${GOARCH}" = "sparc")); then + if grep '^type _rlimit64 ' gen-sysinfo.go > /dev/null 2>&1; then +rlimit="_rlimit64" + fi +fi +grep "^type ${rlimit} " gen-sysinfo.go | \ +sed -e "s/${rlimit}/Rlimit/" \ -e 's/rlim_cur/Cur/' \ -e 's/rlim_max/Max/' \ >> ${OUT} @@ -1130,7 +1138,13 @@ grep '^type _rlimit ' gen-sysinfo.go | \ grep '^const _RLIMIT_' gen-sysinfo.go | sed -e 's/^\(const \)_\(RLIMIT_[^= ]*\)\(.*\)$/\1\2 = _\2/' >> ${OUT} grep '^const _RLIM_' gen-sysinfo.go | +grep -v '^const _RLIM_INFINITY ' | sed -e 's/^\(const \)_\(RLIM_[^= ]*\)\(.*\)$/\1\2 = _\2/' >> ${OUT} +if test "${rlimit}" = "_rlimit64" && grep '^const _RLIM64_INFINITY ' gen-sysinfo.go > /dev/null 2>&1; then + echo 'const RLIM_INFINITY = _RLIM64_INFINITY' >> ${OUT} +elif grep '^const _RLIM_INFINITY ' gen-sysinfo-go; then + echo 'const RLIM_INFINITY = _RLIM_INFINITY' >> ${OUT} +fi # The sysinfo struct. grep '^type _sysinfo ' gen-sysinfo.go | \
[PATCH, fortran] Support Fortran 2018 teams
All, Attached please find what I hope is the final version of the patch that adds partial support for teams of images, the last remaining major Fortran 2018 feature without current support in gfortran (or any other Fortran compiler of which I’m aware). I submitted the first draft of this patch last September. Regression tests show no unexpected failures (i.e., the output of ‘grep -w FAIL gfortran.sum’ is empty). The Changelogs are below. Among the details remaining for future commits are the following: support for the GET_TEAM intrinsic function is incomplete, the TEAM_NUMBER intrinsic function doesn’t yet accept the optional argument, and a SYNC TEAM test is needed. Using any of the unsupported features generates a compile-time error message from gfortran or a runtime error message from OpenCoarrays. There are no internal compiler errors of which I’m aware. And there remains one big, dissertation-worthy project in handling the use of teams in the presence of failed images. Ok for trunk? Damian 2018-01-18 Damian Rouson Alessandro Fanfarillo Partial support for Fortran 2018 teams features. * array.c (gfc_match_array_ref): add team attribute in coarray transfers. * check.c (gfc_check_get_team, gfc_check_team_number): add new functions for get_team and team_number. * dump-parse-tree.c (show_code_node): add new statements: form team, change team, end team, and sync team. * expr.c (gfc_find_team_co): add new function. * gfortran.h: add new statements. * intrinsic.c (add_functions): add get_team and team_number functions. * intrinsic.h: add get_team and team_number prototypes for check, simplify, and resolve. * iresolve.c (gfc_resolve_get_team, gfc_resolve_team_number): add new functions. * iso-fortran-env.def: add the team_type derived type. * match.c (gfc_match_if, gfc_match_form_team, gfc_match_team_number) (gfc_match_end_team, gfc_match_sync_team, gfc_match_change_team): add change team, end team, form team, sync team match and functions. * match.h: add new prototypes for change team, end team, form team, and sync team. * parse.c (decode_statement): add cases for change team, end team, form team, and sync team. * resolve.c: add cases for exec form team, change team, end team, and sync team. * simplify.c (gfc_simplify_get_team): add new function for get team. * st.c (gfc_free_statement): add cases exec for change team, end team, form team, sync team. * trans-decl.c (gfor_fndecl_caf_form_team) (gfor_fndecl_caf_change_team, gfor_fndecl_caf_end_team) (gfor_fndecl_caf_sync_team, gfor_fndecl_caf_get_team) (gfor_fndecl_caf_team_number): add functions and definitions. * trans-intrinsic.c (conv_caf_send, conv_intrinsic_team_number): add new function and team_type argument support. * trans-stmt.c (gfc_trans_form_team, gfc_trans_change_team) (gfc_trans_end_team, gfc_trans_sync_team): add new functions. * trans-stmt.h: add new prototypes. * trans-types.c (gfc_get_derived_type): check condition for team_type. * trans.c (trans_code): new exec cases for form team, change team, end team, and sync team. * trans.h: add new prototypes. 2018-01-18 Damian Rouson Alessandro Fanfarillo Fortran 2018 teams tests. * gfortran.dg/team_number_1.f90: new test for team_number. * gfortran.dg/team_change_1.f90: new test for change team. * gfortran.dg/team_end_1.f90: new test for end team. * gfortran.dg/team_form_1.f90: new test for form team. teams-patch.diff Description: Binary data
Re: [PATCH v3] Ability to remap file names in __FILE__, etc (PR other/70268)
Ximin Luo writes: > Higher-level build scripts sometimes like to save CFLAGS etc into the build > output, making the overall build output unreproducible even if GCC is playing > nicely. Rather than add logic to strip -f{file,debug,macro,...}-prefix-map, > into all possible higher-level programs that might want to save CFLAGS, it is > simpler if GCC could read it from a user-specified environment variable. This does feel like we are trying to fix the issue in the wrong place. Also, won't such broken packages normally store all options (including -I) rather than just CFLAGS? And if the answer is no, then perhaps you could put -ffile-prefix-map into CPPFLAGS instead of CFLAGS? Kind of even feels right seeing that it affects the preprocessor. Boris