[PATCH] Add mov[us]wb store intrinsics
Hi, These patch adds these 9 new intrinsics. Ok for trunk? gcc/ * config/i386/avx512bwintrin.h (_mm512_mask_cvtepi16_storeu_epi8, _mm512_mask_cvtsepi16_storeu_epi8, _mm512_mask_cvtusepi16_storeu_epi8): New intrinsics. * config/i386/avx512vlbwintrin.h (_mm256_mask_cvtepi16_storeu_epi8, _mm_mask_cvtsepi16_storeu_epi8, _mm256_mask_cvtsepi16_storeu_epi8, _mm_mask_cvtusepi16_storeu_epi8, _mm256_mask_cvtusepi16_storeu_epi8, _mm_mask_cvtepi16_storeu_epi8): New intrinsics. * config/i386/i386-builtin-types.def (PV8Q, V8QI): New pointer type. (VOID_FTYPE_PV32QI_V32HI_USI, VOID_FTYPE_PV8QI_V8HI_UQI, VOID_FTYPE_PV16QI_V16HI_UHI): New function types. * config/i386/i386-builtin.def (__builtin_ia32_pmovwb128mem_mask, __builtin_ia32_pmovwb256mem_mask, __builtin_ia32_pmovswb128mem_mask, __builtin_ia32_pmovswb256mem_mask, __builtin_ia32_pmovuswb128mem_mask, __builtin_ia32_pmovuswb256mem_mask, __builtin_ia32_pmovuswb512mem_mask, __builtin_ia32_pmovswb512mem_mask, __builtin_ia32_pmovwb512mem_mask): New builtins. * gcc/config/i386/i386.c (ix86_expand_special_args_builtin): Handle new types. gcc/testsuite/ * gcc.target/i386/avx512bw-vpmovswb-1.c: Add new intrinsics to test. * gcc.target/i386/avx512bw-vpmovswb-2.c: Ditto. * gcc.target/i386/avx512bw-vpmovuswb-1.c: Ditto. * gcc.target/i386/avx512bw-vpmovuswb-2.c: Ditto. * gcc.target/i386/avx512bw-vpmovwb-1.c: Ditto. * gcc.target/i386/avx512bw-vpmovwb-2.c: Ditto. Thanks, Julia 0001-cvtusepi.patch Description: 0001-cvtusepi.patch
[PATCH] More PR80928 testsuite fallout
2017-06-08 Richard Biener PR tree-optimization/80928 * gcc.dg/vect/slp-perm-8.c: Do not expect check loop to be vectorized. Index: gcc/testsuite/gcc.dg/vect/slp-perm-8.c === --- gcc/testsuite/gcc.dg/vect/slp-perm-8.c (revision 248949) +++ gcc/testsuite/gcc.dg/vect/slp-perm-8.c (working copy) @@ -53,8 +53,7 @@ int main (int argc, const char* argv[]) return 0; } -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target { vect_perm_byte && vect_char_mult } } } } */ -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_perm_byte && {! vect_char_mult } } } } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_perm_byte } } } } */ /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { vect_perm_byte && {! vect_load_lanes } } } } } */ /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target vect_load_lanes } } } */ /* { dg-final { scan-tree-dump "note: Built SLP cancelled: can use load/store-lanes" "vect" { target { vect_perm_byte && vect_load_lanes } } } } */
Re: [PATCH, rs6000] Fold vector shifts in GIMPLE
On Wed, Jun 7, 2017 at 4:14 PM, Bill Schmidt wrote: > >> On Jun 6, 2017, at 11:37 AM, Will Schmidt wrote: >> >> On Thu, 2017-06-01 at 10:15 -0500, Bill Schmidt wrote: On Jun 1, 2017, at 2:48 AM, Richard Biener wrote: On Wed, May 31, 2017 at 10:01 PM, Will Schmidt wrote: > Hi, > > Add support for early expansion of vector shifts. Including > vec_sl (shift left), vec_sr (shift right), vec_sra (shift > right algebraic), vec_rl (rotate left). > Part of this includes adding the vector shift right instructions to > the list of those instructions having an unsigned second argument. > > The VSR (vector shift right) folding is a bit more complex than > the others. This is due to requiring arg0 be unsigned for an algebraic > shift before the gimple RSHIFT_EXPR assignment is built. Jakub, do we sanitize that undefinedness of left shifts of negative values and/or overflow of left shift of nonnegative values? >> >> >> On Thu, 2017-06-01 at 10:17 +0200, Jakub Jelinek wrote: >>> We don't yet, see PR77823 - all I've managed to do before stage1 was over >>> was instrumentation of signed arithmetic integer overflow on vectors, >>> division, shift etc. are tasks maybe for this stage1. >>> >>> That said, shift instrumentation in particular is done early because every >>> FE has different rules, and so if it is coming from target builtins that are >>> folded into something, it wouldn't be instrumented anyway. >> >> >> On Thu, 2017-06-01 at 10:15 -0500, Bill Schmidt wrote: Will, how is that defined in the intrinsics operation? It might need similar treatment as the abs case. >>> >>> Answering for Will -- vec_sl is defined to simply shift bits off the end to >>> the >>> left and fill with zeros from the right, regardless of whether the source >>> type >>> is signed or unsigned. The result type is signed iff the source type is >>> signed. So a negative value can become positive as a result of the >>> operation. >>> >>> The same is true of vec_rl, which will naturally rotate bits regardless of >>> signedness. >> >> [I'd rather make the negative left shift case implementation defined given C and C++ standards do not agree to 100% AFAIK] >> >> With the above answers, how does this one stand? >> >> [ I have no issue adding the TYPE_OVERFLOW_WRAPS logic to treat some of >> the cases differently, I'm just unclear on whether none/some/all of the >> shifts will require that logic. :-) ] > > I have to defer to Richard here, I don't know the subtleties well enough. I'd say play safe and guard folding of left shifts with TYPE_OVERFLOW_WRAPS. Richard. > Bill > >> >> thanks, >> -Will >> >> >> >> Richard. > [gcc] > > 2017-05-26 Will Schmidt > > * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling > for early expansion of vector shifts (sl,sr,sra,rl). > (builtin_function_type): Add vector shift right instructions > to the unsigned argument list. > > [gcc/testsuite] > > 2017-05-26 Will Schmidt > > * testsuite/gcc.target/powerpc/fold-vec-shift-char.c: New. > * testsuite/gcc.target/powerpc/fold-vec-shift-int.c: New. > * testsuite/gcc.target/powerpc/fold-vec-shift-longlong.c: New. > * testsuite/gcc.target/powerpc/fold-vec-shift-short.c: New. > > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > index 8adbc06..6ee0bfd 100644 > --- a/gcc/config/rs6000/rs6000.c > +++ b/gcc/config/rs6000/rs6000.c > @@ -17408,6 +17408,76 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator > *gsi) > gsi_replace (gsi, g, true); > return true; > } > +/* Flavors of vec_rotate_left . */ > +case ALTIVEC_BUILTIN_VRLB: > +case ALTIVEC_BUILTIN_VRLH: > +case ALTIVEC_BUILTIN_VRLW: > +case P8V_BUILTIN_VRLD: > + { > + arg0 = gimple_call_arg (stmt, 0); > + arg1 = gimple_call_arg (stmt, 1); > + lhs = gimple_call_lhs (stmt); > + gimple *g = gimple_build_assign (lhs, LROTATE_EXPR, arg0, arg1); > + gimple_set_location (g, gimple_location (stmt)); > + gsi_replace (gsi, g, true); > + return true; > + } > + /* Flavors of vector shift right algebraic. vec_sra{b,h,w} -> > vsra{b,h,w}. */ > +case ALTIVEC_BUILTIN_VSRAB: > +case ALTIVEC_BUILTIN_VSRAH: > +case ALTIVEC_BUILTIN_VSRAW: > +case P8V_BUILTIN_VSRAD: > + { > + arg0 = gimple_call_arg (stmt, 0); > + arg1 = gimple_call_arg (stmt, 1); > + lhs = gimple_call_lhs (stmt); > + gimple *g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, arg1); > + gimple_set_location (g, gimple_location (stmt)); > + gsi_replace (gsi, g, true); > + return tru
Re: [PATCH] handle bzero/bcopy in DSE and aliasing (PR 80933, 80934)
On Thu, Jun 8, 2017 at 4:33 AM, Martin Sebor wrote: > On 06/07/2017 02:12 PM, Martin Sebor wrote: >> >> On 06/07/2017 02:01 PM, Marc Glisse wrote: >>> >>> On Wed, 7 Jun 2017, Bernhard Reutner-Fischer wrote: >>> On 7 June 2017 16:46:53 CEST, Martin Sebor wrote: > > On 06/07/2017 02:23 AM, Richard Biener wrote: >> >> On Wed, Jun 7, 2017 at 5:26 AM, Martin Sebor > > wrote: Note I'd be _much_ more sympathetic to simply canonicalizing all of bzero and bcopy to memset / memmove and be done with all the above complexity. >>> >>> >>> >>> Attached is an updated patch along these lines. Please let me >>> know if it matches your expectations. >> >> >> I think you attached the wrong patch. > > > Yes I did, sorry. The correct one is attached. Under POSIX.1-2008 "optimizing" bzero or bcmp is IMO plain wrong. It's like optimizing foo() to a random built-in but maybe that's just me. If your libc provides a define to a standard function for these under a compat knob then fine but otherwise you should fix that. *shrug*. Joseph? >>> >>> >>> The patch optimizes __builtin_bzero, which should be ok. The question >>> (independent from this patch) is then under what conditions bzero should >>> be detected as a builtin. >> >> >> Yes. The problem is that unlike for C and C++, GCC doesn't have >> a mechanism to select the target version of POSIX. I think it >> should. >> >> But there is a subtle problem with the patch that needs fixing. >> Bcopy should not be transformed to memcpy but rather memmove. >> I'll fix that before committing. > > > Attached is an updated patch with this fix. I also added a cast > from bcopy and bzero to void to detect accidental uses of the > return value. Tested on x86_64-linux. Please do not add foldings to builtins.c but instead add them to gimple-fold.c. + /* Call memset and return the result cast to void to detect its use + (bzero returns void). */ + tree call = build_call_expr_loc (loc, fn, 3, dst, integer_zero_node, len); + return fold_convert (void_type_node, call); ??? How can the result be used if the original call result was not? Thanks, Richard. > Martin
Re: [PATCH][GCC][AARCH64]Bad code-gen for structure/block/unaligned memory access
Tamar Christina writes: > Hi All, > > This patch allows larger bitsizes to be used as copy size > when the target does not have SLOW_UNALIGNED_ACCESS. > > It also provides an optimized routine for MEM to REG > copying which avoid reconstructing the value piecewise on the stack > and instead uses a combination of shifts and ORs. > > This now generates > > adrpx0, .LANCHOR0 > add x0, x0, :lo12:.LANCHOR0 > sub sp, sp, #16 > ldr w1, [x0, 120] > str w1, [sp, 8] > ldr x0, [x0, 112] > ldr x1, [sp, 8] > add sp, sp, 16 > > instead of: > > adrpx3, .LANCHOR0 > add x3, x3, :lo12:.LANCHOR0 > mov x0, 0 > mov x1, 0 > sub sp, sp, #16 > ldr x2, [x3, 112] > ldr w3, [x3, 120] > add sp, sp, 16 > ubfxx5, x2, 8, 8 > bfi x0, x2, 0, 8 > ubfxx4, x2, 16, 8 > lsr w9, w2, 24 > bfi x0, x5, 8, 8 > ubfxx7, x2, 32, 8 > ubfxx5, x2, 40, 8 > ubfxx8, x3, 8, 8 > bfi x0, x4, 16, 8 > bfi x1, x3, 0, 8 > ubfxx4, x2, 48, 8 > ubfxx6, x3, 16, 8 > bfi x0, x9, 24, 8 > bfi x1, x8, 8, 8 > lsr x2, x2, 56 > lsr w3, w3, 24 > bfi x0, x7, 32, 8 > bfi x1, x6, 16, 8 > bfi x0, x5, 40, 8 > bfi x1, x3, 24, 8 > bfi x0, x4, 48, 8 > bfi x0, x2, 56, 8 > > To load a 12 1-byte element struct. Nice! [...] > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index > 4f769a40a4e9de83cb5aacfd3ff58301c2feeb78..8906d9a9445ed36f43302708d1f6212bcf017bdc > 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -13498,6 +13498,41 @@ aarch64_expand_movmem (rtx *operands) >base = copy_to_mode_reg (Pmode, XEXP (src, 0)); >src = adjust_automodify_address (src, VOIDmode, base, 0); > > + /* Optimize routines for MEM to REG copies. */ > + if (n < 8 && !REG_P (XEXP (operands[0], 0))) This seems to be checking that the address of the original destination memory isn't a plain base register. Why's it important to reject that case but allow e.g. base+offset? > + { > + unsigned int max_align = UINTVAL (operands[2]); > + max_align = n < max_align ? max_align : n; Might be misunderstanding, but isn't max_align always equal to n here, since n was set by: n = UINTVAL (operands[2]); Indentation of the enclosing { ... } is slightly off. > + machine_mode mov_mode, dest_mode > + = smallest_mode_for_size (max_align * BITS_PER_UNIT, MODE_INT); > + rtx result = gen_reg_rtx (dest_mode); > + emit_insn (gen_move_insn (result, GEN_INT (0))); > + > + unsigned int shift_cnt = 0; > + for (; n > shift_cnt; shift_cnt += GET_MODE_SIZE (mov_mode)) > + { > + int nearest = 0; > + /* Find the mode to use, but limit the max to TI mode. */ > + for (unsigned max = 1; max <= (n - shift_cnt) && max <= 16; max *= 2) > + nearest = max; In the if statement above, you required n < 8, so can max ever by > 16 here? > + mov_mode = smallest_mode_for_size (nearest * BITS_PER_UNIT, MODE_INT); > + rtx reg = gen_reg_rtx (mov_mode); > + > + src = adjust_address (src, mov_mode, 0); > + emit_insn (gen_move_insn (reg, src)); > + src = aarch64_progress_pointer (src); > + > + reg = gen_rtx_ASHIFT (dest_mode, reg, > + GEN_INT (shift_cnt * BITS_PER_UNIT)); This seems to be mixing modes: reg has mode "mov_mode" but the result has mode "dest_mode". That isn't well-formed: the mode of a shift result needs to be the same as the mode of the operand. I think the load would need to be a zero-extend of "src" into "reg", with "reg" having mode "dest_mode". > + result = gen_rtx_IOR (dest_mode, reg, result); > + } > + > +dst = adjust_address (dst, dest_mode, 0); > +emit_insn (gen_move_insn (dst, result)); dest_mode was chosen by smallest_mode_for_size, so can be bigger than n. Doesn't that mean that we'll write beyond the end of the copy region when n is an awkward number? > diff --git a/gcc/expr.c b/gcc/expr.c > index > 91d7ea217229fac62380b5d4b646961bf7c836c1..b1df4651e7942346007cda1cce8ee5a19297ab16 > 100644 > --- a/gcc/expr.c > +++ b/gcc/expr.c > @@ -2743,7 +2743,9 @@ copy_blkmode_to_reg (machine_mode mode, tree src) > >n_regs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD; >dst_words = XALLOCAVEC (rtx, n_regs); > - bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD); > + bitsize = BITS_PER_WORD; > + if (SLOW_UNALIGNED_ACCESS (BLKmode, TYPE_ALIGN (TREE_TYPE (src > +bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD); I think this ought to be testing word_mode instead of BLKmode. (Testing BLKmode doesn't really make sense in general, because the mode doesn't have a meaningful alignment.) Thanks,
Re: add VxWorks specific crtstuff implementation files for Ada runtimes
> On 06 Jun 2017, at 23:02, Olivier Hainque wrote: >> + * Copyright (C) 2016, Free Software Foundation, Inc. >> >> Date update? About to commit the attached patch, to convey the year during which the sources entered the tree. Thanks for the suggestion. With Kind Regards, Olivier -- 2017-06-08 Olivier Hainque ada/ * vx_crtbegin_auto.c: Update year in copyright notice. * vx_crtbegin.c: Likewise. * vx_crtbegin.inc: Likewise. * vx_crtend.c: Likewise. vxcrt-copy.diff Description: Binary data
[PATCH] Fix PR81007
Folding during gimplification can invoke the devirt machinery which doesn't deal with errorneous state. Thus avoid ICEing by not folding from gimplification in case we've seen errors. Similarly do not build cgraph edges in those cases as that invokes the devirt machinery as well (we stop compilation after lowering anyway in case errors were reported). The patch also fixes ordering of passes. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2017-06-08 Richard Biener PR middle-end/81007 * gimplify.c (maybe_fold_stmt): Do not fold after errors. * cgraphbuild.c: Include diagnostic-core.h. (pass_build_cgraph_edges::gate): Add, do not run after errors. * passes.def (all_lowering_passes): Run pass_build_cgraph_edges last again. * g++.dg/pr81007.C: New testcase. Index: gcc/gimplify.c === --- gcc/gimplify.c (revision 249003) +++ gcc/gimplify.c (working copy) @@ -3067,6 +3067,10 @@ gimplify_arg (tree *arg_p, gimple_seq *p static bool maybe_fold_stmt (gimple_stmt_iterator *gsi) { + /* Do not fold if we may have invalid IL somewhere. */ + if (seen_error ()) +return false; + struct gimplify_omp_ctx *ctx; for (ctx = gimplify_omp_ctxp; ctx; ctx = ctx->outer_context) if ((ctx->region_type & (ORT_TARGET | ORT_PARALLEL | ORT_TASK)) != 0) Index: gcc/cgraphbuild.c === --- gcc/cgraphbuild.c (revision 249003) +++ gcc/cgraphbuild.c (working copy) @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3. #include "gimple-walk.h" #include "ipa-utils.h" #include "except.h" +#include "diagnostic-core.h" /* Context of record_reference. */ struct record_reference_ctx @@ -305,6 +306,7 @@ public: /* opt_pass methods: */ virtual unsigned int execute (function *); + virtual bool gate (function *) { return ! seen_error (); } }; // class pass_build_cgraph_edges unsigned int Index: gcc/passes.def === --- gcc/passes.def (revision 249003) +++ gcc/passes.def (working copy) @@ -42,9 +42,9 @@ along with GCC; see the file COPYING3. NEXT_PASS (pass_build_cfg); NEXT_PASS (pass_warn_function_return); NEXT_PASS (pass_expand_omp); - NEXT_PASS (pass_build_cgraph_edges); NEXT_PASS (pass_sprintf_length, false); NEXT_PASS (pass_walloca, /*strict_mode_p=*/true); + NEXT_PASS (pass_build_cgraph_edges); TERMINATE_PASS_LIST (all_lowering_passes) /* Interprocedural optimization passes. */ Index: gcc/testsuite/g++.dg/pr81007.C === --- gcc/testsuite/g++.dg/pr81007.C (nonexistent) +++ gcc/testsuite/g++.dg/pr81007.C (working copy) @@ -0,0 +1,14 @@ +// { dg-do compile } + +struct A +{ + A p; // { dg-error "incomplete" } + virtual void foo(); +}; + +struct B : A {}; + +void bar(B& b) +{ + b.foo(); +}
[PATCH] Fix PR66623
The following fixes unsafe vectorization of reductions in outer loop vectorization. It combines it with a bit of cleanup in the affected function. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2017-06-08 Richard Biener PR tree-optimization/66623 * tree-vect-loop.c (vect_is_simple_reduction): Cleanup, refactor check_reduction into two parts, properly computing whether we have to check reduction validity for outer loop vectorization. * gcc.dg/vect/pr66623.c: New testcase. Index: gcc/tree-vect-loop.c === --- gcc/tree-vect-loop.c(revision 249003) +++ gcc/tree-vect-loop.c(working copy) @@ -2727,8 +2727,6 @@ vect_is_simple_reduction (loop_vec_info { struct loop *loop = (gimple_bb (phi))->loop_father; struct loop *vect_loop = LOOP_VINFO_LOOP (loop_info); - edge latch_e = loop_latch_edge (loop); - tree loop_arg = PHI_ARG_DEF_FROM_EDGE (phi, latch_e); gimple *def_stmt, *def1 = NULL, *def2 = NULL, *phi_use_stmt = NULL; enum tree_code orig_code, code; tree op1, op2, op3 = NULL_TREE, op4 = NULL_TREE; @@ -2742,11 +2740,6 @@ vect_is_simple_reduction (loop_vec_info *double_reduc = false; *v_reduc_type = TREE_CODE_REDUCTION; - /* Check validity of the reduction only for the innermost loop. */ - bool check_reduction = ! flow_loop_nested_p (vect_loop, loop); - gcc_assert ((check_reduction && loop == vect_loop) - || (!check_reduction && flow_loop_nested_p (vect_loop, loop))); - name = PHI_RESULT (phi); /* ??? If there are no uses of the PHI result the inner loop reduction won't be detected as possibly double-reduction by vectorizable_reduction @@ -2775,13 +2768,15 @@ vect_is_simple_reduction (loop_vec_info { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, -"reduction used in loop.\n"); +"reduction value used in loop.\n"); return NULL; } phi_use_stmt = use_stmt; } + edge latch_e = loop_latch_edge (loop); + tree loop_arg = PHI_ARG_DEF_FROM_EDGE (phi, latch_e); if (TREE_CODE (loop_arg) != SSA_NAME) { if (dump_enabled_p ()) @@ -2795,18 +2790,22 @@ vect_is_simple_reduction (loop_vec_info } def_stmt = SSA_NAME_DEF_STMT (loop_arg); - if (!def_stmt) + if (gimple_nop_p (def_stmt)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, -"reduction: no def_stmt.\n"); +"reduction: no def_stmt\n"); return NULL; } if (!is_gimple_assign (def_stmt) && gimple_code (def_stmt) != GIMPLE_PHI) { if (dump_enabled_p ()) - dump_gimple_stmt (MSG_NOTE, TDF_SLIM, def_stmt, 0); + { + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, + "reduction: unhandled reduction operation: "); + dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM, def_stmt, 0); + } return NULL; } @@ -2822,6 +2821,7 @@ vect_is_simple_reduction (loop_vec_info } nloop_uses = 0; + gphi *lcphi = NULL; FOR_EACH_IMM_USE_FAST (use_p, imm_iter, name) { gimple *use_stmt = USE_STMT (use_p); @@ -2829,6 +2829,11 @@ vect_is_simple_reduction (loop_vec_info continue; if (flow_bb_inside_loop_p (loop, gimple_bb (use_stmt))) nloop_uses++; + else + { + gcc_assert (!lcphi); + lcphi = as_a (use_stmt); + } if (nloop_uses > 1) { if (dump_enabled_p ()) @@ -2873,6 +2878,24 @@ vect_is_simple_reduction (loop_vec_info return NULL; } + /* If we are vectorizing an inner reduction we are executing that + in the original order only in case we are not dealing with a + double reduction. */ + bool check_reduction = true; + if (flow_loop_nested_p (vect_loop, loop)) +{ + check_reduction = false; + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, gimple_phi_result (lcphi)) + { + gimple *use_stmt = USE_STMT (use_p); + if (is_gimple_debug (use_stmt)) + continue; + if (! flow_bb_inside_loop_p (vect_loop, gimple_bb (use_stmt))) + check_reduction = true; + } +} + + bool nested_in_vect_loop = flow_loop_nested_p (vect_loop, loop); code = orig_code = gimple_assign_rhs_code (def_stmt); /* We can handle "res -= x[i]", which is non-associative by @@ -2887,27 +2910,8 @@ vect_is_simple_reduction (loop_vec_info if (code == COND_EXPR) { - if (check_reduction) + if (! nested_in_vect_loop) *v_reduc_type = COND_REDUCTION; -} - else if (!commutative_tree_code (code) || !associative_tree_code (code)) -{ - if (dump_enabled_p ()) - report_vect_op (MSG_MISSED_OPTIMIZATION, def_stmt, -
RFA: Enhance information recorded by -frecord-gcc-switches
Hi Guys, The -frecord-gcc-switches option records the gcc command line. It does not however expand options like -O2 into the optimizations that this enables. Thus if a user wants to know if a specific optimization was used when creating an object file, (or library or executable), they will have to reverse engineer the compilation process. Which may or may not be possible. The attached patch is a proposal to address this problem by making -frecord-gcc-switches also record all the enabled options. This does make object files bigger, but this cannot be helped. The enhancement is not enabled by default however, instead a second command line option must be used. In a possibly contentious move I chose to reuse the -fverbose-asm option, rather than creating a new one. I did this because a) it simplifies the patch, b) we have more than enough switch recording options already, c) it does not conflict with the current use of -fverbose-asm and d) it ties in nicely with the name of the option. Tested, with no regressions on an x86_64-pc-linux-gnu target, and built for a variety of other targets. OK to apply ? Cheers Nick gcc/ChangeLog 2017-06-08 Nick Clifton * varasm.c (dump_enabled_to_asm_out_file): New function. Prints enabled options to the asm_out_file. (elf_record_gcc_switches): If verbose-asm is set then also dump all enabled options to the asm file. * toplec.c (print_switch_values): Convert from static to global. * doc/invoke.texi (-fverbose-asm): Mention its effect on the -frecord-gcc-switches option. (-frecord-gcc-switches): Refactor the description and add details of how -fverbose-asm modifies its behaviour. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 249007) +++ gcc/doc/invoke.texi (working copy) @@ -12281,10 +12281,13 @@ who actually need to read the generated assembly code (perhaps while debugging the compiler itself). -@option{-fno-verbose-asm}, the default, causes the -extra information to be omitted and is useful when comparing two assembler -files. +This switch can also be used to modify the behaviour of the +@option{-frecord-gcc-switches} switch, making it record extra +information in the object file. +@option{-fno-verbose-asm}, the default, causes the extra information +to be omitted and is useful when comparing two assembler files. + The added comments include: @itemize @bullet @@ -12370,17 +12373,26 @@ @item -frecord-gcc-switches @opindex frecord-gcc-switches -This switch causes the command line used to invoke the -compiler to be recorded into the object file that is being created. -This switch is only implemented on some targets and the exact format -of the recording is target and binary file format dependent, but it -usually takes the form of a section containing ASCII text. This -switch is related to the @option{-fverbose-asm} switch, but that -switch only records information in the assembler output file as -comments, so it never reaches the object file. -See also @option{-grecord-gcc-switches} for another -way of storing compiler options into the object file. +This switch causes the command line used to invoke the compiler to be +recorded into the object file that is being created. This switch is +only implemented on some targets and the exact format of the recording +is target and binary file format dependent, but it usually takes the +form of a section containing ASCII text. +The related switch @option{-fverbose-asm} switch performs a similar +task but it only records the information in the assembler output file +as comments, so it never reaches the object file. + +If both @option{-fverbose-asm} and @option{-frecord-gcc-switches} are +enabled together then @option{-frecord-gcc-switches} will record all +enabled switches, not just those specified on the command line. Thus +if the command line includes @option{-O2} then all optimizations +enabled by that switch will be recorded in the object file, along with +the presence of @option{-O2} itself. + +See also @option{-grecord-gcc-switches} for another way of storing +compiler options into an object file. + @item -fpic @opindex fpic @cindex global offset table Index: gcc/toplev.c === --- gcc/toplev.c (revision 249007) +++ gcc/toplev.c (working copy) @@ -809,7 +809,9 @@ Each line begins with INDENT and ends with TERM. Each switch is separated from the next by SEP. */ -static void +void print_switch_values (print_switch_fn_type); + +void print_switch_values (print_switch_fn_type print_fn) { int pos = 0; Index: gcc/varasm.c === --- gcc/varasm.c (revision 249007) +++ gcc/varasm.c (working copy) @@ -7545,6 +7545,32 @@ v.release (); } +/* Print TEXT to asm_out_file if TYP
[PATCH] testsuite: example plugin for spellchecking comments
On Tue, 2017-05-02 at 14:11 -0700, Mike Stump wrote: > On May 2, 2017, at 12:08 PM, David Malcolm > wrote: > > > > As a proof of concept, the patch uses this to add a spellchecker > > for comments. > > :-) > > > I didn't bother with the autotool detection for enchant, and > > just hacked it in for now. > > Only other comment would be, rather then requiring it, would be nice > to make it optional. I can host gcc in an environment that is a bare > metal target on newlib. autoconf can smell it, and tell that a ton > of things are missing. Given that Enchant seems a stretch as a hard dependency, it seems better to either: (a) have this code live as a plugin, or (b) attempt to dynamically load libenchant on-demand (a "soft" dependency ?) This patch implements (a): the plugin approach. We don't currently install any "official" plugins; ideally, if a plugin, I'd like it to be one of the things that "make install" installs: it add dependencies that are too much to impose on everyone's cc1 et al (hence a plugin), but which a significant number of people might want to opt in to (hence in-tree and "supported", for some meaning of that term). We don't have any mechanism or policies to allow that yet, so in the meantime, here's a revised version of the spellchecker code which adds it to the *testsuite*, as an example plugin. (specifically the C++ testsuite, to support both C and C++ style comments without needing to silence the compat flag). Changes: * moved it all to a plugin, within the testsuite * added test coverage * added option -fplugin-arg-spellcheck-show-provider * added option -fplugin-arg-spellcheck-language * fixed bug with misspelling at very end of a comment * fixed bug with apostrophes breaking words * fixed quoting of suggestions * don't spellcheck within system headers * don't spellcheck DejaGnu directives * don't spellcheck URLs * removed the -Wfixme and -Wtodo idea libenchant uses pkg-config to expose the flags needed to build against it, so the patch adds a new "dg-pkgconfig" directive to plugin-support.exp. This attempts to use pkg-config to locate the appropriate flags for building against libenchant. If it fails, the test is marked as unsupported. Successfully bootstrapped®rtested on x86_64-pc-linux-gnu. OK for trunk? gcc/testsuite/ChangeLog: * g++.dg/plugin/plugin.exp (plugin_test_list): Add the following... * g++.dg/plugin/spellcheck-1.C: New test case. * g++.dg/plugin/spellcheck-2.C: New test case. * g++.dg/plugin/spellcheck-3.C: New test case. * g++.dg/plugin/spellcheck-4.C: New test case. * g++.dg/plugin/spellcheck-5.C: New test case. * g++.dg/plugin/spellcheck-6.C: New test case. * g++.dg/plugin/spellcheck.c: New test plugin. * lib/plugin-support.exp (plugin-get-options): Add handling for "dg-pkgconfig" to the special-cased directive handling. Verbosely log the resulting value of dg-extra-tool-flags. (plugin-test-execute): Handle "unsupported" errors within plugin-get-options by marking the testcase as unsupported and immediately returning. --- gcc/testsuite/g++.dg/plugin/plugin.exp | 7 + gcc/testsuite/g++.dg/plugin/spellcheck-1.C | 14 ++ gcc/testsuite/g++.dg/plugin/spellcheck-2.C | 13 ++ gcc/testsuite/g++.dg/plugin/spellcheck-3.C | 2 + gcc/testsuite/g++.dg/plugin/spellcheck-4.C | 19 ++ gcc/testsuite/g++.dg/plugin/spellcheck-5.C | 4 + gcc/testsuite/g++.dg/plugin/spellcheck-6.C | 5 + gcc/testsuite/g++.dg/plugin/spellcheck.c | 361 + gcc/testsuite/lib/plugin-support.exp | 32 ++- 9 files changed, 456 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/plugin/spellcheck-1.C create mode 100644 gcc/testsuite/g++.dg/plugin/spellcheck-2.C create mode 100644 gcc/testsuite/g++.dg/plugin/spellcheck-3.C create mode 100644 gcc/testsuite/g++.dg/plugin/spellcheck-4.C create mode 100644 gcc/testsuite/g++.dg/plugin/spellcheck-5.C create mode 100644 gcc/testsuite/g++.dg/plugin/spellcheck-6.C create mode 100644 gcc/testsuite/g++.dg/plugin/spellcheck.c diff --git a/gcc/testsuite/g++.dg/plugin/plugin.exp b/gcc/testsuite/g++.dg/plugin/plugin.exp index e40cba3..e3f030f 100644 --- a/gcc/testsuite/g++.dg/plugin/plugin.exp +++ b/gcc/testsuite/g++.dg/plugin/plugin.exp @@ -69,6 +69,13 @@ set plugin_test_list [list \ show-template-tree-color.C \ show-template-tree-color-no-elide-type.C } \ { comment_plugin.c comments-1.C } \ +{ spellcheck.c \ + spellcheck-1.C \ + spellcheck-2.C \ + spellcheck-3.C \ + spellcheck-4.C \ + spellcheck-5.C \ + spellcheck-6.C } \ ] foreach plugin_test $plugin_test_list { diff --git a/gcc/testsuite/g++.dg/plugin/spellcheck-1.C b/gcc/testsuite/g++.dg/plugin/spellcheck-1.C new file mode 100644 index 000..242ab13 --- /dev/null +++ b/gcc/testsuite/g++.dg/plugin/spellchec
Re: Fix profile updating in cfgbuild
Hi, this is what I comitted Index: ChangeLog === --- ChangeLog (revision 249007) +++ ChangeLog (working copy) @@ -1,3 +1,7 @@ +2017-06-08 Jan Hubicka + + * cfgbuild.c (find_many_sub_basic_blocks): Fix thinko. + 2017-06-08 Martin Liska PR gcov-profile/80911 Index: cfgbuild.c === --- cfgbuild.c (revision 249007) +++ cfgbuild.c (working copy) @@ -655,7 +655,7 @@ find_many_sub_basic_blocks (sbitmap bloc initialized_src = true; } else - uninitialized_src = false; + uninitialized_src = true; bb->frequency += EDGE_FREQUENCY (e); } /* When some edges are missing with read profile, this is
Re: [PATCH 1/1] Remove redundant definition of srcrootpre
I paid extra attention to it because it appeared in the last line of a build failure likely caused by shell tools differences. cd rarely outputs the new directory for me.
Re: RFA: Enhance information recorded by -frecord-gcc-switches
On Thu, Jun 8, 2017 at 11:21 AM, Nick Clifton wrote: > Hi Guys, > > The -frecord-gcc-switches option records the gcc command line. It > does not however expand options like -O2 into the optimizations that > this enables. Thus if a user wants to know if a specific optimization > was used when creating an object file, (or library or executable), > they will have to reverse engineer the compilation process. Which may > or may not be possible. > > The attached patch is a proposal to address this problem by making > -frecord-gcc-switches also record all the enabled options. This does > make object files bigger, but this cannot be helped. The enhancement > is not enabled by default however, instead a second command line > option must be used. In a possibly contentious move I chose to reuse > the -fverbose-asm option, rather than creating a new one. I did this > because a) it simplifies the patch, b) we have more than enough switch > recording options already, c) it does not conflict with the current > use of -fverbose-asm and d) it ties in nicely with the name of the > option. > > Tested, with no regressions on an x86_64-pc-linux-gnu target, and > built for a variety of other targets. > > OK to apply ? I think individually enabled passes by -On are not really useful information as you generally cannot replace -On by a set of other switches on the command-line. Richard. > Cheers > Nick > > gcc/ChangeLog > 2017-06-08 Nick Clifton > > * varasm.c (dump_enabled_to_asm_out_file): New function. Prints > enabled options to the asm_out_file. > (elf_record_gcc_switches): If verbose-asm is set then also dump > all enabled options to the asm file. > * toplec.c (print_switch_values): Convert from static to global. > * doc/invoke.texi (-fverbose-asm): Mention its effect on the > -frecord-gcc-switches option. > (-frecord-gcc-switches): Refactor the description and add details > of how -fverbose-asm modifies its behaviour. >
Unreviewed patches
The following patches have remained unreviewed for a week or more: [build] Support --sysroot with Solaris ld https://gcc.gnu.org/ml/gcc-patches/2017-05/msg02342.html This needs a build maintainer unless one considers it obvious. Support $SYSROOT for = in -I etc. https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00011.html This needs a cpp or C maintainer. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH, testsuite] Fix no_trampolines test in check_effective_target_trampolines
Hi, Consider check_effective_target_trampolines: ... # Return 1 if according to target_info struct and explicit target list # target is supposed to support trampolines. proc check_effective_target_trampolines { } { if [target_info exists no_trampolines] { return 0 } if { [istarget avr-*-*] || [istarget msp430-*-*] || [istarget hppa2.0w-hp-hpux11.23] || [istarget hppa64-hp-hpux11.23] } { return 0; } return 1 } ... If I disable the nvptx target test in check_effective_target_trampolines and run tests for target nvptx, then the proc returns true instead of false, and I get 'UNSUPPORTED -> FAIL' changes for tests that require the effective target. This is due to the fact that https://github.com/MentorEmbedded/nvptx-tools/blob/master/nvptx-none-run.exp defines 'gcc,no_trampolines': ... set_board_info gcc,no_trampolines 1 ... but check_effective_target_trampolines tests no_trampolines (without the 'gcc,' in front): The effective target trampolines was introduced in 2008, but we've been testing 'gcc,no_trampolines' in gcc_target_compile since at least 1997. [ To complicate matters objc_target_compile tests for 'objc,no_trampolines' to set -DNO_TRAMPOLINES, but AFAICT that macro is not used anywhere in the objc test suites, so I think that's dead code. ] The original submission of the keyword is here ( https://gcc.gnu.org/ml/gcc-patches/2005-04/msg01925.html ) and uses 'target_info exists no_trampolines'. But in a follow-up comment ( https://gcc.gnu.org/ml/gcc-patches/2005-04/msg01978.html ) there is the suggestion to add 'set_board_info gcc,no_trampolines 1' to the board file to trigger the test in check_effective_target_trampolines. So that sounds like the missing 'gcc,' is accidental rather than intentional. In a further follow-up comment ( https://gcc.gnu.org/ml/gcc-patches/2005-04/msg02063.html ) Mike suggest to test for target triplets, which indeed was added in the next version. This probably explains why the missing 'gcc,' wasn't found in any further testing. As things are now, boards for targets that are not listed in check_effective_target_trampolines but still want the no_trampolines behavior need to do: - 'set_board_info gcc,no_trampolines 1' to trigger the test in gcc_target_compile and add -DNO_TRAMPOLINES to the flags - 'set_board_info no_trampolines 1' to trigger the test in check_effective_target_trampolines Given that: - it's better to have to define one variable than two - it looks like an accident that the 'gcc,' was dropped - the one with the 'gcc,' prefix has been around longer, and is mentioned in dejagnu docs I propose to test for 'gcc,no_trampolines' instead of 'no_trampolines' in check_effective_target_trampolines. I don't think a backward compatibility test for 'no_trampolines' is required, because the affected boardfiles most likely already define both. Tested by checking that the patch fixes the 'UNSUPPORTED -> FAIL' regression mentioned above for a single testcase. OK for trunk? Thanks, - Tom Fix no_trampolines test in check_effective_target_trampolines 2017-06-08 Tom de Vries * lib/target-supports.exp (check_effective_target_trampolines): Test for 'gcc,no_trampolines' instead of 'no_trampolines'. --- gcc/testsuite/lib/target-supports.exp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 8b99f35..d0b35be 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -491,7 +491,7 @@ proc check_gc_sections_available { } { # target is supposed to support trampolines. proc check_effective_target_trampolines { } { -if [target_info exists no_trampolines] { +if [target_info exists gcc,no_trampolines] { return 0 } if { [istarget avr-*-*]
Re: C/C++ PATCH to implement -Wmultiline-expansion (PR c/80116)
On Wed, Jun 07, 2017 at 08:02:42PM +0100, Pedro Alves wrote: > Hi Marek, > > Nice warning! Just to confirm, would the patch warn with code like: Thanks. BTW, if you (or anyone) could come up with a better name, I'm all ears. > const char * > target_xfer_status_to_string (enum target_xfer_status status) > { > #define CASE(X) case X: return #X >switch (status) > { >CASE(TARGET_XFER_E_IO); >CASE(TARGET_XFER_UNAVAILABLE); > default: >return ""; > } > #undef CASE > }; > > ? > > I think it shouldn't, but I couldn't tell from the tests, Nope, it doesn't warn (neither C nor C++). I should probably add this test. Marek
Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.
On 2017.01.19 at 18:20 +, Joseph Myers wrote: > On Thu, 19 Jan 2017, Tamar Christina wrote: > > > Hi Joseph, > > > > I made the requested changes and did a quick pass over the rest > > of the fp cases. > > I've no further comments, but watch out for any related test failures > being reported. g++.dg/opt/pr60849.C started ICEing on both X86_64 and ppc64le. -- Markus
Re: [PATCH][GCC][AArch64] optimize float immediate moves (1 /4) - infrastructure.
Tamar Christina writes: > @@ -4613,6 +4615,66 @@ aarch64_legitimize_address_displacement (rtx *disp, > rtx *off, machine_mode mode) >return true; > } > > +/* Return the binary representation of floating point constant VALUE in > INTVAL. > + If the value cannot be converted, return false without setting INTVAL. > + The conversion is done in the given MODE. */ > +bool > +aarch64_reinterpret_float_as_int (rtx value, unsigned HOST_WIDE_INT *intval) > +{ > + machine_mode mode = GET_MODE (value); > + if (GET_CODE (value) != CONST_DOUBLE > + || !SCALAR_FLOAT_MODE_P (mode) > + || GET_MODE_BITSIZE (mode) > HOST_BITS_PER_WIDE_INT) > +return false; > + > + unsigned HOST_WIDE_INT ival = 0; > + > + /* Only support up to DF mode. */ > + gcc_assert (GET_MODE_BITSIZE (mode) <= 64); > + int needed = GET_MODE_BITSIZE (mode) == 64 ? 2 : 1; > + > + long res[2]; > + real_to_target (res, > + CONST_DOUBLE_REAL_VALUE (value), > + REAL_MODE_FORMAT (mode)); > + > + ival = zext_hwi (res[needed - 1], 32); > + for (int i = needed - 2; i >= 0; i--) > +{ > + ival <<= 32; > + ival |= zext_hwi (res[i], 32); > +} > + > + *intval = ival; > + return true; > +} > + > +/* Return TRUE if rtx X is an immediate constant that can be moved in > + created using a single MOV(+MOVK) followed by an FMOV. */ Typo. > +bool > +aarch64_float_const_rtx_p (rtx x) > +{ > + machine_mode mode = GET_MODE (x); > + if (mode == VOIDmode) > +return false; > + > + /* Determine whether it's cheaper to write float constants as > + mov/movk pairs over ldr/adrp pairs. */ > + unsigned HOST_WIDE_INT ival; > + > + if (GET_CODE (x) == CONST_DOUBLE > + && SCALAR_FLOAT_MODE_P (mode) > + && aarch64_reinterpret_float_as_int (x, &ival)) > +{ > + machine_mode imode = mode == HFmode ? SImode : int_mode_for_mode > (mode); > + int num_instr = aarch64_internal_mov_immediate > + (NULL_RTX, GEN_INT (ival), false, imode); Sorry to be a broken record on this, but since you explicitly zero-extend the real_to_target results from 32 bits, this will lead to an invalid 32-bit constant when the top bit of an SImode value is set, e.g. (const_int 0x8000_) instead of (const_int 0x__8000_). Using gen_int_mode would avoid this. In general it's better to use gen_int_mode instead of GEN_INT whenever the mode is known, to avoid this kind of corner case. There's another instance later in the patch. Thanks, Richard
[PATCH, testsuite] Remove NO_TRAMPOLINES
Hi, this patch removes the additional_flags=-DNO_TRAMPOLINES addition, and instead uses the effective target trampolines. Tested on x86_64. Tested on nvptx. OK for trunk? Thanks, - Tom Remove NO_TRAMPOLINES 2017-06-07 Tom de Vries * gcc.c-torture/compile/930506-2.c: Use dg-require-effective-target trampolines instead of NO_TRAMPOLINES. * gcc.c-torture/execute/2822-1.c: Same. * gcc.c-torture/execute/920428-2.c: Same. * gcc.c-torture/execute/920501-7.c: Same. * gcc.c-torture/execute/920612-2.c: Same. * gcc.c-torture/execute/921017-1.c: Same. * gcc.c-torture/execute/921215-1.c: Same. * gcc.c-torture/execute/931002-1.c: Same. * gcc.c-torture/execute/comp-goto-2.c: Same. * gcc.c-torture/execute/nestfunc-1.c: Same. * gcc.c-torture/execute/nestfunc-2.c: Same. * gcc.c-torture/execute/nestfunc-3.c: Same. * gcc.c-torture/execute/nestfunc-5.c: Same. * gcc.c-torture/execute/nestfunc-6.c: Same. * gcc.c-torture/execute/pr24135.c: Same. * gcc.dg/Wtrampolines.c: Same. * gcc.dg/torture/stackalign/comp-goto-1.c: Same. * gcc.dg/torture/stackalign/nested-5.c: Same. * gcc.dg/torture/stackalign/nested-6.c: Same. * gcc.dg/torture/stackalign/non-local-goto-3.c: Same. * gcc.dg/torture/stackalign/non-local-goto-4.c: Same. * gcc.dg/torture/stackalign/non-local-goto-5.c: Same. * gcc.dg/trampoline-1.c: Same. * gcc.dg/tree-prof/pr44777.c: Same. * gcc.target/i386/pr67770.c: Same. * lib/gcc.exp (gcc_target_compile): Remove appending of -DNO_TRAMPOLINES to additional_flags. * lib/objc.exp (objc_target_compile): Same. --- gcc/testsuite/gcc.c-torture/compile/930506-2.c | 6 ++ gcc/testsuite/gcc.c-torture/execute/2822-1.c | 7 +++ gcc/testsuite/gcc.c-torture/execute/920428-2.c | 5 + gcc/testsuite/gcc.c-torture/execute/920501-7.c | 6 ++ gcc/testsuite/gcc.c-torture/execute/920612-2.c | 6 -- gcc/testsuite/gcc.c-torture/execute/921017-1.c | 5 +++-- gcc/testsuite/gcc.c-torture/execute/921215-1.c | 5 +++-- gcc/testsuite/gcc.c-torture/execute/931002-1.c | 5 +++-- gcc/testsuite/gcc.c-torture/execute/comp-goto-2.c | 7 +++ gcc/testsuite/gcc.c-torture/execute/nestfunc-1.c | 4 ++-- gcc/testsuite/gcc.c-torture/execute/nestfunc-2.c | 6 +++--- gcc/testsuite/gcc.c-torture/execute/nestfunc-3.c | 3 +-- gcc/testsuite/gcc.c-torture/execute/nestfunc-5.c | 6 ++ gcc/testsuite/gcc.c-torture/execute/nestfunc-6.c | 8 ++-- gcc/testsuite/gcc.c-torture/execute/pr24135.c | 6 ++ gcc/testsuite/gcc.dg/Wtrampolines.c| 6 +- gcc/testsuite/gcc.dg/torture/stackalign/comp-goto-1.c | 6 ++ gcc/testsuite/gcc.dg/torture/stackalign/nested-5.c | 5 + gcc/testsuite/gcc.dg/torture/stackalign/nested-6.c | 7 +-- gcc/testsuite/gcc.dg/torture/stackalign/non-local-goto-3.c | 5 + gcc/testsuite/gcc.dg/torture/stackalign/non-local-goto-4.c | 6 ++ gcc/testsuite/gcc.dg/torture/stackalign/non-local-goto-5.c | 5 + gcc/testsuite/gcc.dg/trampoline-1.c| 5 - gcc/testsuite/gcc.dg/tree-prof/pr44777.c | 7 +++ gcc/testsuite/gcc.target/i386/pr67770.c| 5 + gcc/testsuite/lib/gcc.exp | 3 --- gcc/testsuite/lib/objc.exp | 3 --- 27 files changed, 49 insertions(+), 99 deletions(-) diff --git a/gcc/testsuite/gcc.c-torture/compile/930506-2.c b/gcc/testsuite/gcc.c-torture/compile/930506-2.c index e11e62f..bc982ac 100644 --- a/gcc/testsuite/gcc.c-torture/compile/930506-2.c +++ b/gcc/testsuite/gcc.c-torture/compile/930506-2.c @@ -1,4 +1,5 @@ -#ifndef NO_TRAMPOLINES +/* { dg-require-effective-target trampolines } */ + int f1() { { int ___() { foo(1); } bar(___); } @@ -10,6 +11,3 @@ int f2(int j) { int ___() { foo(j); } bar(___); } return( { int ___() { foo(j); } bar(___);} ); } -#else -int x; -#endif diff --git a/gcc/testsuite/gcc.c-torture/execute/2822-1.c b/gcc/testsuite/gcc.c-torture/execute/2822-1.c index f4a084b..e99bcc2 100644 --- a/gcc/testsuite/gcc.c-torture/execute/2822-1.c +++ b/gcc/testsuite/gcc.c-torture/execute/2822-1.c @@ -1,4 +1,5 @@ -#ifndef NO_TRAMPOLINES +/* { dg-require-effective-target trampolines } */ + int f0(int (*fn)(int *), int *p) { return (*fn) (p); @@ -16,13 +17,11 @@ int f1(void) return f0(f2, &i); } -#endif int main() { -#ifndef NO_TRAMPOLINES if (f1() != 2) abort (); -#endif + return 0; } diff --git a/gcc/testsuite/gcc.c-torture/execute/920428-2.c b/gcc/testsuite/gcc.c-torture/execute/920428-2.c index 99a3925..c6a5bda 100644 --- a/gcc/testsuite/gcc.c-torture/execute/920428-2.c +++ b/gcc/testsuite/gcc.c-torture/execute/920428-2.c @@ -1,9 +1,6 @@ /* { dg-require-effective-target label_values } */ +/* { dg-require-effective-target trampolines } */ -#if !d
Re: C/C++ PATCH to implement -Wmultiline-expansion (PR c/80116)
On 06/08/2017 11:29 AM, Marek Polacek wrote: > On Wed, Jun 07, 2017 at 08:02:42PM +0100, Pedro Alves wrote: >> Hi Marek, >> >> Nice warning! Just to confirm, would the patch warn with code like: > > Thanks. BTW, if you (or anyone) could come up with a better name, > I'm all ears. AFAICS, the warning's intent is catching the case of a a macro expanding to multiple (top level) statements, not lines. Both the comments in the code and the description of the warning talk in those terms: +/* () This warning warns about + cases when a macro expands to multiple statements not wrapped in + do {} while (0) or ({ }) and is used as a body of if/else/for/while + conditionals. For example, +Wmultiline-expansion +C ObjC C++ ObjC++ Var(warn_multiline_expansion) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) +Warn about macros expanding to multiple statements in a body of a conditional such as if, else, while, or for. So it'd seem clearer to me if the warning was named around "-Wmulti-statement-something" instead of "-Wmultiline-something"? -Wmulti-statement-expansion -Wmulti-statement-macros -Wmulti-statement-macro -Wmulti-statement-macro-expansion Particularly when one could argue that "multiline expansion" in context of macros doesn't make any sense, given macros always expand to a single line: #define SAME_LINE \ (__LINE__ \ == __LINE__) static_assert (SAME_LINE, ""); > Nope, it doesn't warn (neither C nor C++). I should probably add this test. Thanks for confirming. A test would be nice, to make sure we don't regress. Thanks, Pedro Alves
Re: C/C++ PATCH to implement -Wmultiline-expansion (PR c/80116)
On Thu, Jun 08, 2017 at 12:01:03PM +0100, Pedro Alves wrote: > On 06/08/2017 11:29 AM, Marek Polacek wrote: > > On Wed, Jun 07, 2017 at 08:02:42PM +0100, Pedro Alves wrote: > >> Hi Marek, > >> > >> Nice warning! Just to confirm, would the patch warn with code like: > > > > Thanks. BTW, if you (or anyone) could come up with a better name, > > I'm all ears. > > AFAICS, the warning's intent is catching the case of a > a macro expanding to multiple (top level) statements, not lines. True. I felt that it was implicitly understood what's meant by that, but I'll change that. Martin pointed this out, too. > Both the comments in the code and the description of the > warning talk in those terms: > > +/* () This warning warns about > + cases when a macro expands to multiple statements not wrapped in > + do {} while (0) or ({ }) and is used as a body of if/else/for/while > + conditionals. For example, > > +Wmultiline-expansion > +C ObjC C++ ObjC++ Var(warn_multiline_expansion) Warning LangEnabledBy(C > ObjC C++ ObjC++,Wall) > +Warn about macros expanding to multiple statements in a body of a > conditional such as if, else, while, or for. > > So it'd seem clearer to me if the warning was named around > "-Wmulti-statement-something" instead of "-Wmultiline-something"? > > -Wmulti-statement-expansion > -Wmulti-statement-macros > -Wmulti-statement-macro > -Wmulti-statement-macro-expansion I think I'll go with -Wmultistatement-expansion (without the dash). > Particularly when one could argue that "multiline expansion" in > context of macros doesn't make any sense, given macros always > expand to a single line: > > #define SAME_LINE\ > (__LINE__ \ >== __LINE__) > > static_assert (SAME_LINE, ""); Sure. > > Nope, it doesn't warn (neither C nor C++). I should probably add this test. > > Thanks for confirming. A test would be nice, to make sure we > don't regress. I'll post a new version with the warning renamed and the new test added. Thanks, Marek
Re: [PATCH][x86][PR73350][PR80862]
Hello Julia, On 05 Jun 10:13, Koval, Julia wrote: > Hi, > > 1 is replace 8 spaces with tab suggested by ./check_GNU_style.sh, should I > still fix it back? > 2,3,4 Done Thanks a lot! Your patch is OK for trunk. I've checked it in for you (r249009.). > CSE is working, spec 2k6 on skylake-avx512 has same score. Great. -- Thanks, K
Re: C/C++ PATCH to implement -Wmultiline-expansion (PR c/80116)
On 06/08/2017 12:19 PM, Marek Polacek wrote: > On Thu, Jun 08, 2017 at 12:01:03PM +0100, Pedro Alves wrote: >> On 06/08/2017 11:29 AM, Marek Polacek wrote: >>> On Wed, Jun 07, 2017 at 08:02:42PM +0100, Pedro Alves wrote: Hi Marek, Nice warning! Just to confirm, would the patch warn with code like: >>> >>> Thanks. BTW, if you (or anyone) could come up with a better name, >>> I'm all ears. >> >> AFAICS, the warning's intent is catching the case of a >> a macro expanding to multiple (top level) statements, not lines. > > True. I felt that it was implicitly understood what's meant by that, > but I'll change that. Martin pointed this out, too. I don't think it's implicit, because it could raise questions, like: >> +Wmultiline-expansion >> +C ObjC C++ ObjC++ Var(warn_multiline_expansion) Warning LangEnabledBy(C >> ObjC C++ ObjC++,Wall) >> +Warn about macros expanding to multiple statements in a body of a >> conditional such as if, else, while, or for. " Hmm, the description doesn't actually describe what is meant by "multiple lines". Does "multiline" here mean that the warning triggers with: #define FOO() \ foo1(); \ foo2() but not #define FOO() foo1(); foo2() ? " It's perhaps obvious to seasoned hackers that that's not the intention, but not all users are experts. So a general rule I try to follow (in GDB) is: make sure that the description of an option matches the option's name. I.e., if the words used to name the option name don't appear in the option's description, then that's a good indication something is off and is bound to confuse someone. >> >> So it'd seem clearer to me if the warning was named around >> "-Wmulti-statement-something" instead of "-Wmultiline-something"? >> >> -Wmulti-statement-expansion >> -Wmulti-statement-macros >> -Wmulti-statement-macro >> -Wmulti-statement-macro-expansion > > I think I'll go with -Wmultistatement-expansion (without the dash). Fine with me, FWIW. > I'll post a new version with the warning renamed and the new test added. Great, thanks much! Thanks, Pedro Alves
Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.
On 8 June 2017 at 12:30, Markus Trippelsdorf wrote: > On 2017.01.19 at 18:20 +, Joseph Myers wrote: >> On Thu, 19 Jan 2017, Tamar Christina wrote: >> >> > Hi Joseph, >> > >> > I made the requested changes and did a quick pass over the rest >> > of the fp cases. >> >> I've no further comments, but watch out for any related test failures >> being reported. > > g++.dg/opt/pr60849.C started ICEing on both X86_64 and ppc64le. > Same on arm/aarch64, but there are also other regressions on big-endian configs: See http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/249005/report-build-info.html > -- > Markus
Re: [PATCH 1/4 v3][PR 67328] Generate bittests in range checks if possible
On Tue, May 30, 2017 at 7:25 AM, Richard Sandiford wrote: > Yuri Gribov writes: >> Added special case to build_range_check. Fixed couple of existing >> tests where it changed codegen. >> >> -I >> >> From b7819f341e2ffa0437be497024f61d0a4e1be588 Mon Sep 17 00:00:00 2001 >> From: Yury Gribov >> Date: Fri, 26 May 2017 07:49:46 +0100 >> Subject: [PATCH 1/4] Generate bittests in range checks if possible. >> >> gcc/testsuite/ >> 2017-05-26 Yury Gribov >> >> * c-c++-common/fold-masked-cmp-1.c: New test. >> * c-c++-common/fold-masked-cmp-2.c: New test. >> * gcc.dg/pr46309.c: Fix pattern. >> * gcc.dg/pr46309-2.c: Fix pattern. >> >> gcc/ >> 2017-05-26 Yury Gribov >> >> * fold-const.c (maskable_range_p): New function. >> (build_range_check): Generate bittests if possible. >> --- >> gcc/fold-const.c | 41 >> - >> gcc/testsuite/c-c++-common/fold-masked-cmp-1.c | 41 >> + >> gcc/testsuite/c-c++-common/fold-masked-cmp-2.c | 42 >> ++ >> gcc/testsuite/gcc.dg/pr46309-2.c | 2 +- >> gcc/testsuite/gcc.dg/pr46309.c | 2 +- >> 5 files changed, 125 insertions(+), 3 deletions(-) >> create mode 100644 gcc/testsuite/c-c++-common/fold-masked-cmp-1.c >> create mode 100644 gcc/testsuite/c-c++-common/fold-masked-cmp-2.c >> >> diff --git a/gcc/fold-const.c b/gcc/fold-const.c >> index efc0b10..c334dc6 100644 >> --- a/gcc/fold-const.c >> +++ b/gcc/fold-const.c >> @@ -4745,6 +4745,38 @@ make_range (tree exp, int *pin_p, tree *plow, tree >> *phigh, >>*pin_p = in_p, *plow = low, *phigh = high; >>return exp; >> } >> + >> +/* Returns TRUE is [LOW, HIGH] range check can be optimized to > > s/is/if/ > >> + a bitwise check i.e. when >> + LOW == 0xXX...X00...0 >> + HIGH == 0xXX...X11...1 >> + Return corresponding mask in MASK and stem in VALUE. */ >> + >> +static bool >> +maskable_range_p (const_tree low, const_tree high, tree type, tree *mask, >> tree *value) >> +{ >> + if (TREE_CODE (low) != INTEGER_CST >> + || TREE_CODE (high) != INTEGER_CST) >> +return false; >> + >> + widest_int lo = wi::to_widest (low), >> +hi = wi::to_widest (high); > > It shouldn't be necessary to go to widest here, since AIUI both > values have the same type. Done. >> + widest_int end_mask = lo ^ hi, >> +stem_mask = ~end_mask; >> + if ((end_mask & (end_mask + 1)) != 0 >> + || (lo & end_mask) != 0) >> +return false; >> + >> + widest_int stem = lo & stem_mask; >> + if (stem != (hi & stem_mask)) >> +return false; > > FWIW, I think this is equivalent to: > > if (lo + (lo & -lo) != hi + 1) > return false; > > but I guess it's a matter of taste whether that's clearer or not. The change is cure but frankly I'd rather keep original version for the sake of readability... >> + *mask = wide_int_to_tree (type, stem_mask); >> + *value = wide_int_to_tree (type, stem); >> + >> + return true; I've attached an updated patch (rebased and retested on x64). Ok to commit? -Yury 0001-Generate-bittests-in-range-checks-if-possible.patch Description: Binary data
Re: [PATCH 4/4 v3][PR 67328] Optimize some masked comparisons to efficient bittest
On Wed, May 31, 2017 at 12:19 PM, Richard Biener wrote: > On Mon, 29 May 2017, Yuri Gribov wrote: > >> This no longer fixes the PR but still works in some cases as >> demonstrated by the test. So I decided to keep it. > > As Richard noticed you don't need widest_ints but can use wide_ints. > Please use == 0 instead of ! on wide-ints as well. > > +(for cmp (le gt) > + (simplify > .. > + (switch > + (if (cmp == LE_EXPR) > + (eq:type (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) { > build_zero_cst (ty); })) > + (if (cmp == GT_EXPR) > + (ne:type (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) { > build_zero_cst (ty); }) > > long lines plus you can simplify this with using > > (for cmp (le gt) > eqcmp (eq ne) > ... > > (eqcmp (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) > {build_zero_cst (ty); } > > no need to spell out :type on the result as well. Hi Richard, I fixed the issues (attached), rebased and retested on x64. Ok to commit? -Yury 0003-Optimize-some-masked-comparisons-to-efficient-bittes.patch Description: Binary data
RE: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.
Thanks, I'm looking at the failure. My final validate seems to have only run the GCC tests. > -Original Message- > From: Christophe Lyon [mailto:christophe.l...@linaro.org] > Sent: 08 June 2017 13:00 > To: Markus Trippelsdorf > Cc: Joseph Myers; Tamar Christina; Jeff Law; GCC Patches; Wilco Dijkstra; > rguent...@suse.de; Michael Meissner; nd > Subject: Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like > numbers in GIMPLE. > > On 8 June 2017 at 12:30, Markus Trippelsdorf > wrote: > > On 2017.01.19 at 18:20 +, Joseph Myers wrote: > >> On Thu, 19 Jan 2017, Tamar Christina wrote: > >> > >> > Hi Joseph, > >> > > >> > I made the requested changes and did a quick pass over the rest of > >> > the fp cases. > >> > >> I've no further comments, but watch out for any related test failures > >> being reported. > > > > g++.dg/opt/pr60849.C started ICEing on both X86_64 and ppc64le. > > > > Same on arm/aarch64, but there are also other regressions on big-endian > configs: > See http://people.linaro.org/~christophe.lyon/cross- > validation/gcc/trunk/249005/report-build-info.html > > > > -- > > Markus
[PATCH] Fix failing ubsan/pr80932.c test
This test was failing e.g. on x86_64 with -m32. Instead of limiting to arches where long is 64-bit, I converted the test to use ints, which still demonstrates the same bug in folding, but should now pass everywhere. Tested on x86_64-linux, applying to trunk. 2017-06-08 Marek Polacek PR sanitize/80932 * c-c++-common/ubsan/pr80932.c: Test with ints, not with long ints. diff --git gcc/testsuite/c-c++-common/ubsan/pr80932.c gcc/testsuite/c-c++-common/ubsan/pr80932.c index a833712..92903f7 100644 --- gcc/testsuite/c-c++-common/ubsan/pr80932.c +++ gcc/testsuite/c-c++-common/ubsan/pr80932.c @@ -4,10 +4,10 @@ int x = 1; -long int +int foo (void) { - return ((long) (13801962912760474560ULL * x) - (long) (15334142073106273231ULL * x)) * -6; + return ((int) (2855545792U * x) - (int) (3269399503U * x)) * -5; } int Marek
Re: [PATCH 1/3] Come up with selftests for predict.c.
On 06/06/2017 12:44 PM, David Malcolm wrote: > On Tue, 2017-06-06 at 10:55 +0200, marxin wrote: > > Some minor nits below. > >> gcc/ChangeLog: >> >> 2017-05-26 Martin Liska >> >> * predict.c (struct branch_predictor): New struct. >> (test_prediction_value_range): New test. >> (predict_tests_c_tests): New function. >> * selftest-run-tests.c (selftest::run_tests): Run the function. >> * selftest.h: Declare new tests. >> --- >> gcc/predict.c| 39 >> +++ >> gcc/selftest-run-tests.c | 1 + >> gcc/selftest.h | 1 + >> 3 files changed, 41 insertions(+) >> >> diff --git a/gcc/predict.c b/gcc/predict.c >> index ac35fa41129..ea445e94e46 100644 >> --- a/gcc/predict.c >> +++ b/gcc/predict.c >> @@ -57,6 +57,7 @@ along with GCC; see the file COPYING3. If not see >> #include "tree-scalar-evolution.h" >> #include "ipa-utils.h" >> #include "gimple-pretty-print.h" >> +#include "selftest.h" >> >> /* Enum with reasons why a predictor is ignored. */ >> >> @@ -3803,3 +3804,41 @@ force_edge_cold (edge e, bool impossible) >> impossible ? "impossible" : "cold"); >> } >> } >> + >> +#if CHECKING_P >> + >> +namespace selftest { >> + >> +/* Test that value range of predictor values defined in predict.def >> is >> + within range [51, 100]. */ >> + >> +struct branch_predictor >> +{ >> + const char *name; >> + unsigned probability; >> +}; >> + >> +#define DEF_PREDICTOR(ENUM, NAME, HITRATE, FLAGS) { NAME, HITRATE }, >> + >> +static void >> +test_prediction_value_range () >> +{ >> + branch_predictor predictors[] = { >> +#include "predict.def" >> +{NULL, -1} >> + }; >> + >> + for (unsigned i = 0; predictors[i].name != NULL; i++) >> +ASSERT_TRUE (100 * predictors[i].probability / REG_BR_PROB_BASE >>> 50); > > Minor nit: should there be a #undef DEF_PREDICTOR after the #include? Hello. Yes, that should be undefined. > > Minor nits: the comment says it verifies that things are in the range > 51, 100. Should it be >= 51 rather than > 50? Should there be a test > that it's <= 100? (I'm not familiar with the predict code, I just > noticed this when reading the comment). Should be in range (50,100], fixed accordingly. > >> +} >> + >> +/* Run all of the selfests within this file. */ >> + >> +void >> +predict_tests_c_tests () >> +{ > > Minor nit: to follow the pattern used in the other selftests, we've > been naming these "run all selftests within this file" functions after > the filename. Given this is in predict.c, this should be > "predict_c_tests ()" to follow the pattern. I followed your naming scheme. Thanks for review, Martin > >> + test_prediction_value_range (); >> +} >> + >> +} // namespace selftest >> +#endif /* CHECKING_P. */ >> diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c >> index f62bc72b072..af511a47681 100644 >> --- a/gcc/selftest-run-tests.c >> +++ b/gcc/selftest-run-tests.c >> @@ -92,6 +92,7 @@ selftest::run_tests () >> targetm.run_target_selftests (); >> >>store_merging_c_tests (); >> + predict_tests_c_tests (); > > ...and here. > > >>/* Run any lang-specific selftests. */ >>lang_hooks.run_lang_selftests (); >> diff --git a/gcc/selftest.h b/gcc/selftest.h >> index dad53e9fe09..e84b309359d 100644 >> --- a/gcc/selftest.h >> +++ b/gcc/selftest.h >> @@ -196,6 +196,7 @@ extern void tree_c_tests (); >> extern void tree_cfg_c_tests (); >> extern void vec_c_tests (); >> extern void wide_int_cc_tests (); >> +extern void predict_tests_c_tests (); > > (etc) > >> extern int num_passes; > > Thanks; hope this is constructive. > Dave > >From 05d47c50fbc72fcf4f91e1cd923bf9453e13eddd Mon Sep 17 00:00:00 2001 From: marxin Date: Tue, 6 Jun 2017 10:55:10 +0200 Subject: [PATCH] Come up with selftests for predict.c. gcc/ChangeLog: 2017-05-26 Martin Liska * predict.c (struct branch_predictor): New struct. (test_prediction_value_range): New test. (predict_c_tests): New function. * selftest-run-tests.c (selftest::run_tests): Run the function. * selftest.h: Declare new tests. --- gcc/predict.c| 44 gcc/selftest-run-tests.c | 1 + gcc/selftest.h | 1 + 3 files changed, 46 insertions(+) diff --git a/gcc/predict.c b/gcc/predict.c index ac35fa41129..5685012d491 100644 --- a/gcc/predict.c +++ b/gcc/predict.c @@ -57,6 +57,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-scalar-evolution.h" #include "ipa-utils.h" #include "gimple-pretty-print.h" +#include "selftest.h" /* Enum with reasons why a predictor is ignored. */ @@ -3803,3 +3804,46 @@ force_edge_cold (edge e, bool impossible) impossible ? "impossible" : "cold"); } } + +#if CHECKING_P + +namespace selftest { + +/* Test that value range of predictor values defined in predict.def is + within range (50, 100]. */ + +struct branch_predictor +{ + const char *name; + unsigned prob
Statically propagate basic blocks which are likely executed 0 times
Hi, this patch adds static code to detect basic block with 0 execution count. Those are basic block either reached only by EH or those which leads to call of function decorated with cold attribute. Function decorated by noreturn is not sufficient, because exit(0) is a call that is executed in most of program executions. Note that internally we have cold and unlikely functions where the first is optimized for size but the second is known to be unlikely executed by the program and is offloaded to special unlikely section. Perhaps we want to expose this to user and also distinguish between cold and unlikely function attributes. I guess this however can be done incrementally. As a followup I will decoreate trap/abort and friends with cold attributes. Bootstrapped/regtested x86_64-linux, will commit it shortly. Honza * predict.c (maybe_hot_bb_p): Do not check profile status. (maybe_hot_edge_p): Likewise. (probably_never_executed): Check for zero counts even if profile is not read. (unlikely_executed_edge_p): New function. (unlikely_executed_stmt_p): New function. (unlikely_executed_bb_p): New function. (set_even_probabilities): Use unlikely predicates. (combine_predictions_for_bb): Likewise. (predict_paths_for_bb): Likewise. (predict_paths_leading_to_edge): Likewise. (determine_unlikely_bbs): New function. (estimate_bb_frequencies): Use it. (compute_function_frequency): Use zero counts even if profile is not read. * profile-count.h: Fix typo. * g++.dg/tree-ssa/counts-1.C: New testcase. * gcc.dg/tree-ssa/counts-1.c: New testcase. Index: predict.c === --- predict.c (revision 249009) +++ predict.c (working copy) @@ -189,8 +189,8 @@ bool maybe_hot_bb_p (struct function *fun, const_basic_block bb) { gcc_checking_assert (fun); - if (profile_status_for_fn (fun) == PROFILE_READ) -return maybe_hot_count_p (fun, bb->count); + if (!maybe_hot_count_p (fun, bb->count)) +return false; return maybe_hot_frequency_p (fun, bb->frequency); } @@ -200,8 +200,8 @@ maybe_hot_bb_p (struct function *fun, co bool maybe_hot_edge_p (edge e) { - if (profile_status_for_fn (cfun) == PROFILE_READ) -return maybe_hot_count_p (cfun, e->count); + if (!maybe_hot_count_p (cfun, e->count)) +return false; return maybe_hot_frequency_p (cfun, EDGE_FREQUENCY (e)); } @@ -213,11 +213,10 @@ probably_never_executed (struct function profile_count count, int) { gcc_checking_assert (fun); + if (count == profile_count::zero ()) +return true; if (count.initialized_p () && profile_status_for_fn (fun) == PROFILE_READ) { - if (count == profile_count::zero ()) - return true; - int unlikely_count_fraction = PARAM_VALUE (UNLIKELY_BB_COUNT_FRACTION); if (count.apply_scale (unlikely_count_fraction, 1) >= profile_info->runs) return false; @@ -763,6 +762,69 @@ dump_prediction (FILE *file, enum br_pre fprintf (file, "\n"); } +/* Return true if E is unlikely executed. */ + +static bool +unlikely_executed_edge_p (edge e) +{ + return e->count == profile_count::zero () +|| (e->flags & (EDGE_EH | EDGE_FAKE)); +} + +/* Return true if STMT is known to be unlikely executed. */ + +static bool +unlikely_executed_stmt_p (gimple *stmt) +{ + if (!is_gimple_call (stmt)) +return false; + /* NORETURN attribute enough is not strong enough: exit() may be quite + likely executed once during program run. */ + if (gimple_call_fntype (stmt) + && lookup_attribute ("cold", + TYPE_ATTRIBUTES (gimple_call_fntype (stmt))) + && !lookup_attribute ("cold", DECL_ATTRIBUTES (current_function_decl))) +return true; + tree decl = gimple_call_fndecl (stmt); + if (!decl) +return false; + if (lookup_attribute ("cold", DECL_ATTRIBUTES (decl)) + && !lookup_attribute ("cold", DECL_ATTRIBUTES (current_function_decl))) +return true; + + cgraph_node *n = cgraph_node::get (decl); + if (!n) +return false; + enum availability avail; + n = n->ultimate_alias_target (&avail); + if (avail < AVAIL_AVAILABLE) +return NULL; + if (!n->analyzed + || n->decl == current_function_decl) +return false; + return n->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED; +} + +/* Return true if BB is unlikely executed. */ + +static bool +unlikely_executed_bb_p (basic_block bb) +{ + if (bb->count == profile_count::zero ()) +return true; + if (bb == ENTRY_BLOCK_PTR_FOR_FN (cfun) || bb == EXIT_BLOCK_PTR_FOR_FN (cfun)) +return false; + for (gimple_stmt_iterator gsi = gsi_start_bb (bb); + !gsi_end_p (gsi); gsi_next (&gsi)) +{ + if (unlikely_executed_stmt_p (gsi_stmt (gsi))) +return true; + if (stmt_can_terminate_bb_p (gsi_stmt (gsi))) + return false; +} + return
Re: [PATCH] Add mov[us]wb store intrinsics
Hello Julia, On 08 Jun 07:16, Koval, Julia wrote: > Hi, > These patch adds these 9 new intrinsics. Ok for trunk? Your patch is OK for trunk. I've checked it in for you. -- Thanks, K PS: Could you pls add [i386] or [x86] mark to the mail title?
Re: [PATCH 1/3] Come up with selftests for predict.c.
On Thu, 2017-06-08 at 14:30 +0200, Martin Liška wrote: > On 06/06/2017 12:44 PM, David Malcolm wrote: > > On Tue, 2017-06-06 at 10:55 +0200, marxin wrote: > > > > Some minor nits below. > > > > > gcc/ChangeLog: > > > > > > 2017-05-26 Martin Liska > > > > > > * predict.c (struct branch_predictor): New struct. > > > (test_prediction_value_range): New test. > > > (predict_tests_c_tests): New function. > > > * selftest-run-tests.c (selftest::run_tests): Run the function. > > > * selftest.h: Declare new tests. > > > --- > > > gcc/predict.c| 39 > > > +++ > > > gcc/selftest-run-tests.c | 1 + > > > gcc/selftest.h | 1 + > > > 3 files changed, 41 insertions(+) > > > > > > diff --git a/gcc/predict.c b/gcc/predict.c > > > index ac35fa41129..ea445e94e46 100644 > > > --- a/gcc/predict.c > > > +++ b/gcc/predict.c > > > @@ -57,6 +57,7 @@ along with GCC; see the file COPYING3. If not > > > see > > > #include "tree-scalar-evolution.h" > > > #include "ipa-utils.h" > > > #include "gimple-pretty-print.h" > > > +#include "selftest.h" > > > > > > /* Enum with reasons why a predictor is ignored. */ > > > > > > @@ -3803,3 +3804,41 @@ force_edge_cold (edge e, bool impossible) > > >impossible ? "impossible" : "cold"); > > > } > > > } > > > + > > > +#if CHECKING_P > > > + > > > +namespace selftest { > > > + > > > +/* Test that value range of predictor values defined in > > > predict.def > > > is > > > + within range [51, 100]. */ > > > + > > > +struct branch_predictor > > > +{ > > > + const char *name; > > > + unsigned probability; > > > +}; > > > + > > > +#define DEF_PREDICTOR(ENUM, NAME, HITRATE, FLAGS) { NAME, > > > HITRATE }, > > > + > > > +static void > > > +test_prediction_value_range () > > > +{ > > > + branch_predictor predictors[] = { > > > +#include "predict.def" > > > +{NULL, -1} > > > + }; > > > + > > > + for (unsigned i = 0; predictors[i].name != NULL; i++) > > > +ASSERT_TRUE (100 * predictors[i].probability / > > > REG_BR_PROB_BASE > > > > 50); > > > > Minor nit: should there be a #undef DEF_PREDICTOR after the > > #include? > > Hello. > > Yes, that should be undefined. > > > > > Minor nits: the comment says it verifies that things are in the > > range > > 51, 100. Should it be >= 51 rather than > 50? Should there be a > > test > > that it's <= 100? (I'm not familiar with the predict code, I just > > noticed this when reading the comment). > > Should be in range (50,100], fixed accordingly. > > > > > > +} > > > + > > > +/* Run all of the selfests within this file. */ > > > + > > > +void > > > +predict_tests_c_tests () > > > +{ > > > > Minor nit: to follow the pattern used in the other selftests, we've > > been naming these "run all selftests within this file" functions > > after > > the filename. Given this is in predict.c, this should be > > "predict_c_tests ()" to follow the pattern. > > I followed your naming scheme. > > Thanks for review, > Martin Thanks for fixing these things; looks much better. Note that I'm not technically a "reviewer" or "maintainer" for this part of the compiler, so I can't formally approve the patch (I was just pointing out some things I noticed). Dave
Re: Containers default initialization
On 07/06/17 22:44 +0200, François Dumont wrote: On 05/06/2017 13:31, Jonathan Wakely wrote: On 04/06/17 22:26 +0200, François Dumont wrote: Hi I have eventually adapt the test to all containers and the result is successful for map/set/unordered_map/unordered_set. It is failing for deque/list/forward_list/vector/vector. I even try to change the test to look at the difference between an explicit call to the default constructor done through the placement new call and an implicit call done on normal declaration. I wondered if we would have the same kind of difference we have between a int i; and a int i(); I tried to set the stack to ~0 before declaring the instance. I know there is no guarantee on the content of the stack for the following declaration but do you think it is reliable enough to commit it ? Ok to commit the successful tests ? No, I'm seeing failures for some of these if I add // { dg-options "-O0" } Franckly I don't understand the result of those tests. I would have expect map/set to fail and others to succeed. We might need help from compiler guys, no ? I think your tests are just insufficient. With optimisation enabled (the testsuite uses -O2 by default) the compiler can remove the memset just before the __aligned_buffer goes out of scope, because it is unobservable in a correct program. This is similar to the situation described at https://gcc.gnu.org/gcc-6/porting_to.html#flifetime-dse If I change the placement new-expressions to use default-init instead of value-init and use -O0 then I see all four tests FAIL here: test_type *tmp = ::new(buf._M_addr()) test_type; // not test_type() Ok, I didn't know we could do this. So I have added value_init.cc tests showing the problem for all containers. The patch also contains the fix for rb_tree so that map/set are now successful. Looks there is really a misunderstanding on what the compiler is doing. If container calls _Node_allocator() it will be transform by compiler into default initialization if container default container is being called or into value init if container is value initialized Oh dear, we have a compiler bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65816 which takes place only if I call this: test_type *tmp = ::new(buf._M_addr()) test_type {}; To force default/value init looks like gcc forces you to explicitly build an allocator instance like in the attached patch. I'm getting uncomfortable with all this churn just to be able to use =default, but requiring more new base classes and #if blocks. We have one known regression already, which needs your patch to fix it. (Admittedly, it's not a very serious regression, because I think real world allocators that are stateful but don't have user-provided default constructors are probably very very unlikely). I'm leaning towards just make the exception specifications correct using _GLIBCXX_NOEXCEPT_IF and not doing any refactoring. That also has the benefit that the code might help guide the future direction of the standard, by telling us what conditions could be used if we add noexcept to these constructors in the standard. Index: include/bits/stl_tree.h === --- include/bits/stl_tree.h (revision 248855) +++ include/bits/stl_tree.h (working copy) @@ -687,9 +687,17 @@ #if __cplusplus < 201103L _Rb_tree_impl() + : _Node_allocator() { } #else - _Rb_tree_impl() = default; + _Rb_tree_impl() + noexcept( + noexcept(_Node_allocator()) && noexcept(_Base_key_compare()) ) + : _Rb_tree_impl(_Key_compare(), _Node_allocator()) If we didn't have the PR65816 bug this should be simply: _Rb_tree_impl() _GLIBCXX_NOEXCEPT_IF( is_nothrow_default_constructible<_Key_compare>::value && is_nothrow_default_constructible<_Node_allocator>::value ) : _Node_allocator() { } i.e. the same for C++98 and other modes. Because of the compiler bug that doesn't work. I think we should fix the compiler. + { } +#endif + +#if __cplusplus >= 201103L There's no need for an #endif/#if here, the previous block is already C++11-and-later code. _Rb_tree_impl(_Rb_tree_impl&&) = default; #endif Index: testsuite/23_containers/map/allocator/default_init.cc === --- testsuite/23_containers/map/allocator/default_init.cc (nonexistent) +++ testsuite/23_containers/map/allocator/default_init.cc (working copy) @@ -0,0 +1,52 @@ +// Copyright (C) 2017 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any la
Re: [PATCH] Fix PR81007
On Thu, 8 Jun 2017, Richard Biener wrote: > > Folding during gimplification can invoke the devirt machinery which > doesn't deal with errorneous state. Thus avoid ICEing by not folding > from gimplification in case we've seen errors. > > Similarly do not build cgraph edges in those cases as that invokes > the devirt machinery as well (we stop compilation after lowering anyway > in case errors were reported). > > The patch also fixes ordering of passes. > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. Bah. The cgraphbuild.c hunks cause FAIL: g++.dg/gomp/linear-2.C -std=gnu++11 (test for errors, line 111) FAIL: g++.dg/gomp/linear-2.C -std=gnu++11 (test for errors, line 111) FAIL: g++.dg/gomp/linear-2.C -std=gnu++11 (test for errors, line 63) FAIL: g++.dg/gomp/linear-2.C -std=gnu++11 (test for errors, line 82) FAIL: g++.dg/gomp/linear-2.C -std=gnu++11 (test for errors, line 90) FAIL: g++.dg/gomp/linear-2.C -std=gnu++11 (test for errors, line 90) ... looks like the code processing queued cgraph nodes enqueues further nodes by looking at callees and thus omp lowering doesn't register split out functions with the cgraph? No time to dig in right now. But relying on the folding machinery not to ICE looks fragile to me. Eventually, given the ICE is reached by cgraph edge building, we have to plug that hole anyways. Richard. > Richard. > > 2017-06-08 Richard Biener > > PR middle-end/81007 > * gimplify.c (maybe_fold_stmt): Do not fold after errors. > * cgraphbuild.c: Include diagnostic-core.h. > (pass_build_cgraph_edges::gate): Add, do not run after errors. > * passes.def (all_lowering_passes): Run pass_build_cgraph_edges > last again. > > * g++.dg/pr81007.C: New testcase. > > Index: gcc/gimplify.c > === > --- gcc/gimplify.c(revision 249003) > +++ gcc/gimplify.c(working copy) > @@ -3067,6 +3067,10 @@ gimplify_arg (tree *arg_p, gimple_seq *p > static bool > maybe_fold_stmt (gimple_stmt_iterator *gsi) > { > + /* Do not fold if we may have invalid IL somewhere. */ > + if (seen_error ()) > +return false; > + >struct gimplify_omp_ctx *ctx; >for (ctx = gimplify_omp_ctxp; ctx; ctx = ctx->outer_context) > if ((ctx->region_type & (ORT_TARGET | ORT_PARALLEL | ORT_TASK)) != 0) > Index: gcc/cgraphbuild.c > === > --- gcc/cgraphbuild.c (revision 249003) > +++ gcc/cgraphbuild.c (working copy) > @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3. > #include "gimple-walk.h" > #include "ipa-utils.h" > #include "except.h" > +#include "diagnostic-core.h" > > /* Context of record_reference. */ > struct record_reference_ctx > @@ -305,6 +306,7 @@ public: >/* opt_pass methods: */ >virtual unsigned int execute (function *); > > + virtual bool gate (function *) { return ! seen_error (); } > }; // class pass_build_cgraph_edges > > unsigned int > Index: gcc/passes.def > === > --- gcc/passes.def(revision 249003) > +++ gcc/passes.def(working copy) > @@ -42,9 +42,9 @@ along with GCC; see the file COPYING3. >NEXT_PASS (pass_build_cfg); >NEXT_PASS (pass_warn_function_return); >NEXT_PASS (pass_expand_omp); > - NEXT_PASS (pass_build_cgraph_edges); >NEXT_PASS (pass_sprintf_length, false); >NEXT_PASS (pass_walloca, /*strict_mode_p=*/true); > + NEXT_PASS (pass_build_cgraph_edges); >TERMINATE_PASS_LIST (all_lowering_passes) > >/* Interprocedural optimization passes. */ > Index: gcc/testsuite/g++.dg/pr81007.C > === > --- gcc/testsuite/g++.dg/pr81007.C(nonexistent) > +++ gcc/testsuite/g++.dg/pr81007.C(working copy) > @@ -0,0 +1,14 @@ > +// { dg-do compile } > + > +struct A > +{ > + A p; // { dg-error "incomplete" } > + virtual void foo(); > +}; > + > +struct B : A {}; > + > +void bar(B& b) > +{ > + b.foo(); > +} > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [PATCH v2] Implement no_sanitize function attribute
On 06/02/2017 12:40 PM, Richard Biener wrote: > On Wed, May 31, 2017 at 4:13 PM, Martin Liška wrote: >> On 05/31/2017 03:31 PM, Richard Biener wrote: >>> On Wed, May 31, 2017 at 2:28 PM, Martin Liška wrote: On 05/31/2017 02:04 PM, Richard Biener wrote: > On Wed, May 31, 2017 at 1:51 PM, Jakub Jelinek wrote: >> On Wed, May 31, 2017 at 01:46:00PM +0200, Richard Biener wrote: >>> Just wanting to add that "ab-"using options/variables to implement >>> what are really >>> function attributes doesn't look very clean. Unless the plan is to get >>> rid of >>> function attributes in favor of per-function options. >> >> Function attribute here is one thing (the way user writes it) and that >> combined with the command line options determines the sanitization >> performed >> (the function attributes only say what sanitization flags should be >> ignored). The proposed per-function variable is just a cache of this >> information, because parsing function attributes every time is way too >> expensive. > > True, but isn't that just an excuse to not improve attribute list > representation? > > Ideally we'd have sth like attributes.def and a sorted vector of > integer id, args > pairs. Using a sorted vector of the existing stuff (compared to the tree > list) > might also help. Then it would be tree-wise very similar to CONSTRUCTOR which also contains vector of (index, value) pairs? > > Yes, we'd get (quite?) a bit less attribute list sharing this way but > we can still > share the actual tree-whatever thing that represents the args. Any estimation how difficult such transformation would be? >>> >>> attribute lists are dealt with in quite some places (with or without >>> helpers) so I guess it would be somewhat invasive but largely >>> mechanical. Using a .def file vs. the current strings can be >>> done separately -- after all we can also sort strings. I suspect >>> doing the string -> ID transform pays off faster (still linear search >>> but integer comparison instead of string compare). >> >> Ok, I'm ready to do the transformation in this stage1. That said, will you be >> Jakub fine with the original patch (rebase will be needed) as it is, using >> DECL_ATTRIBUTE? > > It looks ok to me though I miss __attribute__((no_sanitize("all"))) (or is no > argument equal to 'all' -- spelling out all those opts in the testcases looks > awkward). Hi. "all" value is supported, I fixed small issue with that and it's covered by a test-case now. > > I'd appreciate a 2nd eye though, the patch is large. > > The use of sanitize_flags_p (...) in pass_ubsan::execute would appreciate > sth like > > flags = sanitize_flags (..., fun->decl); > > so it can cache across different flag settings. Done that. Martin > > Thanks, > Richard. > >> Thanks, >> Martin >> >>> >>> Richard. >>> Martin > > Richard. > >> >> Jakub >> >From 2c4a169b1415d92e554c3adf7b5ea1143ae825df Mon Sep 17 00:00:00 2001 From: marxin Date: Thu, 1 Jun 2017 11:36:36 +0200 Subject: [PATCH] Implement no_sanitize function attribute gcc/testsuite/ChangeLog: 2017-06-01 Martin Liska * c-c++-common/ubsan/attrib-2.c (float_cast2): Enhance the test by adding no_sanitize attribute. * gcc.dg/asan/use-after-scope-4.c: Likewise. gcc/c-family/ChangeLog: 2017-06-01 Martin Liska * c-attribs.c (add_no_sanitize_value): New function. (handle_no_sanitize_attribute): Likewise. (handle_no_sanitize_address_attribute): Use the function. (handle_no_sanitize_thread_attribute): New function. (handle_no_address_safety_analysis_attribute): Use add_no_sanitize_value. (handle_no_sanitize_undefined_attribute): Likewise. * c-common.h: Declare new functions. * c-ubsan.c (ubsan_instrument_division): Use sanitize_flags_p. (ubsan_instrument_shift): Likewise. (ubsan_instrument_bounds): Likewise. (ubsan_maybe_instrument_array_ref): Likewise. (ubsan_maybe_instrument_reference_or_call): Likewise. gcc/ChangeLog: 2017-06-01 Martin Liska * asan.c (asan_sanitize_stack_p): Use sanitize_flags_p. (gate_asan): Likewise. * asan.h (asan_no_sanitize_address_p): Remove the function. * builtins.def: Fix coding style. * common.opt: Use renamed enum value. * convert.c (convert_to_integer_1): Use sanitize_flags_p. * doc/extend.texi: Document no_sanitize attribute. * flag-types.h (enum sanitize_code): Rename SANITIZE_NONDEFAULT to SANITIZE_UNDEFINED_NONDEFAULT. * gcc.c (sanitize_spec_function): Use the renamed enum value. * gimple-fold.c (optimize_atomic_compare_exchange_p): Use sanitize_flags_p. * gimplify.c (gimplify_function_tree): Likewise. * ipa-inline.c (sanitize_attrs_match_for_inline_p): Likewise. * opts.c (parse_no_sanitize_attribute): New function. (common_handle_option): Use renamed enum value. * opts.h (parse_no_sanitize_attribute): Declare. * tree.c (saniti
Re: [PATCH] Fix PR81007
On Thu, 8 Jun 2017, Richard Biener wrote: > On Thu, 8 Jun 2017, Richard Biener wrote: > > > > > Folding during gimplification can invoke the devirt machinery which > > doesn't deal with errorneous state. Thus avoid ICEing by not folding > > from gimplification in case we've seen errors. > > > > Similarly do not build cgraph edges in those cases as that invokes > > the devirt machinery as well (we stop compilation after lowering anyway > > in case errors were reported). > > > > The patch also fixes ordering of passes. > > > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. > > Bah. The cgraphbuild.c hunks cause > > FAIL: g++.dg/gomp/linear-2.C -std=gnu++11 (test for errors, line 111) > FAIL: g++.dg/gomp/linear-2.C -std=gnu++11 (test for errors, line 111) > FAIL: g++.dg/gomp/linear-2.C -std=gnu++11 (test for errors, line 63) > FAIL: g++.dg/gomp/linear-2.C -std=gnu++11 (test for errors, line 82) > FAIL: g++.dg/gomp/linear-2.C -std=gnu++11 (test for errors, line 90) > FAIL: g++.dg/gomp/linear-2.C -std=gnu++11 (test for errors, line 90) > ... > > looks like the code processing queued cgraph nodes enqueues > further nodes by looking at callees and thus omp lowering > doesn't register split out functions with the cgraph? > > No time to dig in right now. > > But relying on the folding machinery not to ICE looks fragile to me. > Eventually, given the ICE is reached by cgraph edge building, we > have to plug that hole anyways. So I'm testing the following instead. Richard. 2017-06-08 Richard Biener PR middle-end/81007 * ipa-polymorphic-call.c (ipa_polymorphic_call_context::restrict_to_inner_class): Skip FIELD_DECLs with error_mark_node type. * passes.def (all_lowering_passes): Run pass_build_cgraph_edges last again. * g++.dg/pr81007.C: New testcase. Index: gcc/ipa-polymorphic-call.c === --- gcc/ipa-polymorphic-call.c (revision 249003) +++ gcc/ipa-polymorphic-call.c (working copy) @@ -267,7 +267,8 @@ ipa_polymorphic_call_context::restrict_t { for (fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld)) { - if (TREE_CODE (fld) != FIELD_DECL) + if (TREE_CODE (fld) != FIELD_DECL + || TREE_TYPE (fld) == error_mark_node) continue; pos = int_bit_position (fld); Index: gcc/passes.def === --- gcc/passes.def (revision 249003) +++ gcc/passes.def (working copy) @@ -42,9 +42,9 @@ along with GCC; see the file COPYING3. NEXT_PASS (pass_build_cfg); NEXT_PASS (pass_warn_function_return); NEXT_PASS (pass_expand_omp); - NEXT_PASS (pass_build_cgraph_edges); NEXT_PASS (pass_sprintf_length, false); NEXT_PASS (pass_walloca, /*strict_mode_p=*/true); + NEXT_PASS (pass_build_cgraph_edges); TERMINATE_PASS_LIST (all_lowering_passes) /* Interprocedural optimization passes. */ Index: gcc/testsuite/g++.dg/pr81007.C === --- gcc/testsuite/g++.dg/pr81007.C (nonexistent) +++ gcc/testsuite/g++.dg/pr81007.C (working copy) @@ -0,0 +1,15 @@ +// { dg-do compile } +// { dg-options "-O2" } + +struct A +{ + A p; // { dg-error "incomplete" } + virtual void foo(); +}; + +struct B : A {}; + +void bar(B& b) +{ + b.foo(); +}
Re: [PATCH v2] Implement no_sanitize function attribute
Hi! I'd still prefer to handle it with the flags infrastructure instead, but if Richard wants to do it this way, then at least: On Thu, Jun 08, 2017 at 03:30:49PM +0200, Martin Liška wrote: > +/* Return true when flag_sanitize & FLAG is non-zero. If FN is non-null, > + remove all flags mentioned in "no_sanitize_flags" of DECL_ATTRIBUTES. */ > + > +bool > +sanitize_flags_p (unsigned int flag, const_tree fn) > +{ > + unsigned int result_flags = flag_sanitize & flag; This function really should be either inline, or partly inline, partly out of line, to handle the common case (sanitization of something not enabled) in the fast path. And, it should have an early out, if (result_flags == 0) return false; > + > + if (fn != NULL_TREE) > +{ > + tree value = lookup_attribute ("no_sanitize_flags", DECL_ATTRIBUTES > (fn)); The attribute, if it is internal only, should have spaces or similar characters in its name, like "fn spec", "omp declare target" and many others. +add_no_sanitize_value (tree node, unsigned int flags) +{ + tree attr = lookup_attribute ("no_sanitize_flags", DECL_ATTRIBUTES (node)); + if (attr) +{ + unsigned int old_value = tree_to_uhwi (TREE_VALUE (attr)); + flags |= old_value; +} + + DECL_ATTRIBUTES (node) += tree_cons (get_identifier ("no_sanitize_flags"), +build_int_cst (unsigned_type_node, flags), +DECL_ATTRIBUTES (node)); If there is a previous attribute already, can't you modify it in place? If not, as it could be perhaps shared? with other functions somehow, at least you should avoid adding a new attribute if (old_value | flags) == old_value. Jakub
Re: [PATCH 4/6] Port prefetch configuration from aarch32 to aarch64
On Mon, Jan 30, 2017 at 08:35:00AM -0800, Andrew Pinski wrote: > On Mon, Jan 30, 2017 at 3:48 AM, Maxim Kuvyrkov > wrote: > > This patch port prefetch configuration from aarch32 backend to aarch64. > > There is no code-generation change from this patch. > > > > This patch also happens to address Kyrill's comment on Andrew's prefetching > > patch at https://gcc.gnu.org/ml/gcc-patches/2017-01/msg02133.html . > > > > This patch also fixes a minor bug in aarch64_override_options_internal(), > > which used "selected_cpu->tune" instead of "aarch64_tune_params". > > I am not a fan of the macro at all. I'm with Andrew for this. The precedent in the AArch64 port is for explicitly spelling this out, as we do with the branch costs, approx_modes, vector costs etc. I'd rather we went that route than the macro you're using. I don't have any objections to the rest of your patch. Thanks, James
Re: [PATCH] Fix PR66623
On Thu, 8 Jun 2017, Richard Biener wrote: > > The following fixes unsafe vectorization of reductions in outer loop > vectorization. It combines it with a bit of cleanup in the affected > function. > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. Re-testing the following variant -- multiple loop-closed PHI nodes for the same SSA name happen. Richard. 2017-06-08 Richard Biener PR tree-optimization/66623 * tree-vect-loop.c (vect_is_simple_reduction): Cleanup, refactor check_reduction into two parts, properly computing whether we have to check reduction validity for outer loop vectorization. * gcc.dg/vect/pr66623.c: New testcase. Index: gcc/tree-vect-loop.c === --- gcc/tree-vect-loop.c(revision 249007) +++ gcc/tree-vect-loop.c(working copy) @@ -2727,8 +2727,6 @@ vect_is_simple_reduction (loop_vec_info { struct loop *loop = (gimple_bb (phi))->loop_father; struct loop *vect_loop = LOOP_VINFO_LOOP (loop_info); - edge latch_e = loop_latch_edge (loop); - tree loop_arg = PHI_ARG_DEF_FROM_EDGE (phi, latch_e); gimple *def_stmt, *def1 = NULL, *def2 = NULL, *phi_use_stmt = NULL; enum tree_code orig_code, code; tree op1, op2, op3 = NULL_TREE, op4 = NULL_TREE; @@ -2742,11 +2740,6 @@ vect_is_simple_reduction (loop_vec_info *double_reduc = false; *v_reduc_type = TREE_CODE_REDUCTION; - /* Check validity of the reduction only for the innermost loop. */ - bool check_reduction = ! flow_loop_nested_p (vect_loop, loop); - gcc_assert ((check_reduction && loop == vect_loop) - || (!check_reduction && flow_loop_nested_p (vect_loop, loop))); - name = PHI_RESULT (phi); /* ??? If there are no uses of the PHI result the inner loop reduction won't be detected as possibly double-reduction by vectorizable_reduction @@ -2775,13 +2768,15 @@ vect_is_simple_reduction (loop_vec_info { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, -"reduction used in loop.\n"); +"reduction value used in loop.\n"); return NULL; } phi_use_stmt = use_stmt; } + edge latch_e = loop_latch_edge (loop); + tree loop_arg = PHI_ARG_DEF_FROM_EDGE (phi, latch_e); if (TREE_CODE (loop_arg) != SSA_NAME) { if (dump_enabled_p ()) @@ -2795,18 +2790,22 @@ vect_is_simple_reduction (loop_vec_info } def_stmt = SSA_NAME_DEF_STMT (loop_arg); - if (!def_stmt) + if (gimple_nop_p (def_stmt)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, -"reduction: no def_stmt.\n"); +"reduction: no def_stmt\n"); return NULL; } if (!is_gimple_assign (def_stmt) && gimple_code (def_stmt) != GIMPLE_PHI) { if (dump_enabled_p ()) - dump_gimple_stmt (MSG_NOTE, TDF_SLIM, def_stmt, 0); + { + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, + "reduction: unhandled reduction operation: "); + dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM, def_stmt, 0); + } return NULL; } @@ -2822,6 +2821,7 @@ vect_is_simple_reduction (loop_vec_info } nloop_uses = 0; + auto_vec lcphis; FOR_EACH_IMM_USE_FAST (use_p, imm_iter, name) { gimple *use_stmt = USE_STMT (use_p); @@ -2829,6 +2829,9 @@ vect_is_simple_reduction (loop_vec_info continue; if (flow_bb_inside_loop_p (loop, gimple_bb (use_stmt))) nloop_uses++; + else + /* We can have more than one loop-closed PHI. */ + lcphis.safe_push (as_a (use_stmt)); if (nloop_uses > 1) { if (dump_enabled_p ()) @@ -2873,6 +2876,27 @@ vect_is_simple_reduction (loop_vec_info return NULL; } + /* If we are vectorizing an inner reduction we are executing that + in the original order only in case we are not dealing with a + double reduction. */ + bool check_reduction = true; + if (flow_loop_nested_p (vect_loop, loop)) +{ + gphi *lcphi; + unsigned i; + check_reduction = false; + FOR_EACH_VEC_ELT (lcphis, i, lcphi) + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, gimple_phi_result (lcphi)) + { + gimple *use_stmt = USE_STMT (use_p); + if (is_gimple_debug (use_stmt)) + continue; + if (! flow_bb_inside_loop_p (vect_loop, gimple_bb (use_stmt))) + check_reduction = true; + } +} + + bool nested_in_vect_loop = flow_loop_nested_p (vect_loop, loop); code = orig_code = gimple_assign_rhs_code (def_stmt); /* We can handle "res -= x[i]", which is non-associative by @@ -2887,27 +2911,8 @@ vect_is_simple_reduction (loop_vec_info if (code == COND_EXPR) { - if (check_reduction) +
fix libgcc build on VxWorks, missing -I for wrn/coreip
Hello, To circumvent unfortunate header files naming conflicts, the VxWorks port resorts to various tricks. One of them still on mainline consists in this twist in libgcc/config/t-vxworks: # This ensures that the correct target headers are used; some # VxWorks system headers have names that collide with GCC's # internal (host) headers, e.g. regs.h. LIBGCC2_INCLUDES = -nostdinc -I \ `case "/$$(MULTIDIR)" in \ */mrtp*) echo $(WIND_USR)/h ;; \ *) echo $(WIND_BASE)/target/h ;; \ esac` There is an alternate approach to this also on mainline, via fixincludes. Both methods incur a couple of problems, and I'll soon send messages to discuss possible resolutions. In the most immediate term, the current setup causes a libgcc build failure for e.g. powerpc-wrs-vxworks (for VxWorks 6.9 in my case), with, among other things: In file included from /chelles.a/users/hainque/bld-gcc/gcc/include-fixed/unistd.h:13:0, from /chelles.b/users/hainque/gcc-svn/libgcc/../gcc/tsystem.h:102, from /chelles.b/users/hainque/gcc-svn/libgcc/libgcc2.c:27: .../powerpc-wrs-vxworks/sys-include/ioLib.h:159:10: fatal error: net/uio.h: No such file or directory VxWorks headers perform #includes relative not only to target/h, but also to target/h/wrn/coreip. The attached patch fixes this by twisting the LIBGCC2_INCLUDES definition above so it passes the additional -I. Committing to mainline, after testing that it cures the aforementioned error. This might be turn out only a temporary measure pending results of further discussions on other aspects regarding the workarounds for the header file name conflicts, but this is all very intricate and I find it easier to proceed incrementally. With Kind Regards, Olivier 2017-06-08 Olivier Hainque libgcc/ * config/t-vxworks (LIBGCC2_INCLUDES): Add wrn/coreip to the set of -I options, support for direct inclusions of net/uio.h by VxWorks header files via ioLib.h. vx-coreip.diff Description: Binary data
Avoid useless overhead of -freorder-blocks-and-partition
Hi, enabling -freorder-functions-and-partition leads to outputting many dead labels for cold sections of functions that was not partitioned as well as to many addition COMDAT sections for empty partitions. This patch fixes it by using the partitined path of final codegen only when function was actually partitioned. I have double checked that those checks that are executed for x86-64 at all are actually checked after has_bb_partition is initialized. There is additional check in scheduled and holoop but those are both done from machine reorg that is late as well. Bootstrapped/regtested x86_64 and profiledbootstrapped with additional patches, will commit it shortly. Honza * cfgrtl.c (cfg_layout_initialize): Check crtl->has_bb_partition instead of flag_reorder_blocks_and_partition. * dbxout.c (dbxout_function_end): Likewise. * dwarf2out.c (gen_subprogram_die): Likewise. * haifa-sched.c (sched_create_recovery_edges): Likewise. * hw-doloop.c (reorg_loops): Likewise. * varasm.c (assemble_start_function, assemble_end_function): Likewise. (decide_function_section): Do not check for flag_reorder_blocks_and_partition. Index: cfgrtl.c === --- cfgrtl.c(revision 249013) +++ cfgrtl.c(working copy) @@ -4249,8 +4249,7 @@ cfg_layout_initialize (int flags) layout required moving a block from the hot to the cold section. This would create an illegal partitioning unless some manual fixup was performed. */ - gcc_assert (!(crtl->bb_reorder_complete - && flag_reorder_blocks_and_partition)); + gcc_assert (!crtl->bb_reorder_complete || !crtl->has_bb_partition); initialize_original_copy_tables (); Index: dbxout.c === --- dbxout.c(revision 249013) +++ dbxout.c(working copy) @@ -916,7 +916,7 @@ dbxout_function_end (tree decl ATTRIBUTE /* By convention, GCC will mark the end of a function with an N_FUN symbol and an empty string. */ - if (flag_reorder_blocks_and_partition) + if (crtl->has_bb_partition) { dbxout_begin_empty_stabs (N_FUN); dbxout_stab_value_label_diff (crtl->subsections.hot_section_end_label, Index: dwarf2out.c === --- dwarf2out.c (revision 249013) +++ dwarf2out.c (working copy) @@ -22152,7 +22152,7 @@ gen_subprogram_die (tree decl, dw_die_re struct function *fun = DECL_STRUCT_FUNCTION (decl); - if (!flag_reorder_blocks_and_partition) + if (!crtl->has_bb_partition) { dw_fde_ref fde = fun->fde; if (fde->dw_fde_begin) @@ -26472,7 +26472,7 @@ set_cur_line_info_table (section *sec) { const char *end_label; - if (flag_reorder_blocks_and_partition) + if (crtl->has_bb_partition) { if (in_cold_section_p) end_label = crtl->subsections.cold_section_end_label; @@ -26514,7 +26514,7 @@ dwarf2out_begin_function (tree fun) if (sec != text_section) have_multiple_function_sections = true; - if (flag_reorder_blocks_and_partition && !cold_text_section) + if (crtl->has_bb_partition && !cold_text_section) { gcc_assert (current_function_decl == fun); cold_text_section = unlikely_text_section (); Index: haifa-sched.c === --- haifa-sched.c (revision 249013) +++ haifa-sched.c (working copy) @@ -8313,8 +8313,7 @@ sched_create_recovery_edges (basic_block /* Partition type is the same, if it is "unpartitioned". */ { /* Rewritten from cfgrtl.c. */ - if (flag_reorder_blocks_and_partition - && targetm_common.have_named_sections) + if (crtl->has_bb_partition && targetm_common.have_named_sections) { /* We don't need the same note for the check because any_condjump_p (check) == true. */ Index: hw-doloop.c === --- hw-doloop.c (revision 249013) +++ hw-doloop.c (working copy) @@ -634,7 +634,7 @@ reorg_loops (bool do_reorder, struct hw_ /* We can't enter cfglayout mode anymore if basic block partitioning already happened. */ - if (do_reorder && !flag_reorder_blocks_and_partition) + if (do_reorder && !crtl->has_bb_partition) { reorder_loops (loops); free_loops (loops); Index: varasm.c === --- varasm.c(revision 249013) +++ varasm.c(working copy) @@ -1670,10 +1670,6 @@ decide_function_section (tree decl) { first_function_block_is_cold = false; - if (flag_reorder_blocks_and_partition) -/* We will decide in assemble_start_function. */ -return; - if (DECL_SECTION_NAME (decl)) { struct cgraph_node *node = cgraph_node::get (current_fun
[PATCH] PR libstdc++/81017 add noexcept to std::function move operations
I'm amazed nobody's noticed this before, but std::function's move constructor and move assignment operator are not noexcept. Fixed like so. PR libstdc++/81017 * include/bits/std_function.h (function::function(function&&)) (function::operator=(funtion&&)): Add noexcept. * testsuite/20_util/function/assign/move.cc: Check for noexcept. * testsuite/20_util/function/cons/move.cc: Likewise. Tested powerpc64le-linux, committed to trunk. I think this should be backported too. commit 9f20b90ff85d9581a3f82dcfd9300775316b6fd3 Author: Jonathan Wakely Date: Thu Jun 8 15:15:26 2017 +0100 PR libstdc++/81017 add noexcept to std::function move operations PR libstdc++/81017 * include/bits/std_function.h (function::function(function&&)) (function::operator=(funtion&&)): Add noexcept. * testsuite/20_util/function/assign/move.cc: Check for noexcept. * testsuite/20_util/function/cons/move.cc: Likewise. diff --git a/libstdc++-v3/include/bits/std_function.h b/libstdc++-v3/include/bits/std_function.h index c4ea347..a9ba756 100644 --- a/libstdc++-v3/include/bits/std_function.h +++ b/libstdc++-v3/include/bits/std_function.h @@ -438,7 +438,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * The newly-created %function contains the target of @a __x * (if it has one). */ - function(function&& __x) : _Function_base() + function(function&& __x) noexcept : _Function_base() { __x.swap(*this); } @@ -495,7 +495,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * object, then this operation will not throw an %exception. */ function& - operator=(function&& __x) + operator=(function&& __x) noexcept { function(std::move(__x)).swap(*this); return *this; diff --git a/libstdc++-v3/testsuite/20_util/function/assign/move.cc b/libstdc++-v3/testsuite/20_util/function/assign/move.cc index 5264623..8658527 100644 --- a/libstdc++-v3/testsuite/20_util/function/assign/move.cc +++ b/libstdc++-v3/testsuite/20_util/function/assign/move.cc @@ -38,11 +38,12 @@ void test01() fo2 = (std::move(fo)); VERIFY( static_cast(fo2) ); VERIFY( fo2() == 2 ); + + static_assert(std::is_nothrow_move_assignable::value, + "PR libstdc++/81017"); } int main() { test01(); - - return 0; } diff --git a/libstdc++-v3/testsuite/20_util/function/cons/move.cc b/libstdc++-v3/testsuite/20_util/function/cons/move.cc index 1cdfeed..7dbc511 100644 --- a/libstdc++-v3/testsuite/20_util/function/cons/move.cc +++ b/libstdc++-v3/testsuite/20_util/function/cons/move.cc @@ -36,11 +36,12 @@ void test01() function fo2(std::move(fo)); VERIFY( static_cast(fo2) ); VERIFY( fo2() == 2 ); + + static_assert(std::is_nothrow_move_constructible::value, + "PR libstdc++/81017"); } int main() { test01(); - - return 0; }
Re: RFA: Enhance information recorded by -frecord-gcc-switches
Hi Richard, >>The -frecord-gcc-switches option records the gcc command line. It >>does not however expand options like -O2 into the optimizations that >>this enables. > > I think individually enabled passes by -On are not really useful information > as you generally cannot replace -On by a set of other switches on the > command-line. Is that true ? I always thought that -O2 could be duplicated by using a (very long) set of individual command line options. Regardless, the point of this patch is to record which options were enabled, via whatever route, in the binaries. This can be useful to users, or distributors, who want to check that, for example, a specific security option was enabled, or that a particular a particular optimization was run. Cheers Nick
Do not silently disable flag_reorder_functions when profile info is missing
Hi, flag_reorder_functions is now able to offload code leading to cold function calls. For this reason it now makes sense to enable with w/o profiling as well (I plan to enable it for -O2 but by a separate patch) Bootstrapped/regtested x86_64-linux, comitted. Honza * opts.c (finish_options): x_flag_reorder_blocks_and_partition no longer requires x_flag_profile_use. Index: opts.c === --- opts.c (revision 249013) +++ opts.c (working copy) @@ -863,16 +863,6 @@ finish_options (struct gcc_options *opts opts->x_flag_reorder_blocks = 1; } - /* Disable -freorder-blocks-and-partition when -fprofile-use is not in - effect. Function splitting was not actually being performed in that case, - as probably_never_executed_bb_p does not distinguish any basic blocks as - being cold vs hot when there is no profile data. Leaving it enabled, - however, causes the assembly code generator to create (empty) cold - sections and labels, leading to unnecessary size overhead. */ - if (opts->x_flag_reorder_blocks_and_partition - && !opts_set->x_flag_profile_use) -opts->x_flag_reorder_blocks_and_partition = 0; - if (opts->x_flag_reorder_blocks_and_partition && !opts_set->x_flag_reorder_functions) opts->x_flag_reorder_functions = 1;
Re: C/C++ PATCH to implement -Wmultiline-expansion (PR c/80116)
I think I'll go with -Wmultistatement-expansion (without the dash). Fine with me, FWIW. I like this name better, but I think -Wmultistatement-macros would be clearer (it also matches the CERT rule). Martin
Re: C/C++ PATCH to implement -Wmultiline-expansion (PR c/80116)
On Thu, Jun 08, 2017 at 08:53:54AM -0600, Martin Sebor wrote: > > > I think I'll go with -Wmultistatement-expansion (without the dash). > > > > Fine with me, FWIW. > > I like this name better, but I think -Wmultistatement-macros > would be clearer (it also matches the CERT rule). Ok, I'll buy this. I'll rename the warning once again :). Marek
Re: [PATCH 4/6] Port prefetch configuration from aarch32 to aarch64
On 08/06/17 14:47, James Greenhalgh wrote: > On Mon, Jan 30, 2017 at 08:35:00AM -0800, Andrew Pinski wrote: >> On Mon, Jan 30, 2017 at 3:48 AM, Maxim Kuvyrkov >> wrote: >>> This patch port prefetch configuration from aarch32 backend to aarch64. >>> There is no code-generation change from this patch. >>> >>> This patch also happens to address Kyrill's comment on Andrew's prefetching >>> patch at https://gcc.gnu.org/ml/gcc-patches/2017-01/msg02133.html . >>> >>> This patch also fixes a minor bug in aarch64_override_options_internal(), >>> which used "selected_cpu->tune" instead of "aarch64_tune_params". >> >> I am not a fan of the macro at all. > > I'm with Andrew for this. The precedent in the AArch64 port is for > explicitly spelling this out, as we do with the branch costs, approx_modes, > vector costs etc. > > I'd rather we went that route than the macro you're using. I don't have > any objections to the rest of your patch. > > Thanks, > James > I disagree with having to write all these things out, but I do agree that we should be self-consistent within the port. The purpose of introducing the macros in the ARM port was to avoid the common problem of merging adding new parameters with adding new ports. C initializers for these tables assign values sequentially with zero-padding at the end, so failing to merge such changes carefully leads to real bugs in the compiler (even if just performance bugs). I notice that the last entry in the current tune params table is an int, rather than something that is type-checked (like the penultimate entry - an enum). Having the final entry be type checked at least ensures that the right number of elements exist, even if the order is not strictly checked. My 2p. R.
Re: RFA: Enhance information recorded by -frecord-gcc-switches
On Thu, Jun 08, 2017 at 03:30:07PM +0100, Nick Clifton wrote: > Hi Richard, > > >>The -frecord-gcc-switches option records the gcc command line. It > >>does not however expand options like -O2 into the optimizations that > >>this enables. > > > > > I think individually enabled passes by -On are not really useful information > > as you generally cannot replace -On by a set of other switches on the > > command-line. > > Is that true ? I always thought that -O2 could be duplicated by using a (very > long) set of individual command line options. It can't. Various decisions in the compiler are based on the optimize argument itself, the most important is of course that for optimize == 0 or optimize_debug the pass queue is significantly different, with hundreds of optimization passes not being run even if they are enabled, but there are also optimize >= 2 or optimize >= 3 or optimize_size etc. guarded decisions in many places. So, with -O0 + all options that appear to be enabled by -O1 you still get something that is significantly closer to -O0 than -O1. And even with -O1 + all options that appear to be enabled by -O2 but not -O1, you still get something very different from -O2, etc. > Regardless, the point of this patch is to record which options were enabled, > via > whatever route, in the binaries. This can be useful to users, or > distributors, > who want to check that, for example, a specific security option was enabled, > or > that a particular a particular optimization was run. And that again doesn't tell you whether the particular optimization pass was run, just that some flag variable was zero or non-zero or had some other value. The decisions in the compiler are more complex and keep changing between compiler versions. For one particular compiler version, -O2 vs. -O1 if that is what was originally used to compile something is all you need, that implies a particular behavior, set of options and their interactions. For comparisons between different compiler versions, some of the options are ignored, others are added, others change meaning, and expanding the list of guarded options isn't really useful. Jakub
Annotate GCC sanity checking by cold attribute
Hi, this patch adds cold attributes to (some of) our internal checking. This makes it possible to to propagate zero counts on these and produce bit better code (especially for checking enabled compiler). Bootstrapped/regtested x86_64-linux, comited. Honza * system.h (fancy_abort): Annotate by ATTRIBUTE_COLD. * rtl.h (rtl_check_failed_bounds, rtl_check_failed_type1, rtl_check_failed_type2, rtl_check_failed_code1, rtl_check_failed_code2, rtl_check_failed_code_mode, rtl_check_failed_block_symbol, cwi_check_failed_bounds, rtvec_check_failed_bounds, rtl_check_failed_flag, _fatal_insn_not_found, _fatal_insn): Likewise. * tree.h (tree_contains_struct_check_failed, tree_check_failed, tree_not_check_failed, tree_class_check_failed, tree_range_check_failed, tree_not_class_check_failed, tree_int_cst_elt_check_failed, tree_vec_elt_check_failed, phi_node_elt_check_failed, tree_operand_check_failed, omp_clause_check_failed, omp_clause_operand_check_failed, omp_clause_range_check_failed): Likewise. * cp/cp-tree.h (lang_check_failed): Annotate by ATTRIBUTE_COLD. Index: system.h === --- system.h(revision 249013) +++ system.h(working copy) @@ -722,7 +722,8 @@ extern int vsnprintf (char *, size_t, co /* Redefine abort to report an internal error w/o coredump, and reporting the location of the error in the source file. */ -extern void fancy_abort (const char *, int, const char *) ATTRIBUTE_NORETURN; +extern void fancy_abort (const char *, int, const char *) +ATTRIBUTE_NORETURN ATTRIBUTE_COLD; #define abort() fancy_abort (__FILE__, __LINE__, __FUNCTION__) /* Use gcc_assert(EXPR) to test invariants. */ Index: rtl.h === --- rtl.h (revision 249013) +++ rtl.h (working copy) @@ -1148,30 +1148,30 @@ is_a_helper ::test (rtx_insn extern void rtl_check_failed_bounds (const_rtx, int, const char *, int, const char *) -ATTRIBUTE_NORETURN; +ATTRIBUTE_NORETURN ATTRIBUTE_COLD; extern void rtl_check_failed_type1 (const_rtx, int, int, const char *, int, const char *) -ATTRIBUTE_NORETURN; +ATTRIBUTE_NORETURN ATTRIBUTE_COLD; extern void rtl_check_failed_type2 (const_rtx, int, int, int, const char *, int, const char *) -ATTRIBUTE_NORETURN; +ATTRIBUTE_NORETURN ATTRIBUTE_COLD; extern void rtl_check_failed_code1 (const_rtx, enum rtx_code, const char *, int, const char *) -ATTRIBUTE_NORETURN; +ATTRIBUTE_NORETURN ATTRIBUTE_COLD; extern void rtl_check_failed_code2 (const_rtx, enum rtx_code, enum rtx_code, const char *, int, const char *) -ATTRIBUTE_NORETURN; +ATTRIBUTE_NORETURN ATTRIBUTE_COLD; extern void rtl_check_failed_code_mode (const_rtx, enum rtx_code, machine_mode, bool, const char *, int, const char *) -ATTRIBUTE_NORETURN; +ATTRIBUTE_NORETURN ATTRIBUTE_COLD; extern void rtl_check_failed_block_symbol (const char *, int, const char *) -ATTRIBUTE_NORETURN; +ATTRIBUTE_NORETURN ATTRIBUTE_COLD; extern void cwi_check_failed_bounds (const_rtx, int, const char *, int, const char *) -ATTRIBUTE_NORETURN; +ATTRIBUTE_NORETURN ATTRIBUTE_COLD; extern void rtvec_check_failed_bounds (const_rtvec, int, const char *, int, const char *) -ATTRIBUTE_NORETURN; +ATTRIBUTE_NORETURN ATTRIBUTE_COLD; #else /* not ENABLE_RTL_CHECKING */ @@ -1269,7 +1269,7 @@ extern void rtvec_check_failed_bounds (c extern void rtl_check_failed_flag (const char *, const_rtx, const char *, int, const char *) -ATTRIBUTE_NORETURN +ATTRIBUTE_NORETURN ATTRIBUTE_COLD ; #else /* not ENABLE_RTL_FLAG_CHECKING */ @@ -3793,9 +3793,9 @@ extern location_t curr_insn_location (vo /* rtl-error.c */ extern void _fatal_insn_not_found (const_rtx, const char *, int, const char *) - ATTRIBUTE_NORETURN; + ATTRIBUTE_NORETURN ATTRIBUTE_COLD; extern void _fatal_insn (const char *, const_rtx, const char *, int, const char *) - ATTRIBUTE_NORETURN; + ATTRIBUTE_NORETURN ATTRIBUTE_COLD; #define fatal_insn(msgid, insn) \ _fatal_insn (msgid, insn, __FILE__, __LINE__, __FUNCTION__) Index: tree.h === --- tree.h (revision 249013) +++ tree.h (working copy) @@ -358,45 +358,45 @@ as_internal_fn (combined_fn code) extern void tree_contains_struct_check_failed (const_tree, const enum tree_node_structure_enum,
Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.
The testcase does something unexpected extern "C" int isnan (); void foo(float a) { int (*xx)(...); xx = isnan; if (xx(a)) g++; } and I'm wondering if this is a valid thing to do with a builtin. The issue is that at the point where gimple lowering is done xx hasn't been resolved to isnan yet. So it never recognizes the alias. Previously these builtins were being resolved in expand, which happens late enough that it has replaced xx with isnan. I can obviously fix the ICE by having the expand code leave the call as a call instead of a builtin. But if this is a valid thing for a builtin i'm not sure how to best resolve this case. From: Tamar Christina Sent: Thursday, June 8, 2017 1:21:44 PM To: Christophe Lyon; Markus Trippelsdorf Cc: Joseph Myers; Jeff Law; GCC Patches; Wilco Dijkstra; rguent...@suse.de; Michael Meissner; nd Subject: RE: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE. Thanks, I'm looking at the failure. My final validate seems to have only run the GCC tests. > -Original Message- > From: Christophe Lyon [mailto:christophe.l...@linaro.org] > Sent: 08 June 2017 13:00 > To: Markus Trippelsdorf > Cc: Joseph Myers; Tamar Christina; Jeff Law; GCC Patches; Wilco Dijkstra; > rguent...@suse.de; Michael Meissner; nd > Subject: Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like > numbers in GIMPLE. > > On 8 June 2017 at 12:30, Markus Trippelsdorf > wrote: > > On 2017.01.19 at 18:20 +, Joseph Myers wrote: > >> On Thu, 19 Jan 2017, Tamar Christina wrote: > >> > >> > Hi Joseph, > >> > > >> > I made the requested changes and did a quick pass over the rest of > >> > the fp cases. > >> > >> I've no further comments, but watch out for any related test failures > >> being reported. > > > > g++.dg/opt/pr60849.C started ICEing on both X86_64 and ppc64le. > > > > Same on arm/aarch64, but there are also other regressions on big-endian > configs: > See http://people.linaro.org/~christophe.lyon/cross- > validation/gcc/trunk/249005/report-build-info.html > > > > -- > > Markus
Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.
On June 8, 2017 5:40:06 PM GMT+02:00, Tamar Christina wrote: >The testcase does something unexpected > >extern "C" int isnan (); > >void foo(float a) { > int (*xx)(...); > xx = isnan; > if (xx(a)) >g++; >} > >and I'm wondering if this is a valid thing to do with a builtin. The >issue is that at the point where gimple lowering is done xx hasn't been >resolved to isnan yet. >So it never recognizes the alias. Previously these builtins were being >resolved in expand, which happens late enough that it has replaced xx >with isnan. > >I can obviously fix the ICE by having the expand code leave the call as >a call instead of a builtin. But if this is a valid thing for a builtin >i'm not sure how to >best resolve this case. For a built-in this is generally valid. For plain isnan it depends on what the standard says. You have to support taking the address of isnan anyway and thus expanding to a library call in that case. Why doesn't that not work? Richard. > >From: Tamar Christina >Sent: Thursday, June 8, 2017 1:21:44 PM >To: Christophe Lyon; Markus Trippelsdorf >Cc: Joseph Myers; Jeff Law; GCC Patches; Wilco Dijkstra; >rguent...@suse.de; Michael Meissner; nd >Subject: RE: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like >numbers in GIMPLE. > >Thanks, I'm looking at the failure. >My final validate seems to have only run the GCC tests. > >> -Original Message- >> From: Christophe Lyon [mailto:christophe.l...@linaro.org] >> Sent: 08 June 2017 13:00 >> To: Markus Trippelsdorf >> Cc: Joseph Myers; Tamar Christina; Jeff Law; GCC Patches; Wilco >Dijkstra; >> rguent...@suse.de; Michael Meissner; nd >> Subject: Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like >> numbers in GIMPLE. >> >> On 8 June 2017 at 12:30, Markus Trippelsdorf >> wrote: >> > On 2017.01.19 at 18:20 +, Joseph Myers wrote: >> >> On Thu, 19 Jan 2017, Tamar Christina wrote: >> >> >> >> > Hi Joseph, >> >> > >> >> > I made the requested changes and did a quick pass over the rest >of >> >> > the fp cases. >> >> >> >> I've no further comments, but watch out for any related test >failures >> >> being reported. >> > >> > g++.dg/opt/pr60849.C started ICEing on both X86_64 and ppc64le. >> > >> >> Same on arm/aarch64, but there are also other regressions on >big-endian >> configs: >> See http://people.linaro.org/~christophe.lyon/cross- >> validation/gcc/trunk/249005/report-build-info.html >> >> >> > -- >> > Markus
Re: Annotate GCC sanity checking by cold attribute
On June 8, 2017 5:27:24 PM GMT+02:00, Jan Hubicka wrote: >Hi, >this patch adds cold attributes to (some of) our internal checking. >This makes >it possible to to propagate zero counts on these and produce bit better >code >(especially for checking enabled compiler). > >Bootstrapped/regtested x86_64-linux, comited. I wonder if we shouldn't annotate exit() with non-cold instead :) >Honza > > * system.h (fancy_abort): Annotate by ATTRIBUTE_COLD. > * rtl.h (rtl_check_failed_bounds, rtl_check_failed_type1, > rtl_check_failed_type2, rtl_check_failed_code1, > rtl_check_failed_code2, rtl_check_failed_code_mode, > rtl_check_failed_block_symbol, cwi_check_failed_bounds, > rtvec_check_failed_bounds, rtl_check_failed_flag, > _fatal_insn_not_found, _fatal_insn): Likewise. > * tree.h (tree_contains_struct_check_failed, > tree_check_failed, tree_not_check_failed, > tree_class_check_failed, tree_range_check_failed, > tree_not_class_check_failed, tree_int_cst_elt_check_failed, > tree_vec_elt_check_failed, phi_node_elt_check_failed, > tree_operand_check_failed, omp_clause_check_failed, > omp_clause_operand_check_failed, omp_clause_range_check_failed): > Likewise. > * cp/cp-tree.h (lang_check_failed): Annotate by ATTRIBUTE_COLD. > >Index: system.h >=== >--- system.h (revision 249013) >+++ system.h (working copy) >@@ -722,7 +722,8 @@ extern int vsnprintf (char *, size_t, co > > /* Redefine abort to report an internal error w/o coredump, and >reporting the location of the error in the source file. */ >-extern void fancy_abort (const char *, int, const char *) >ATTRIBUTE_NORETURN; >+extern void fancy_abort (const char *, int, const char *) >+ ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > #define abort() fancy_abort (__FILE__, __LINE__, __FUNCTION__) > > /* Use gcc_assert(EXPR) to test invariants. */ >Index: rtl.h >=== >--- rtl.h (revision 249013) >+++ rtl.h (working copy) >@@ -1148,30 +1148,30 @@ is_a_helper ::test (rtx_insn > >extern void rtl_check_failed_bounds (const_rtx, int, const char *, int, >const char *) >-ATTRIBUTE_NORETURN; >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; >extern void rtl_check_failed_type1 (const_rtx, int, int, const char *, >int, > const char *) >-ATTRIBUTE_NORETURN; >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; >extern void rtl_check_failed_type2 (const_rtx, int, int, int, const >char *, > int, const char *) >-ATTRIBUTE_NORETURN; >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; >extern void rtl_check_failed_code1 (const_rtx, enum rtx_code, const >char *, > int, const char *) >-ATTRIBUTE_NORETURN; >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; >extern void rtl_check_failed_code2 (const_rtx, enum rtx_code, enum >rtx_code, > const char *, int, const char *) >-ATTRIBUTE_NORETURN; >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; >extern void rtl_check_failed_code_mode (const_rtx, enum rtx_code, >machine_mode, > bool, const char *, int, const char *) >-ATTRIBUTE_NORETURN; >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; >extern void rtl_check_failed_block_symbol (const char *, int, const >char *) >-ATTRIBUTE_NORETURN; >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; >extern void cwi_check_failed_bounds (const_rtx, int, const char *, int, >const char *) >-ATTRIBUTE_NORETURN; >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; >extern void rtvec_check_failed_bounds (const_rtvec, int, const char *, >int, > const char *) >-ATTRIBUTE_NORETURN; >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > > #else /* not ENABLE_RTL_CHECKING */ > >@@ -1269,7 +1269,7 @@ extern void rtvec_check_failed_bounds (c > >extern void rtl_check_failed_flag (const char *, const_rtx, const char >*, > int, const char *) >-ATTRIBUTE_NORETURN >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD > ; > > #else /* not ENABLE_RTL_FLAG_CHECKING */ >@@ -3793,9 +3793,9 @@ extern location_t curr_insn_location (vo > > /* rtl-error.c */ >extern void _fatal_insn_not_found (const_rtx, const char *, int, const >char *) >- ATTRIBUTE_NORETURN; >+ ATTRIBUTE_NORETURN ATTRIBUTE_COLD; >extern void _fatal_insn (const char *, const_rtx, const char *, int, >const char *) >- ATTRIBUTE_NORETURN; >+ ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > > #define fatal_insn(msgid, insn) \ > _fatal_insn (msgid, insn, __FILE__, __LINE__, __FUNCTION__) >Index: tree.h >=== >--- tree.h (revision 249013) >+++ tree.h (working copy) >@@ -358
Re: [PATCH 2/2] [MSP430] Fix issues handling .persistent attribute (PR 78818)
On 30/05/2017 12:50, Nick Clifton wrote: When I applied this patch to the sources and ran the new test, I encountered an internal compiler error: msp430-elf/gcc/xgcc [...] pr78818-auto-warn.c [...] [...] gcc/testsuite/gcc.target/msp430/pr78818-auto-warn.c: In function 'main': gcc/testsuite/gcc.target/msp430/pr78818-auto-warn.c:10:3: internal compiler error: in get, at cgraph.h:403 0xd30d3b symtab_node::get(tree_node const*) gcc/current/gcc/cgraph.h:400 0xd30d3b decl_section_name(tree_node const*) gcc/current/gcc/tree.c:700 0xd9ff22 msp430_data_attr gcc/current/gcc/config/msp430/msp430.c:1998 It seems that there is a problem with calling the DECL_SECTION_NAME macro on the line just before your new code. Are you able to reproduce this problem ? Hi Nick, Apologies for the delay in replying. I have reproduced this issue with current trunk and on the gcc-7-branch, but it does not reproduce on the gcc-6-branch. The ICE isn't caused by my patch but the pr78818-auto-warn.c test does expose it. I've attached an additional patch to fix the ICE (0001*). The ICE was caused by a new gcc_checking_assert added in GCC7 that checks that "symtab_node" has been called on a sane object. Added some additional check in msp430.c:data_attr that prevent section names being looked up on variables that can't have a section. The 0002* patch has been updated as this also caused an ICE when running the test case for the same reason as above, which was exposed when the first ICE was fixed. I have tested these patches on trunk this time (previous testing was done on performed on gcc-6-branch only) and can confirm there are no regressions and the new tests build successfully. Ok for trunk and gcc-7-branch? Thanks, Jozef From 8a41a45f2f771ae540b16ec007ded499a9ed5244 Mon Sep 17 00:00:00 2001 From: Jozef Lawrynowicz Date: Mon, 5 Jun 2017 18:07:22 + Subject: [PATCH 1/3] MSP430: Check if a variable can have a section before checking for a section name 2017-06-XX Jozef Lawrynowicz gcc/ * config/msp430/msp430.c (msp430_data_attr): Check that it's possible for a variable to have a section before checking if the section has a name. --- gcc/config/msp430/msp430.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c index dd53dea..cdd765b 100644 --- a/gcc/config/msp430/msp430.c +++ b/gcc/config/msp430/msp430.c @@ -1995,8 +1995,10 @@ msp430_data_attr (tree * node, if (TREE_CODE (* node) != VAR_DECL) message = "%qE attribute only applies to variables"; - if (DECL_SECTION_NAME (* node)) -message = "%qE attribute cannot be applied to variables with specific sections"; + /* Check that it's possible for the variable to have a section. */ + if ((TREE_STATIC (* node) || DECL_EXTERNAL (* node) || in_lto_p) + && DECL_SECTION_NAME (* node)) + message = "%qE attribute cannot be applied to variables with specific sections"; /* If this var is thought to be common, then change this. Common variables are assigned to sections before the backend has a chance to process them. */ -- 1.8.3.1 From e9404da28ade51c0303394f6ab12528b1c62afca Mon Sep 17 00:00:00 2001 From: Jozef Lawrynowicz Date: Fri, 12 May 2017 13:04:59 + Subject: [PATCH 2/3] MSP430: Fix persistent attribute not placing data into the .persistent section 2017-06-XX Jozef Lawrynowicz gcc/ PR target/78818 * config/msp430/msp430.c (msp430_unique_section): Set section to .persistent if persistent attribute is set. gcc/testsuite PR target/78818 * gcc.target/msp430/msp430.exp: Search for tests in subfolders as well as main directory. * gcc.target/msp430/pr78818/pr78818-real.c: New template for tests. * gcc.target/msp430/pr78818/pr78818-auto.c: New test. * gcc.target/msp430/pr78818/pr78818-data-region.c: New test. * gcc.target/msp430/pr78818/pr78818-data-sec.c: Likewise. --- gcc/config/msp430/msp430.c| 8 gcc/testsuite/gcc.target/msp430/msp430.exp| 4 ++-- gcc/testsuite/gcc.target/msp430/pr78818/pr78818-auto.c| 5 + gcc/testsuite/gcc.target/msp430/pr78818/pr78818-data-region.c | 7 +++ gcc/testsuite/gcc.target/msp430/pr78818/pr78818-data-sec.c| 8 gcc/testsuite/gcc.target/msp430/pr78818/pr78818-real.c| 9 + 6 files changed, 39 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/msp430/pr78818/pr78818-auto.c create mode 100644 gcc/testsuite/gcc.target/msp430/pr78818/pr78818-data-region.c create mode 100644 gcc/testsuite/gcc.target/msp430/pr78818/pr78818-data-sec.c create mode 100644 gcc/testsuite/gcc.target/msp430/pr78818/pr78818-real.c diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c index
Re: RFA: Enhance information recorded by -frecord-gcc-switches
Hi Jakub, >> Regardless, the point of this patch is to record which options were enabled, >> via >> whatever route, in the binaries. This can be useful to users, or >> distributors, >> who want to check that, for example, a specific security option was enabled, >> or >> that a particular a particular optimization was run. > > And that again doesn't tell you whether the particular optimization pass was > run, just that some flag variable was zero or non-zero or had some other > value. The decisions in the compiler are more complex and keep changing > between compiler versions. For one particular compiler version, -O2 vs. -O1 > if that is what was originally used to compile something is all you need, > that implies a particular behavior, set of options and their interactions. > For comparisons between different compiler versions, some of the options > are ignored, others are added, others change meaning, and expanding the list > of guarded options isn't really useful. OK -so we need some other way of recording what optimization passes were actually run. Fortunately I have something in mind. Patch withdrawn. Cheers Nick
Re: Annotate GCC sanity checking by cold attribute
> On June 8, 2017 5:27:24 PM GMT+02:00, Jan Hubicka wrote: > >Hi, > >this patch adds cold attributes to (some of) our internal checking. > >This makes > >it possible to to propagate zero counts on these and produce bit better > >code > >(especially for checking enabled compiler). > > > >Bootstrapped/regtested x86_64-linux, comited. > > I wonder if we shouldn't annotate exit() with non-cold instead :) I think this has still chance for false positives i.e. for embedded stuff that does everything from infinite loop. It kind of sucks, yes ;) Honza > > >Honza > > > > * system.h (fancy_abort): Annotate by ATTRIBUTE_COLD. > > * rtl.h (rtl_check_failed_bounds, rtl_check_failed_type1, > > rtl_check_failed_type2, rtl_check_failed_code1, > > rtl_check_failed_code2, rtl_check_failed_code_mode, > > rtl_check_failed_block_symbol, cwi_check_failed_bounds, > > rtvec_check_failed_bounds, rtl_check_failed_flag, > > _fatal_insn_not_found, _fatal_insn): Likewise. > > * tree.h (tree_contains_struct_check_failed, > > tree_check_failed, tree_not_check_failed, > > tree_class_check_failed, tree_range_check_failed, > > tree_not_class_check_failed, tree_int_cst_elt_check_failed, > > tree_vec_elt_check_failed, phi_node_elt_check_failed, > > tree_operand_check_failed, omp_clause_check_failed, > > omp_clause_operand_check_failed, omp_clause_range_check_failed): > > Likewise. > > * cp/cp-tree.h (lang_check_failed): Annotate by ATTRIBUTE_COLD. > > > >Index: system.h > >=== > >--- system.h (revision 249013) > >+++ system.h (working copy) > >@@ -722,7 +722,8 @@ extern int vsnprintf (char *, size_t, co > > > > /* Redefine abort to report an internal error w/o coredump, and > >reporting the location of the error in the source file. */ > >-extern void fancy_abort (const char *, int, const char *) > >ATTRIBUTE_NORETURN; > >+extern void fancy_abort (const char *, int, const char *) > >+ ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > > #define abort() fancy_abort (__FILE__, __LINE__, __FUNCTION__) > > > > /* Use gcc_assert(EXPR) to test invariants. */ > >Index: rtl.h > >=== > >--- rtl.h(revision 249013) > >+++ rtl.h(working copy) > >@@ -1148,30 +1148,30 @@ is_a_helper ::test (rtx_insn > > > >extern void rtl_check_failed_bounds (const_rtx, int, const char *, int, > > const char *) > >-ATTRIBUTE_NORETURN; > >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > >extern void rtl_check_failed_type1 (const_rtx, int, int, const char *, > >int, > > const char *) > >-ATTRIBUTE_NORETURN; > >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > >extern void rtl_check_failed_type2 (const_rtx, int, int, int, const > >char *, > > int, const char *) > >-ATTRIBUTE_NORETURN; > >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > >extern void rtl_check_failed_code1 (const_rtx, enum rtx_code, const > >char *, > > int, const char *) > >-ATTRIBUTE_NORETURN; > >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > >extern void rtl_check_failed_code2 (const_rtx, enum rtx_code, enum > >rtx_code, > > const char *, int, const char *) > >-ATTRIBUTE_NORETURN; > >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > >extern void rtl_check_failed_code_mode (const_rtx, enum rtx_code, > >machine_mode, > > bool, const char *, int, const char *) > >-ATTRIBUTE_NORETURN; > >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > >extern void rtl_check_failed_block_symbol (const char *, int, const > >char *) > >-ATTRIBUTE_NORETURN; > >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > >extern void cwi_check_failed_bounds (const_rtx, int, const char *, int, > > const char *) > >-ATTRIBUTE_NORETURN; > >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > >extern void rtvec_check_failed_bounds (const_rtvec, int, const char *, > >int, > >const char *) > >-ATTRIBUTE_NORETURN; > >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > > > > #else /* not ENABLE_RTL_CHECKING */ > > > >@@ -1269,7 +1269,7 @@ extern void rtvec_check_failed_bounds (c > > > >extern void rtl_check_failed_flag (const char *, const_rtx, const char > >*, > >int, const char *) > >-ATTRIBUTE_NORETURN > >+ATTRIBUTE_NORETURN ATTRIBUTE_COLD > > ; > > > > #else /* not ENABLE_RTL_FLAG_CHECKING */ > >@@ -3793,9 +3793,9 @@ extern location_t curr_insn_location (vo > > > > /* rtl-error.c */ > >extern void _fatal_insn_not_found (const_rtx, const char *, int, const > >char *) > >- ATTRIBUTE_NORETURN; > >+ ATTRIBUTE_NORETURN ATTRIBUTE_COLD; > >extern void _fatal_insn (const char *, const_rtx, const char *, int, > >const char *) > >-
Re: [PATCH] handle bzero/bcopy in DSE and aliasing (PR 80933, 80934)
On 06/08/2017 01:51 AM, Richard Biener wrote: On Thu, Jun 8, 2017 at 4:33 AM, Martin Sebor wrote: On 06/07/2017 02:12 PM, Martin Sebor wrote: On 06/07/2017 02:01 PM, Marc Glisse wrote: On Wed, 7 Jun 2017, Bernhard Reutner-Fischer wrote: On 7 June 2017 16:46:53 CEST, Martin Sebor wrote: On 06/07/2017 02:23 AM, Richard Biener wrote: On Wed, Jun 7, 2017 at 5:26 AM, Martin Sebor wrote: Note I'd be _much_ more sympathetic to simply canonicalizing all of bzero and bcopy to memset / memmove and be done with all the above complexity. Attached is an updated patch along these lines. Please let me know if it matches your expectations. I think you attached the wrong patch. Yes I did, sorry. The correct one is attached. Under POSIX.1-2008 "optimizing" bzero or bcmp is IMO plain wrong. It's like optimizing foo() to a random built-in but maybe that's just me. If your libc provides a define to a standard function for these under a compat knob then fine but otherwise you should fix that. *shrug*. Joseph? The patch optimizes __builtin_bzero, which should be ok. The question (independent from this patch) is then under what conditions bzero should be detected as a builtin. Yes. The problem is that unlike for C and C++, GCC doesn't have a mechanism to select the target version of POSIX. I think it should. But there is a subtle problem with the patch that needs fixing. Bcopy should not be transformed to memcpy but rather memmove. I'll fix that before committing. Attached is an updated patch with this fix. I also added a cast from bcopy and bzero to void to detect accidental uses of the return value. Tested on x86_64-linux. Please do not add foldings to builtins.c but instead add them to gimple-fold.c. Sure. Attached is an adjusted patch. + /* Call memset and return the result cast to void to detect its use + (bzero returns void). */ + tree call = build_call_expr_loc (loc, fn, 3, dst, integer_zero_node, len); + return fold_convert (void_type_node, call); ??? How can the result be used if the original call result was not? The cast ensured GCC would continue to warn on code like: void f (void *d, unsigned n) { return bzero (d, n); } Without the cast (as in the first patch) the above was silently accepted. This isn't necessary when the folding is done in gimple-fold.c. Martin PR tree-optimization/80934 - bzero should be assumed not to escape pointer argument PR tree-optimization/80933 - redundant bzero/bcopy calls not eliminated gcc/ChangeLog: PR tree-optimization/80933 PR tree-optimization/80934 * builtins.c (fold_builtin_3): Do not handle bcmp here. * gimple-fold.c (gimple_fold_builtin_bcmp): New function. (gimple_fold_builtin_bcopy, gimple_fold_builtin_bzero): Likewise. (gimple_fold_builtin): Call them. gcc/testsuite/ChangeLog: PR tree-optimization/80933 PR tree-optimization/80934 * gcc.dg/fold-bcopy.c: New test. * gcc.dg/tree-ssa/ssa-dse-30.c: Likewise.. * gcc.dg/tree-ssa/alias-36.c: Likewise. * gcc/testsuite/gcc.dg/pr79214.c: Adjust. * gcc.dg/tree-prof/val-prof-7.c: Likewise. * gcc.dg/Wsizeof-pointer-memaccess1.c: Likewise. * gcc.dg/builtins-nonnull.c: Likewise. diff --git a/gcc/builtins.c b/gcc/builtins.c index 30462ad..ce657bf 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -9034,7 +9034,6 @@ fold_builtin_3 (location_t loc, tree fndecl, return do_mpfr_remquo (arg0, arg1, arg2); break; -case BUILT_IN_BCMP: case BUILT_IN_MEMCMP: return fold_builtin_memcmp (loc, arg0, arg1, arg2);; diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c index d12f9d0..159a7e6 100644 --- a/gcc/gimple-fold.c +++ b/gcc/gimple-fold.c @@ -1076,6 +1076,88 @@ done: return true; } +/* Transform a call to built-in bcmp(a, b, len) at *GSI into one + to built-in memcmp (a, b, len). */ + +static bool +gimple_fold_builtin_bcmp (gimple_stmt_iterator *gsi) +{ + tree fn = builtin_decl_implicit (BUILT_IN_MEMCMP); + + if (!fn) +return false; + + /* Transform bcmp (a, b, len) into memcmp (a, b, len). */ + + gimple *stmt = gsi_stmt (*gsi); + tree a = gimple_call_arg (stmt, 0); + tree b = gimple_call_arg (stmt, 1); + tree len = gimple_call_arg (stmt, 2); + + gimple_seq seq = NULL; + gimple *repl = gimple_build_call (fn, 3, a, b, len); + gimple_seq_add_stmt_without_update (&seq, repl); + gsi_replace_with_seq_vops (gsi, seq); + fold_stmt (gsi); + + return true; +} + +/* Transform a call to built-in bcopy (src, dest, len) at *GSI into one + to built-in memmove (dest, src, len). */ + +static bool +gimple_fold_builtin_bcopy (gimple_stmt_iterator *gsi) +{ + tree fn = builtin_decl_implicit (BUILT_IN_MEMMOVE); + + if (!fn) +return false; + + /* bcopy has been removed from POSIX in Issue 7 but Issue 6 specifies + it's quivalent to memmove (not memcpy). Transform bcopy (src, dest, + len) into memmove (dest, src, len). */ + + gimple *stmt = gsi_stmt (*gsi); + tree src = gimple_call_arg (stmt, 0); +
Re: [PATCH 5/6][AArch64] Enable -fprefetch-loop-arrays at -O3 for cores that benefit from prefetching.
On Fri, Feb 03, 2017 at 02:58:23PM +0300, Maxim Kuvyrkov wrote: > > On Jan 30, 2017, at 5:50 PM, Maxim Kuvyrkov > > wrote: > > > >> On Jan 30, 2017, at 3:23 PM, Kyrill Tkachov > >> wrote: > >> > >> Hi Maxim, > >> > >> On 30/01/17 12:06, Maxim Kuvyrkov wrote: > >>> This patch enables prefetching at -O3 for aarch64 cores that set > >>> "simultaneous prefetches" parameter above 0. There are currently no such > >>> settings, so this patch doesn't change default code generation. > >>> > >>> I'm now working on improvements to -fprefetch-loop-arrays pass to make it > >>> suitable for -O2. I'll post this work in the next month. > >>> > >>> Bootstrapped and regtested on x86_64-linux-gnu and aarch64-linux-gnu. > >>> > >> > >> Are you aiming to get this in for GCC 8? > >> I have one small comment on this patch: > >> > >> + /* Enable sw prefetching at -O3 for CPUS that have prefetch, and we > >> + have deemed it beneficial (signified by setting > >> + prefetch.num_slots to 1 or more). */ > >> + if (flag_prefetch_loop_arrays < 0 > >> + && HAVE_prefetch > >> > >> HAVE_prefetch will always be true on aarch64. > >> I imagine midend code that had logic like this would need this check, but > >> aarch64-specific code shouldn't need it. > > > > Agree, I'll remove HAVE_prefetch. > > > > This pattern was copied from other backends, and HAVE_prefetch is most > > likely a historical artifact. > > Andrew raised a good point in the review of his patch that it is a bad idea > to use one of prefetching parameters (simultaneous_prefetches) as indicator > for whether to enable prefetching pass by default. Indeed there are cases > when we want to set simultaneous_prefetch according to HW documentation (or > experimental results), but not enable prefetching pass by default. > > This update to the patch addresses it. The patch adds a new explicit field > to prefetch tuning structure "default_opt_level" that sets optimization level > from which prefetching should be enabled by default. The current value is to > enable prefetching at -O3; additionally, this parameter will come handy for > enabling prefetching at -O2 [when it is ready]. I really don't like the scheme of changing the optimisation threshold when profiling data is used. I've seen too many reports and presentations by the uninitiated who believe that the use of profiling data has made the difference, when in reality it is just GCC changing behaviour on which passes run. It is very misleading! With that line removed, and any rebasing needed over changes to the macro, I'm happy with this patch. Thanks, James
Re: [PATCH 6/6][AArch64] Update prefetch tuning parameters for falkor and qdf24xx tunings.
On Mon, Jan 30, 2017 at 03:08:04PM +0300, Maxim Kuvyrkov wrote: > This patch enables software prefetching at -O3 for Qualcomm's qdf24xx cores. > > Bootstrapped and regtested on x86_64-linux-gnu and aarch64-linux-gnu. This patch is OK in whatever form it takes after rebasing for the macro in 4/6. Thanks, James
Re: [PATCH, testsuite] Remove NO_TRAMPOLINES
On Jun 8, 2017, at 3:41 AM, Tom de Vries wrote: > > this patch removes the additional_flags=-DNO_TRAMPOLINES addition, and > instead uses the effective target trampolines. > OK for trunk? Ok.
Go patch committed: Fix undefined symbol error with unexported method
When an interface I1 in an imported Go package has an unexported method, and is then embedded into another interface I2, in a different package, that has other methods, and a type T2 is converted to I2, the Go frontend failed to ever define the required interface method table. Naturally T2 must implement the unexported method, and must therefore either be defined in the same package as I1, or embed a type from that package. In this case the compiler was assuming that that package would define the interface method table, but of course, since I2 was not defined in that package, that did not happen. The fix is to only assume that the interface method table will be defined elsewhere in the case where T2 and I2 are defined in the same package. The compiler ensures that all such interface method tables are created, in Gogo::build_interface_method_tables. This requires knowing the package in which an interface type is defined, a simple tweak to the importer. Testing this revealed that the special case for stub methods created for the embedded unexported methods of T2 needs to be done for function declarations as it currently is for function definitions, sothat the newly created interface method tables use the correct name. Testing that revealed that the code to determine the pkgpath symbolfor such stub methods was wrong. It assumed that one could call pkgpath_for_symbol on the pkgpath to get the pkgpath symbol. Would that it twere so simple. Instead, add a function to look up the package, which must be known, and fetch the pkgpath symbol. The test for this is https://golang.org/cl/45085. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 248994) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -eaf4afbabcd91df55d31955500b6db55b07f6de5 +4b857cde45939f0e9f3cf89b9e347b6f6ebe0f8f The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: gcc/go/gofrontend/expressions.cc === --- gcc/go/gofrontend/expressions.cc(revision 248934) +++ gcc/go/gofrontend/expressions.cc(working copy) @@ -15430,27 +15430,32 @@ Interface_mtable_expression::do_get_back + "__" + this->type_->mangled_name(gogo)); - // See whether this interface has any hidden methods. - bool has_hidden_methods = false; - for (Typed_identifier_list::const_iterator p = interface_methods->begin(); - p != interface_methods->end(); - ++p) + // Set is_public if we are converting a named type to an interface + // type that is defined in the same package as the named type, and + // the interface has hidden methods. In that case the interface + // method table will be defined by the package that defines the + // types. + bool is_public = false; + if (this->type_->named_type() != NULL + && (this->type_->named_type()->named_object()->package() + == this->itype_->package())) { - if (Gogo::is_hidden_name(p->name())) + for (Typed_identifier_list::const_iterator p = interface_methods->begin(); + p != interface_methods->end(); + ++p) { - has_hidden_methods = true; - break; + if (Gogo::is_hidden_name(p->name())) + { + is_public = true; + break; + } } } - // We already know that the named type is convertible to the - // interface. If the interface has hidden methods, and the named - // type is defined in a different package, then the interface - // conversion table will be defined by that other package. - if (has_hidden_methods - && this->type_->named_type() != NULL + if (is_public && this->type_->named_type()->named_object()->package() != NULL) { + // The interface conversion table is defined elsewhere. Btype* btype = this->type()->get_backend(gogo); std::string asm_name(go_selectively_encode_id(mangled_name)); this->bvar_ = @@ -15517,7 +15522,6 @@ Interface_mtable_expression::do_get_back Bexpression* ctor = gogo->backend()->constructor_expression(btype, ctor_bexprs, loc); - bool is_public = has_hidden_methods && this->type_->named_type() != NULL; std::string asm_name(go_selectively_encode_id(mangled_name)); this->bvar_ = gogo->backend()->immutable_struct(mangled_name, asm_name, false, !is_public, btype, loc); Index: gcc/go/gofrontend/gogo.cc === --- gcc/go/gofrontend/gogo.cc (revision 248394) +++ gcc/go/gofrontend/gogo.cc (working copy) @@ -1688,6 +1688,16 @@ Gogo::register_package(const std::string return package; } +// Return the pkgpa
Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.
On Thu, 8 Jun 2017, Richard Biener wrote: > For a built-in this is generally valid. For plain isnan it depends on > what the standard says. > > You have to support taking the address of isnan anyway and thus > expanding to a library call in that case. Why doesn't that not work? In the case of isnan there is the Unix98 non-type-generic function, so it should definitely work to take the address of that function as declared in the system headers. For the DEF_GCC_BUILTIN type-generic functions there may not be any corresponding library function at all (as well as only being callable with the __builtin_* name). -- Joseph S. Myers jos...@codesourcery.com
Re: [Patch AArch64] Do not increase data alignment at -Os and with -fconserve-stack.
On Tue, May 02, 2017 at 10:52:13AM +0100, Ramana Radhakrishnan wrote: > We unnecessarily align data to 8 byte alignments even when -Os is > specified. This brings the logic in the AArch64 backend more in line > with the ARM backend and helps gain some image size in a few places. > Caught by an internal report on the size of rodata sections being > high with aarch64 gcc. > > * config/aarch64/aarch64.h (AARCH64_EXPAND_ALIGNMENT): New. > (DATA_ALIGNMENT): Update to use AARCH64_EXPAND_ALIGNMENT. > (LOCAL_ALIGNMENT): Update to use AARCH64_EXPAND_ALIGNMENT. > > Bootstrapped and regression tested on aarch64-none-linux-gnu with no > regressions. > > Ok to commit ? OK. Thanks, James
C/C++ PATCH to implement -Wmultistatement-macros (PR c/80116)
This is the hopefully last incarnation of the patch. The change from the last time[0] is simpy that I've added a new test and the warning has been renamed to -Wmultistatement-macros. David - any another comments? Joseph - how about the C parts? Jason - how about the C++ parts? Thanks, [0] https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00458.html Bootstrapped/regtested on x86_64-linux, ok for trunk? 2017-06-08 Marek Polacek PR c/80116 * c-common.h (warn_for_multistatement_macros): Declare. * c-warn.c (warn_for_multistatement_macros): New function. * c.opt (Wmultistatement-macros): New option. * c-parser.c (c_parser_if_body): Set the location of the body of the conditional after parsing all the labels. Call warn_for_multistatement_macros. (c_parser_else_body): Likewise. (c_parser_switch_statement): Likewise. (c_parser_while_statement): Likewise. (c_parser_for_statement): Likewise. (c_parser_statement): Add a default argument. Save the location after labels have been parsed. (c_parser_c99_block_statement): Likewise. * parser.c (cp_parser_statement): Add a default argument. Save the location of the expression-statement after labels have been parsed. (cp_parser_implicitly_scoped_statement): Set the location of the body of the conditional after parsing all the labels. Call warn_for_multistatement_macros. (cp_parser_already_scoped_statement): Likewise. * doc/invoke.texi: Document -Wmultistatement-macros. * c-c++-common/Wmultistatement-macros-1.c: New test. * c-c++-common/Wmultistatement-macros-2.c: New test. * c-c++-common/Wmultistatement-macros-3.c: New test. * c-c++-common/Wmultistatement-macros-4.c: New test. * c-c++-common/Wmultistatement-macros-5.c: New test. * c-c++-common/Wmultistatement-macros-6.c: New test. * c-c++-common/Wmultistatement-macros-7.c: New test. * c-c++-common/Wmultistatement-macros-8.c: New test. * c-c++-common/Wmultistatement-macros-9.c: New test. * c-c++-common/Wmultistatement-macros-10.c: New test. * c-c++-common/Wmultistatement-macros-11.c: New test. diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h index 79072e6..686932c 100644 --- gcc/c-family/c-common.h +++ gcc/c-family/c-common.h @@ -1539,6 +1539,8 @@ extern bool maybe_warn_shift_overflow (location_t, tree, tree); extern void warn_duplicated_cond_add_or_warn (location_t, tree, vec **); extern bool diagnose_mismatched_attributes (tree, tree); extern tree do_warn_duplicated_branches_r (tree *, int *, void *); +extern void warn_for_multistatement_macros (location_t, location_t, + location_t); /* In c-attribs.c. */ extern bool attribute_takes_identifier_p (const_tree); diff --git gcc/c-family/c-warn.c gcc/c-family/c-warn.c index 35321a6..d883330 100644 --- gcc/c-family/c-warn.c +++ gcc/c-family/c-warn.c @@ -2401,3 +2401,91 @@ do_warn_duplicated_branches_r (tree *tp, int *, void *) do_warn_duplicated_branches (*tp); return NULL_TREE; } + +/* Implementation of -Wmultistatement-macros. This warning warns about + cases when a macro expands to multiple statements not wrapped in + do {} while (0) or ({ }) and is used as a body of if/else/for/while + conditionals. For example, + + #define DOIT x++; y++ + + if (c) + DOIT; + + will increment y unconditionally. + + BODY_LOC is the location of the first token in the body after labels + have been parsed, NEXT_LOC is the location of the next token after the + body of the conditional has been parsed, and GUARD_LOC is the location + of the conditional. */ + +void +warn_for_multistatement_macros (location_t body_loc, location_t next_loc, + location_t guard_loc) +{ + if (!warn_multistatement_macros) +return; + + /* Ain't got time to waste. We only care about macros here. */ + if (!from_macro_expansion_at (body_loc) + || !from_macro_expansion_at (next_loc)) +return; + + /* Let's skip macros defined in system headers. */ + if (in_system_header_at (body_loc) + || in_system_header_at (next_loc)) +return; + + /* Find the actual tokens in the macro definition. BODY_LOC and + NEXT_LOC have to come from the same spelling location, but they + will resolve to different locations in the context of the macro + definition. */ + location_t body_loc_exp += linemap_resolve_location (line_table, body_loc, + LRK_MACRO_DEFINITION_LOCATION, NULL); + location_t next_loc_exp += linemap_resolve_location (line_table, next_loc, + LRK_MACRO_DEFINITION_LOCATION, NULL); + location_t guard_loc_exp += linemap_resolve_location (line_table, guard_loc, + LRK_MACRO_DEFINITION_LOCATION, NULL); + +
Re: [PATCH][GCC][AArch64] Inline calls to lrint when possible
On Wed, Jun 07, 2017 at 12:38:27PM +0100, Tamar Christina wrote: > Hi All, > > This patch allows the inlining of lrint when -fno-math-errno > assuming that errno does not need to be set when the rounded value > is not representable as a long. > > The case > > void f(double *a, long *b, double x) > { > *a = __builtin_rint(x); > *b = __builtin_lrint(x); > } > > now generates with -fno-math-errno: > > f: > frintx d0, d0 > fcvtzs x2, d0 > str d0, [x0] > str x2, [x1] > ret > > When the flag is not used the same function call is emitted as before: > > f: > stp x29, x30, [sp, -32]! > frintx d1, d0 > add x29, sp, 0 > str x19, [sp, 16] > mov x19, x1 > str d1, [x0] > bl lrint > str x0, [x19] > ldr x19, [sp, 16] > ldp x29, x30, [sp], 32 > ret > > Bootstrapped and regtested on aarch64-none-linux-gnu and no regressions. > The patch also has no regressions on Spec2006. > > Ok for trunk? OK. Thanks, James > > gcc/ > 2017-06-07 Tamar Christina > > * config/aarch64/aarch64.md (lrint2): New. > > gcc/testsuite/ > 2017-06-07 Tamar Christina > > * gcc.target/aarch64/lrint-matherr.h: New. > * gcc.target/aarch64/inline-lrint_1.c: New. > * gcc.target/aarch64/inline-lrint_2.c: New. > * gcc.target/aarch64/no-inline-lrint_1.c: New. > * gcc.target/aarch64/no-inline-lrint_2.c: New.
Re: [PATCH, testsuite] Fix no_trampolines test in check_effective_target_trampolines
On Jun 8, 2017, at 3:20 AM, Tom de Vries wrote: > [ To complicate matters objc_target_compile tests for 'objc,no_trampolines' > to set -DNO_TRAMPOLINES, but AFAICT that macro is not used anywhere in the > objc test suites, so I think that's dead code. ] Yes, Ok to remove the dead code as well. > - it's better to have to define one variable than two > - it looks like an accident that the 'gcc,' was dropped > - the one with the 'gcc,' prefix has been around longer, and is > mentioned in dejagnu docs > I propose to test for 'gcc,no_trampolines' instead of 'no_trampolines' in > check_effective_target_trampolines. > OK for trunk? Ok. I had hit this bug years ago, and was puzzled why people seemed to get it so wrong. I took the easy way out and just defined the three of them. :-(
Re: Reorgnanization of profile count maintenance code, part 1
On Tue, Jun 6, 2017 at 1:00 AM, Jan Hubicka wrote: >> On Thu, Jun 1, 2017 at 4:35 AM, Jan Hubicka wrote: >> > Index: profile.c >> > === >> > --- profile.c (revision 248684) >> > +++ profile.c (working copy) >> > @@ -67,6 +67,10 @@ along with GCC; see the file COPYING3. >> > >> > #include "profile.h" >> > >> > +/* Map from BBs/edges to gcov counters. */ >> > +vec bb_gcov_counts; >> > +hash_map edge_gcov_counts; >> >> This completely breaks the compiler with >> --enable-gather-detailed-mem-stats; edge_gcov_counts gets initialized >> before hash_table_usage in hash-table.c, and so when the constructor >> for edge_gcov_counts calls hash_table_usage.register_descriptor, m_map >> is null and we get a SEGV. > > I will change this to pointer to avoid static cdtor. Thanks! Ping? Jason
Re: RFC: [PATCH] Add warn_if_not_aligned attribute
On Wed, Jun 7, 2017 at 6:30 AM, H.J. Lu wrote: > On Tue, Jun 6, 2017 at 5:11 PM, Martin Sebor wrote: >> On 06/06/2017 04:57 PM, H.J. Lu wrote: >>> >>> On Tue, Jun 6, 2017 at 10:34 AM, Martin Sebor wrote: On 06/06/2017 10:59 AM, H.J. Lu wrote: > > > On Tue, Jun 6, 2017 at 9:10 AM, Martin Sebor wrote: >> >> >> On 06/06/2017 10:07 AM, Martin Sebor wrote: >>> >>> >>> >>> On 06/05/2017 11:45 AM, H.J. Lu wrote: On Mon, Jun 5, 2017 at 8:11 AM, Joseph Myers wrote: > > > > The new attribute needs documentation. Should the test be in > c-c++-common This feature does support C++. But C++ compiler issues a slightly different warning at a different location. > or does this feature not support C++? > Here is the updated patch with documentation and a C++ test. This patch caused a few testsuite failures: FAIL: gcc.dg/compat/struct-align-1 c_compat_x_tst.o compile /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/compat//struct-align-1.h:169:1: warning: alignment 1 of 'struct B2_m_inner_p_outer' is less than 16 FAIL: g++.dg/torture/pr80334.C -O0 (test for excess errors) /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/torture/pr80334.C:4:8: warning: alignment 1 of 'B' is less than 16 >>> >>> Users often want the ability to control a warning, even when it >>> certainly indicates a bug. I would suggest to add an option to >>> make it possible for this warning as well. >>> >>> Btw., a bug related to some of those this warning is meant to >>> detect is assigning the address of an underaligned object to >>> a pointer of a natively aligned type. Clang has an option >>> to detect this problem: -Waddress-of-packed-member. It might >>> make a nice follow-on enhancement to add support for the same >>> thing. I mention this because I think it would make sense to >>> consider this when choosing the name of the GCC option (i.e., >>> rather than having two distinct but closely related warnings, >>> have one that detects both of these alignment type of bugs. >> >> >> >> >> A bug that has some additional context on this is pr 51628. >> A possible name for the new option suggested there is -Wpacked. >> >> Martin > > > > Isn't -Waddress-of-packed-member a subset of or the same as > -Wpacked? In Clang it's neither. -Waddress-of-packed-member only triggers when the address of a packed member is taken but not for the cases in bug 53037 (i.e., reducing the alignment of a member). It's also enabled by default, while -Wpacked needs to be specified explicitly (i.e., it's in neither -Wall or -Wextra). FWIW, I don't really have a strong opinion about the names of the options. My input is that the proliferation of fine-grained warning options for closely related problems tends to make it tricky to get their interactions right (both in the compiler and for users). Enabling both/all such options can lead to multiple warnings for what boils down to essentially the same bug in the same expression, overwhelming the user in repetitive diagnostics. >>> >>> There is already -Wpacked. Should I overload it for this? >> >> >> I'd say yes if -Wpacked were at least in -Wall. But it's >> an opt-in kind of warning that's not even in -Wextra, and >> relaxing an explicitly specified alignment seems more like >> a bug than just something that might be nice to know about. >> I would expect warn_if_not_aligned to trigger a warning even >> without -Wall (i.e., as you have it in your patch, but with >> an option to control it). That would suggest three levels >> of warnings: >> >> 1) warn by default (warn_if_not_aligned violation) >> 2) warn with -Wall (using a type with attribute aligned to >>define a member of a packed struct) >> 3) warn if requested (current -Wpacked) >> >> So one way to deal with it would be to change -Wpacked to >> take an argument between 0 and 3, set the default to >> correspond to the (1) above, and have -Wall bump it up to >> (2). >> >> If the equivalent of -Waddress-of-packed-member were to be >> implemented in GCC it would probably be a candidate to add >> to the (2) above.(*) >> >> This might be more involved than you envisioned. A slightly >> simpler alternative would be to add a different option, say >> something like -Walign=N, and have it handle just (1) and >> (2) above, leaving -Wpacked unchanged. >> > > Since there is no agreement on -W options and changes > may touch many places, I will d
Re: [PATCH] handling address mode changes inside extract_bit_field
I've got a testcase to add for this patch. Sorry about the delay, I took some time off to deal with a medical problem. This was tested with and without the extract_bit_field patch. The testcase fails without the patch and works with the patch. Jim gcc/testsuite/ PR middle-end/79794 * gcc.target/aarch64/pr79794.c: New. Index: gcc/testsuite/gcc.target/aarch64/pr79794.c === --- gcc/testsuite/gcc.target/aarch64/pr79794.c (nonexistent) +++ gcc/testsuite/gcc.target/aarch64/pr79794.c (working copy) @@ -0,0 +1,25 @@ +/* PR middle-end/79794 */ +/* { dg-do compile } */ +/* { dg-options "-O3" } */ +/* { dg-final { scan-assembler-not "umov" } } */ + +struct node_struct +{ + float _Complex gap; + unsigned long long state; +}; + +struct reg_struct +{ + int size; + struct node_struct *node; +}; + +void +func(int target, struct reg_struct *reg) +{ + int i; + + for(i=0; isize; i++) +reg->node[i].state ^= ((unsigned long long) 1 << target); +}
Re: C/C++ PATCH to implement -Wmultistatement-macros (PR c/80116)
On Thu, 2017-06-08 at 18:49 +0200, Marek Polacek wrote: > This is the hopefully last incarnation of the patch. The change from > the > last time[0] is simpy that I've added a new test and the warning has > been > renamed to -Wmultistatement-macros. > > David - any another comments? Thanks for working on this; looks useful. The new name is more accurate, but is rather long; oh well. As part of -Wall, users won't typically have to type it, so that's OK. [...] > diff --git gcc/c-family/c-warn.c gcc/c-family/c-warn.c > index 35321a6..d883330 100644 > --- gcc/c-family/c-warn.c > +++ gcc/c-family/c-warn.c [...] > + if (warning_at (body_loc, OPT_Wmultistatement_macros, > + "macro expands to multiple statements")) > +inform (guard_loc, "some parts of macro expansion are not > guarded by " > + "this conditional"); Is the guard necessarily a "conditional"? I take a "conditional" to mean an "if"; the guard could be a "for" or a "while" (or an "else", which still seems something of a stretch to me to call a "conditional"). Suggestion: word "this conditional" as "this %qs clause" and either (a) rework the code in c-indentation.c's guard_tinfo_to_string so that it's shared between these two warnings (i.e. to go from a RID_ to a const char *), or (b) just pass in a const char * identifying the guard clause token. > diff --git gcc/c-family/c.opt gcc/c-family/c.opt > index 37bb236..9dbe211 100644 > --- gcc/c-family/c.opt > +++ gcc/c-family/c.opt > @@ -698,6 +698,10 @@ Wmissing-field-initializers > C ObjC C++ ObjC++ Var(warn_missing_field_initializers) Warning > EnabledBy(Wextra) > Warn about missing fields in struct initializers. > > +Wmultistatement-macros > +C ObjC C++ ObjC++ Var(warn_multistatement_macros) Warning > LangEnabledBy(C ObjC C++ ObjC++,Wall) > +Warn about macros expanding to multiple statements in a body of a > conditional such as if, else, while, or for. Likewise; is "conditional" the right word here? Also, whether of not the statements are actually "in" the body of the guard is the issue here. How about: "Warn about unsafe multiple statement macros that appear to be guarded by a clause such as if, else, while, or for, in which only the first statement is actually guarded after the macro is expanded." or somesuch? > diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi > index c116882..2fe16dd 100644 > --- gcc/doc/invoke.texi > +++ gcc/doc/invoke.texi > @@ -4496,6 +4497,29 @@ This warning is enabled by @option{-Wall}. > @opindex Wno-missing-include-dirs > Warn if a user-supplied include directory does not exist. > > +@item -Wmultistatement-macros > +@opindex Wmultistatement-macros > +@opindex Wno-multistatement-macros > +Warn about macros expanding to multiple statements in a body of a > conditional, > +such as @code{if}, @code{else}, @code{for}, or @code{while}. (as above). > +For example: > + > +@smallexample > +#define DOIT x++; y++ > +if (c) > + DOIT; > +@end smallexample > + > +will increment @code{y} unconditionally, not just when @code{c} > holds. > +The can usually be fixed by wrapping the macro in a do-while loop: > +@smallexample > +#define DOIT do @{ x++; y++; @} while (0) > +if (c) > + DOIT; > +@end smallexample > + > +This warning is enabled by @option{-Wall} in C and C++. > + > @item -Wparentheses > @opindex Wparentheses > @opindex Wno-parentheses Hope this is constructive Dave
[PATCH, i386]: Fix PR81015, Bad codegen for __builtin_clz(unsigned short)
Hello! Attached patch removes invalid substitution of zero-extended HImode operands with HImode operation. CLZ returns different value when operating on SImode value vs. HImode value. 2017-06-08 Uros Bizjak PR target/81015 Revert: 2016-12-14 Uros Bizjak PR target/59874 * config/i386/i386.md (*ctzhi2): New insn_and_split pattern. (*clzhi2): Ditto. testsuite/ChangeLog: 2017-06-08 Uros Bizjak PR target/81015 * gcc.target/i386/pr59874-1.c (foo): Call __builtin_ctzs. * gcc.target/i386/pr59874-2.c (foo): Call __builtin_clzs. * gcc.target/i386/pr81015.c: New test. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Committed to mainline SVN, will be committed to gcc-7 branch. Uros. Index: config/i386/i386.md === --- config/i386/i386.md (revision 249018) +++ config/i386/i386.md (working copy) @@ -12762,24 +12762,6 @@ (set_attr "znver1_decode" "vector") (set_attr "mode" "")]) -(define_insn_and_split "*ctzhi2" - [(set (match_operand:SI 0 "register_operand") - (ctz:SI - (zero_extend:SI (match_operand:HI 1 "nonimmediate_operand" - (clobber (reg:CC FLAGS_REG))] - "TARGET_BMI - && can_create_pseudo_p ()" - "#" - "&& 1" - [(const_int 0)] -{ - rtx tmp = gen_reg_rtx (HImode); - - emit_insn (gen_tzcnt_hi (tmp, operands[1])); - emit_insn (gen_zero_extendhisi2 (operands[0], tmp)); - DONE; -}) - (define_insn_and_split "ctz2" [(set (match_operand:SWI48 0 "register_operand" "=r") (ctz:SWI48 @@ -12899,24 +12881,6 @@ operands[2] = GEN_INT (GET_MODE_BITSIZE (mode)-1); }) -(define_insn_and_split "*clzhi2" - [(set (match_operand:SI 0 "register_operand") - (clz:SI - (zero_extend:SI (match_operand:HI 1 "nonimmediate_operand" - (clobber (reg:CC FLAGS_REG))] - "TARGET_LZCNT - && can_create_pseudo_p ()" - "#" - "&& 1" - [(const_int 0)] -{ - rtx tmp = gen_reg_rtx (HImode); - - emit_insn (gen_lzcnt_hi (tmp, operands[1])); - emit_insn (gen_zero_extendhisi2 (operands[0], tmp)); - DONE; -}) - (define_insn_and_split "clz2_lzcnt" [(set (match_operand:SWI48 0 "register_operand" "=r") (clz:SWI48 Index: testsuite/gcc.target/i386/pr59874-1.c === --- testsuite/gcc.target/i386/pr59874-1.c (revision 249018) +++ testsuite/gcc.target/i386/pr59874-1.c (working copy) @@ -6,5 +6,5 @@ unsigned int foo (unsigned short x) { - return x ? __builtin_ctz (x) : 16U; + return x ? __builtin_ctzs (x) : 16U; } Index: testsuite/gcc.target/i386/pr59874-2.c === --- testsuite/gcc.target/i386/pr59874-2.c (revision 249018) +++ testsuite/gcc.target/i386/pr59874-2.c (working copy) @@ -6,5 +6,5 @@ unsigned int foo (unsigned short x) { - return x ? __builtin_clz (x) : 16U; + return x ? __builtin_clzs (x) : 16U; } Index: testsuite/gcc.target/i386/pr81015.c === --- testsuite/gcc.target/i386/pr81015.c (nonexistent) +++ testsuite/gcc.target/i386/pr81015.c (working copy) @@ -0,0 +1,21 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -mlzcnt" } */ +/* { dg-require-effective-target lzcnt } */ + +#include "lzcnt-check.h" + +int +__attribute__ ((noinline, noclone)) +foo (unsigned short a) +{ + return __builtin_clz (a); +} + +static void +lzcnt_test () +{ + int res = foo (1); + + if (res != 31) +abort (); +}
Re: [PATCH GCC][4/5]Improve loop distribution to handle hmmer
On Thu, Jun 8, 2017 at 3:48 AM, kugan wrote: > Hi Bin, > >> + >> +/* In reduced dependence graph RDG for loop distribution, return true if >> + dependence between references DR1 and DR2 may create dependence cycle >> + and such dependence cycle can't be resolved by runtime alias check. >> */ >> + >> +static bool >> +possible_data_dep_cycle_p (struct graph *rdg, >> + hash_table *ddr_table, >> + data_reference_p dr1, data_reference_p dr2) > > > This name seems to be misleading a bit. It is basically dependence test ? Of > course this can lead to a cycle but looks like possible_data_dep_p would be > better. This tests dependence between statements in one partition. It indicates a must dependence cycle if the function returns true, I suppose data_dep_in_cycle_p should be better. > >> +{ >> + struct data_dependence_relation *ddr; >> + >> + /* Re-shuffle data-refs to be in topological order. */ >> + if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1)) >> + > rdg_vertex_for_stmt (rdg, DR_STMT (dr2))) >> +std::swap (dr1, dr2); >> + >> + ddr = get_ddr (rdg, ddr_table, dr1, dr2); >> + >> + /* In case something goes wrong in data dependence analysis. */ >> + if (ddr == NULL) >> +return true; >> + /* In case of no data dependence. */ >> + else if (DDR_ARE_DEPENDENT (ddr) == chrec_known) >> +return false; >> + /* Or the data dependence can be resolved by compilation time alias >> + check. */ >> + else if (!alias_sets_conflict_p (get_alias_set (DR_REF (dr1)), >> + get_alias_set (DR_REF (dr2 >> +return false; >> + /* For unknown data dependence or known data dependence which can't be >> + expressed in classic distance vector, we check if it can be resolved >> + by runtime alias check. If yes, we still consider data dependence >> + as won't introduce data dependence cycle. */ >> + else if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know >> + || DDR_NUM_DIST_VECTS (ddr) == 0) > > > You have already handled chrec_known above. Can you still have known data > dependence which can't be expressed in classic distance vector ? I don't know the very details in data dependence analyzer, only I tend to believe it's possible by code. Reading build_classic_dist_vector gives: if (DDR_ARE_DEPENDENT (ddr) != NULL_TREE) return false; //... dist_v = lambda_vector_new (DDR_NB_LOOPS (ddr)); if (!build_classic_dist_vector_1 (ddr, DDR_A (ddr), DDR_B (ddr), dist_v, &init_b, &index_carry)) return false; Looks like we can have DDR_ARE_DEPENDENT (ddr) == NULL_TREE without classic distance vector. Or the code could be simplified vice versa. > >> (dr1) >> + || !DR_BASE_ADDRESS (dr2) || !DR_OFFSET (dr2) || !DR_INIT >> (dr2) >> + || !DR_STEP (dr1) || !tree_fits_uhwi_p (DR_STEP (dr1)) >> + || !DR_STEP (dr2) || !tree_fits_uhwi_p (DR_STEP (dr2)) >> + || res == 0) >> + this_dir = 2; >> + /* If it's possible to do runtime alias check, simply record >> the >> + ddr. */ >> + else if (alias_ddrs != NULL) >> + alias_ddrs->safe_push (ddr); > > > If alias_ddrs is not passed (i.e. NULL), shouldn't this_ddr = 2. May be add > an assert for alias_ddrs ? No, it's intended to ignore the dependence when alias_ddrs == NULL, I will add more comment for this. > >> +/* Build and return partition dependence graph for PARTITIONS. RDG is >> + reduced dependence graph for the loop to be distributed. DDR_TABLE >> + is hash table contains all data dependence relations. ALIAS_DDRS is >> + a data dependence relations vector for temporary use. If ALIAS_DDRS >> + is NULL, dependence which can be resolved by runtime alias check will >> + not be considered, vice versa. */ >> + >> +static struct graph * >> +build_partition_graph (struct graph *rdg, >> + hash_table *ddr_table, >> + vec *partitions, >> + vec *alias_ddrs) > > > Why do you pass alias_ddrs to this. alias_ddrs does not pass any data or > return any. It can be defined in the function and used there. If you are > using this to indicate runtime alias check should be considered, you can > define a different bool for that ? I need to pass vector at two end of function calling stack, so I tried to use vector pointer parameter for all related functions. Apparently this causes confusion for you, I will try the other way by using bool parameter in next version patch. > >> +{ >> + >> +static void >> +break_alias_scc_partitions (struct graph *rdg, >> + hash_table *ddr_table, >> + vec *partitions, >> + vec *alias_ddrs) > > > I am not sure I understand this. When you are in pg_add_dependence_edges, > when you record alias_ddrs for runtime checking you set this_dur io 1. That > means you
Re: C/C++ PATCH to implement -Wmultistatement-macros (PR c/80116)
On 06/08/2017 11:24 AM, David Malcolm wrote: On Thu, 2017-06-08 at 18:49 +0200, Marek Polacek wrote: This is the hopefully last incarnation of the patch. The change from the last time[0] is simpy that I've added a new test and the warning has been renamed to -Wmultistatement-macros. David - any another comments? Thanks for working on this; looks useful. The new name is more accurate, but is rather long; oh well. As part of -Wall, users won't typically have to type it, so that's OK. [...] diff --git gcc/c-family/c-warn.c gcc/c-family/c-warn.c index 35321a6..d883330 100644 --- gcc/c-family/c-warn.c +++ gcc/c-family/c-warn.c [...] + if (warning_at (body_loc, OPT_Wmultistatement_macros, + "macro expands to multiple statements")) +inform (guard_loc, "some parts of macro expansion are not guarded by " + "this conditional"); Is the guard necessarily a "conditional"? I take a "conditional" to mean an "if"; the guard could be a "for" or a "while" (or an "else", which still seems something of a stretch to me to call a "conditional"). Suggestion: word "this conditional" as "this %qs clause" and either (a) rework the code in c-indentation.c's guard_tinfo_to_string so that it's shared between these two warnings (i.e. to go from a RID_ to a const char *), or (b) just pass in a const char * identifying the guard clause token. ... Likewise; is "conditional" the right word here? Also, whether of not the statements are actually "in" the body of the guard is the issue here. How about: "Warn about unsafe multiple statement macros that appear to be guarded by a clause such as if, else, while, or for, in which only the first statement is actually guarded after the macro is expanded." or somesuch? FWIW, I agree with David that "conditional" isn't entirely accurate. At the same time, referring to any of do, for, if, or switch as clauses isn't quite precise either(*). In the C language they are the names of selection and iteration statements, and what follows is called the controlling expression (with for being special) and the next thing is a substatement. I think many people will informally call the two a condition or conditional and the body. I don't have strong feelings about the current wording but if it should be tweaked for accuracy I would suggest to use the formal term "controlling expression", similarly to -Wswitch-unreachable. Martin PS [*] To be completely pedantic, the word clause in the C and C++ standards has a precise meaning: it refers to a chapter of the text (such as Scope, Conformance, Language, etc.)
[PATCH] Add -fsanitize={null,alignment} sanitization of aggregate arguments of calls (PR middle-end/81005)
Hi! I've noticed we don't sanitize aggregate arguments of calls. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? For asms I'm not 100% sure, the arguments might be unused in the asm, or used only conditionally etc. 2017-06-07 Jakub Jelinek PR middle-end/81005 * ubsan.c (instrument_null): Avoid pointless code temporary. (pass_ubsan::execute): Instrument aggregate arguments of calls. * c-c++-common/ubsan/align-10.c: New test. * c-c++-common/ubsan/null-13.c: New test. --- gcc/ubsan.c.jj 2017-05-21 15:46:13.0 +0200 +++ gcc/ubsan.c 2017-06-07 14:14:27.883227570 +0200 @@ -1212,8 +1212,7 @@ instrument_null (gimple_stmt_iterator gs if (TREE_CODE (t) == ADDR_EXPR) t = TREE_OPERAND (t, 0); tree base = get_base_address (t); - const enum tree_code code = TREE_CODE (base); - if (code == MEM_REF + if (TREE_CODE (base) == MEM_REF && TREE_CODE (TREE_OPERAND (base, 0)) == SSA_NAME) instrument_mem_ref (t, base, &gsi, is_lhs); } @@ -2003,6 +2002,20 @@ pass_ubsan::execute (function *fun) instrument_null (gsi, true); if (gimple_assign_single_p (stmt)) instrument_null (gsi, false); + if (is_gimple_call (stmt)) + { + unsigned args_num = gimple_call_num_args (stmt); + for (unsigned i = 0; i < args_num; ++i) + { + tree arg = gimple_call_arg (stmt, i); + if (is_gimple_reg (arg) || is_gimple_min_invariant (arg)) + continue; + tree base = get_base_address (arg); + if (TREE_CODE (base) == MEM_REF + && TREE_CODE (TREE_OPERAND (base, 0)) == SSA_NAME) + instrument_mem_ref (arg, base, &gsi, false); + } + } } if (flag_sanitize & (SANITIZE_BOOL | SANITIZE_ENUM) --- gcc/testsuite/c-c++-common/ubsan/align-10.c.jj 2017-06-07 15:32:18.545409340 +0200 +++ gcc/testsuite/c-c++-common/ubsan/align-10.c 2017-06-07 17:06:44.0 +0200 @@ -0,0 +1,39 @@ +/* Limit this to known non-strict alignment targets. */ +/* { dg-do run { target { i?86-*-linux* x86_64-*-linux* } } } */ +/* { dg-options "-O -fsanitize=alignment -fsanitize-recover=alignment" } */ + +struct R { int a; } r; +struct S { struct R a; char b; long long c; short d[10]; }; +struct T { char a; long long b; }; +struct U { char a; int b; int c; long long d; struct S e; struct T f; } __attribute__((packed)); +struct V { long long a; struct S b; struct T c; struct U u; } v; + +__attribute__((noinline, noclone)) int +bar (int x, struct R y, struct R z) +{ + return x + y.a; +} + +__attribute__((noinline, noclone)) int +foo (struct S *p, struct S *q) +{ + int i = bar (0, r, r); + i += bar (1, p->a, r); + i += bar (2, r, q->a); + return i; +} + +int +main () +{ + char *p = (char *) &v.u.e; + struct S *q, *r; + asm volatile ("" : "=r" (q) : "0" (p)); + asm volatile ("" : "=r" (r) : "0" (p)); + if (foo (q, r) != 3) +__builtin_abort (); + return 0; +} + +/* { dg-output "\.c:21:\[0-9]*: \[^\n\r]*member access within misaligned address 0x\[0-9a-fA-F]* for type 'struct S', which requires \[48] byte alignment.*" } */ +/* { dg-output "\.c:22:\[0-9]*: \[^\n\r]*member access within misaligned address 0x\[0-9a-fA-F]* for type 'struct S', which requires \[48] byte alignment" } */ --- gcc/testsuite/c-c++-common/ubsan/null-13.c.jj 2017-06-07 14:21:27.460145351 +0200 +++ gcc/testsuite/c-c++-common/ubsan/null-13.c 2017-06-07 16:54:18.576027478 +0200 @@ -0,0 +1,37 @@ +/* { dg-do run } */ +/* { dg-options "-fsanitize=null -fno-sanitize-recover=null -w" } */ +/* { dg-shouldfail "ubsan" } */ + +struct S { + int i; + long long j; + long long m; +}; +union U { + int k; + struct S l; +}; + +__attribute__((noinline, noclone)) int +foo (struct S s) +{ + return s.i + s.j + s.m; +} + +__attribute__((noinline, noclone)) int +bar (union U *u) +{ + foo (u->l); +} + +union U v; + +int +main (void) +{ + union U *u = 0; + asm volatile ("" : "+r" (u) : "r" (&v) : "memory"); + return bar (u); +} + +/* { dg-output "member access within null pointer of type 'union U'" } */ Jakub
Re: [PATCH] Add -fsanitize={null,alignment} sanitization of aggregate arguments of calls (PR middle-end/81005)
On June 8, 2017 8:15:28 PM GMT+02:00, Jakub Jelinek wrote: >Hi! > >I've noticed we don't sanitize aggregate arguments of calls. > >Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, >ok for trunk? OK. For asms I'm not 100% sure, the arguments might be >unused >in the asm, or used only conditionally etc. Valid points. Richard. >2017-06-07 Jakub Jelinek > > PR middle-end/81005 > * ubsan.c (instrument_null): Avoid pointless code temporary. > (pass_ubsan::execute): Instrument aggregate arguments of calls. > > * c-c++-common/ubsan/align-10.c: New test. > * c-c++-common/ubsan/null-13.c: New test. > >--- gcc/ubsan.c.jj 2017-05-21 15:46:13.0 +0200 >+++ gcc/ubsan.c2017-06-07 14:14:27.883227570 +0200 >@@ -1212,8 +1212,7 @@ instrument_null (gimple_stmt_iterator gs > if (TREE_CODE (t) == ADDR_EXPR) > t = TREE_OPERAND (t, 0); > tree base = get_base_address (t); >- const enum tree_code code = TREE_CODE (base); >- if (code == MEM_REF >+ if (TREE_CODE (base) == MEM_REF > && TREE_CODE (TREE_OPERAND (base, 0)) == SSA_NAME) > instrument_mem_ref (t, base, &gsi, is_lhs); > } >@@ -2003,6 +2002,20 @@ pass_ubsan::execute (function *fun) > instrument_null (gsi, true); > if (gimple_assign_single_p (stmt)) > instrument_null (gsi, false); >+if (is_gimple_call (stmt)) >+ { >+unsigned args_num = gimple_call_num_args (stmt); >+for (unsigned i = 0; i < args_num; ++i) >+ { >+tree arg = gimple_call_arg (stmt, i); >+if (is_gimple_reg (arg) || is_gimple_min_invariant (arg)) >+ continue; >+tree base = get_base_address (arg); >+if (TREE_CODE (base) == MEM_REF >+&& TREE_CODE (TREE_OPERAND (base, 0)) == SSA_NAME) >+ instrument_mem_ref (arg, base, &gsi, false); >+ } >+ } > } > > if (flag_sanitize & (SANITIZE_BOOL | SANITIZE_ENUM) >--- gcc/testsuite/c-c++-common/ubsan/align-10.c.jj 2017-06-07 >15:32:18.545409340 +0200 >+++ gcc/testsuite/c-c++-common/ubsan/align-10.c2017-06-07 >17:06:44.0 +0200 >@@ -0,0 +1,39 @@ >+/* Limit this to known non-strict alignment targets. */ >+/* { dg-do run { target { i?86-*-linux* x86_64-*-linux* } } } */ >+/* { dg-options "-O -fsanitize=alignment -fsanitize-recover=alignment" >} */ >+ >+struct R { int a; } r; >+struct S { struct R a; char b; long long c; short d[10]; }; >+struct T { char a; long long b; }; >+struct U { char a; int b; int c; long long d; struct S e; struct T f; >} __attribute__((packed)); >+struct V { long long a; struct S b; struct T c; struct U u; } v; >+ >+__attribute__((noinline, noclone)) int >+bar (int x, struct R y, struct R z) >+{ >+ return x + y.a; >+} >+ >+__attribute__((noinline, noclone)) int >+foo (struct S *p, struct S *q) >+{ >+ int i = bar (0, r, r); >+ i += bar (1, p->a, r); >+ i += bar (2, r, q->a); >+ return i; >+} >+ >+int >+main () >+{ >+ char *p = (char *) &v.u.e; >+ struct S *q, *r; >+ asm volatile ("" : "=r" (q) : "0" (p)); >+ asm volatile ("" : "=r" (r) : "0" (p)); >+ if (foo (q, r) != 3) >+__builtin_abort (); >+ return 0; >+} >+ >+/* { dg-output "\.c:21:\[0-9]*: \[^\n\r]*member access within >misaligned address 0x\[0-9a-fA-F]* for type 'struct S', which requires >\[48] byte alignment.*" } */ >+/* { dg-output "\.c:22:\[0-9]*: \[^\n\r]*member access within >misaligned address 0x\[0-9a-fA-F]* for type 'struct S', which requires >\[48] byte alignment" } */ >--- gcc/testsuite/c-c++-common/ubsan/null-13.c.jj 2017-06-07 >14:21:27.460145351 +0200 >+++ gcc/testsuite/c-c++-common/ubsan/null-13.c 2017-06-07 >16:54:18.576027478 +0200 >@@ -0,0 +1,37 @@ >+/* { dg-do run } */ >+/* { dg-options "-fsanitize=null -fno-sanitize-recover=null -w" } */ >+/* { dg-shouldfail "ubsan" } */ >+ >+struct S { >+ int i; >+ long long j; >+ long long m; >+}; >+union U { >+ int k; >+ struct S l; >+}; >+ >+__attribute__((noinline, noclone)) int >+foo (struct S s) >+{ >+ return s.i + s.j + s.m; >+} >+ >+__attribute__((noinline, noclone)) int >+bar (union U *u) >+{ >+ foo (u->l); >+} >+ >+union U v; >+ >+int >+main (void) >+{ >+ union U *u = 0; >+ asm volatile ("" : "+r" (u) : "r" (&v) : "memory"); >+ return bar (u); >+} >+ >+/* { dg-output "member access within null pointer of type 'union U'" } >*/ > > Jakub
Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.
On June 8, 2017 6:44:01 PM GMT+02:00, Joseph Myers wrote: >On Thu, 8 Jun 2017, Richard Biener wrote: > >> For a built-in this is generally valid. For plain isnan it depends >on >> what the standard says. >> >> You have to support taking the address of isnan anyway and thus >> expanding to a library call in that case. Why doesn't that not work? > >In the case of isnan there is the Unix98 non-type-generic function, so >it >should definitely work to take the address of that function as declared >in >the system headers. > >For the DEF_GCC_BUILTIN type-generic functions there may not be any >corresponding library function at all (as well as only being callable >with >the __builtin_* name). I haven't followed the patches in detail but I would suggest to move the lowering somewhere to gimple-fold.c so that late discovered direct calls are also lowered. Richard.
Go patch committed: lvalue context fixes in Unary_expression::do_get_backend
In the Go frontend a couple of the Backend::var_expression invocations in Unary_expression::do_get_backend were selecting "lvalue" context incorrectly (these var exprs were not in an assignment or LHS position). This patch by Than McIntosh fixes the problem by changing them back to "rvalue" context. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 249028) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -4b857cde45939f0e9f3cf89b9e347b6f6ebe0f8f +81d9f6d05c2bb92b2b3af02807713b6bed9bf053 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: gcc/go/gofrontend/expressions.cc === --- gcc/go/gofrontend/expressions.cc(revision 249028) +++ gcc/go/gofrontend/expressions.cc(working copy) @@ -4370,7 +4370,7 @@ Unary_expression::do_get_backend(Transla gogo->backend()->implicit_variable_set_init(implicit, buf, btype, true, copy_to_heap, false, bexpr); - bexpr = gogo->backend()->var_expression(implicit, VE_lvalue, loc); + bexpr = gogo->backend()->var_expression(implicit, VE_rvalue, loc); // If we are not copying a slice initializer to the heap, // then it can be changed by the program, so if it can @@ -4380,7 +4380,7 @@ Unary_expression::do_get_backend(Transla && this->expr_->type()->has_pointer()) { Bexpression* root = - gogo->backend()->var_expression(implicit, VE_lvalue, loc); + gogo->backend()->var_expression(implicit, VE_rvalue, loc); root = gogo->backend()->address_expression(root, loc); Type* type = Type::make_pointer_type(this->expr_->type()); gogo->add_gc_root(Expression::make_backend(root, type, loc)); @@ -4400,7 +4400,7 @@ Unary_expression::do_get_backend(Transla true, false, btype, loc); gogo->backend()->immutable_struct_set_init(decl, buf, true, false, btype, loc, bexpr); - bexpr = gogo->backend()->var_expression(decl, VE_lvalue, loc); + bexpr = gogo->backend()->var_expression(decl, VE_rvalue, loc); } go_assert(!this->create_temp_ || this->expr_->is_variable());
[PATCH] Fix reassoc range opt related ICE (PR tree-optimization/81003)
Hi! force_gimple_operand_gsi called by update_range_test can using match.pd simplifications sometimes return INTEGER_CST (especially when cunroll unrolled code isn't really optimized by forwprop/ccp and similar passes before reassoc2), but that is something not acceptable to the rest of the optimize_range* code, because it needs to know not just the value, but also some gimple_stmt_iterator to insert related code etc. This patch makes sure we have a SSA_NAME even in that case. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2017-06-08 Jakub Jelinek PR tree-optimization/81003 * tree-ssa-reassoc.c (force_into_ssa_name): New function. (update_range_test): Use it instead of force_gimple_operand_gsi. * gcc.c-torture/compile/pr81003.c: New test. --- gcc/tree-ssa-reassoc.c.jj 2017-06-05 11:58:14.0 +0200 +++ gcc/tree-ssa-reassoc.c 2017-06-08 12:20:47.619790903 +0200 @@ -2282,6 +2282,26 @@ range_entry_cmp (const void *a, const vo } } +/* Helper function for update_range_test. Force EXPR into an SSA_NAME, + insert needed statements BEFORE or after GSI. */ + +static tree +force_into_ssa_name (gimple_stmt_iterator *gsi, tree expr, bool before) +{ + enum gsi_iterator_update m = before ? GSI_SAME_STMT : GSI_CONTINUE_LINKING; + tree ret = force_gimple_operand_gsi (gsi, expr, true, NULL_TREE, before, m); + if (TREE_CODE (ret) != SSA_NAME) +{ + gimple *g = gimple_build_assign (make_ssa_name (TREE_TYPE (ret)), ret); + if (before) + gsi_insert_before (gsi, g, GSI_SAME_STMT); + else + gsi_insert_after (gsi, g, GSI_CONTINUE_LINKING); + ret = gimple_assign_lhs (g); +} + return ret; +} + /* Helper routine of optimize_range_test. [EXP, IN_P, LOW, HIGH, STRICT_OVERFLOW_P] is a merged range for RANGE and OTHERRANGE through OTHERRANGE + COUNT - 1 ranges, @@ -2393,15 +2413,13 @@ update_range_test (struct range_entry *r else if (op != range->exp) { gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); - tem = force_gimple_operand_gsi (&gsi, tem, true, NULL_TREE, true, - GSI_SAME_STMT); + tem = force_into_ssa_name (&gsi, tem, true); gsi_prev (&gsi); } else if (gimple_code (stmt) != GIMPLE_PHI) { gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING); - tem = force_gimple_operand_gsi (&gsi, tem, true, NULL_TREE, false, - GSI_CONTINUE_LINKING); + tem = force_into_ssa_name (&gsi, tem, false); } else { @@ -2419,8 +2437,7 @@ update_range_test (struct range_entry *r } } gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); - tem = force_gimple_operand_gsi (&gsi, tem, true, NULL_TREE, true, - GSI_SAME_STMT); + tem = force_into_ssa_name (&gsi, tem, true); if (gsi_end_p (gsi)) gsi = gsi_last_bb (gimple_bb (stmt)); else --- gcc/testsuite/gcc.c-torture/compile/pr81003.c.jj2017-06-08 12:16:20.284127013 +0200 +++ gcc/testsuite/gcc.c-torture/compile/pr81003.c 2017-06-08 12:16:07.0 +0200 @@ -0,0 +1,10 @@ +/* PR tree-optimization/81003 */ + +unsigned int a, b; + +void +foo (void) +{ + for (b = 0; b < 13; b += 2) +a &= !!b; +} Jakub
libgo patch committed: Update to 1.8.3 release
This patch updates libgo to the Go 1.8.3 release. This is a fairly minor patch for a minor release. Bootstrapped and ran Go tests on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 249029) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -81d9f6d05c2bb92b2b3af02807713b6bed9bf053 +82961ce59e8bb02598d963d2a05b3acca860d9dd The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: libgo/MERGE === --- libgo/MERGE (revision 249028) +++ libgo/MERGE (working copy) @@ -1,4 +1,4 @@ -a4c18f063b6659079ca2848ca217a0587dabc001 +352996a381701cfa0c16e8de29cbde8f3922182f The first line of this file holds the git revision number of the last merge done from the master library sources. Index: libgo/VERSION === --- libgo/VERSION (revision 249028) +++ libgo/VERSION (working copy) @@ -1 +1 @@ -go1.8.1 +go1.8.3 Index: libgo/go/cmd/go/build.go === --- libgo/go/cmd/go/build.go(revision 249028) +++ libgo/go/cmd/go/build.go(working copy) @@ -3133,6 +3133,26 @@ func (b *builder) ccompile(p *Package, o desc := p.ImportPath output, err := b.runOut(p.Dir, desc, nil, compiler, flags, "-o", outfile, "-c", file) if len(output) > 0 { + // On FreeBSD 11, when we pass -g to clang 3.8 it + // invokes its internal assembler with -dwarf-version=2. + // When it sees .section .note.GNU-stack, it warns + // "DWARF2 only supports one section per compilation unit". + // This warning makes no sense, since the section is empty, + // but it confuses people. + // We work around the problem by detecting the warning + // and dropping -g and trying again. + if bytes.Contains(output, []byte("DWARF2 only supports one section per compilation unit")) { + newFlags := make([]string, 0, len(flags)) + for _, f := range flags { + if !strings.HasPrefix(f, "-g") { + newFlags = append(newFlags, f) + } + } + if len(newFlags) < len(flags) { + return b.ccompile(p, outfile, newFlags, file, compiler) + } + } + b.showOutput(p.Dir, desc, b.processOutput(output)) if err != nil { err = errPrintedOutput Index: libgo/go/crypto/elliptic/elliptic_test.go === --- libgo/go/crypto/elliptic/elliptic_test.go (revision 249028) +++ libgo/go/crypto/elliptic/elliptic_test.go (working copy) @@ -300,6 +300,29 @@ var p224BaseMultTests = []baseMultTest{ }, } +type scalarMultTest struct { + k string + xIn, yIn string + xOut, yOut string +} + +var p256MultTests = []scalarMultTest{ + { + "2a265f8bcbdcaf94d58519141e578124cb40d64a501fba9c11847b28965bc737", + "023819813ac969847059028ea88a1f30dfbcde03fc791d3a252c6b41211882ea", + "f93e4ae433cc12cf2a43fc0ef26400c0e125508224cdb649380f25479148a4ad", + "4d4de80f1534850d261075997e3049321a0864082d24a917863366c0724f5ae3", + "a22d2b7f7818a3563e0f7a76c9bf0921ac55e06e2e4d11795b233824b1db8cc0", + }, + { + "313f72ff9fe811bf573176231b286a3bdb6f1b14e05c40146590727a71c3bccd", + "cc11887b2d66cbae8f4d306627192522932146b42f01d3c6f92bd5c8ba739b06", + "a2f08a029cd06b46183085bae9248b0ed15b70280c7ef13a457f5af382426031", + "831c3f6b5f762d2f461901577af41354ac5f228c2591f84f8a6e51e2e3f17991", + "93f90934cd0ef2c698cc471c60a93524e87ab31ca2412252337f364513e43684", + }, +} + func TestBaseMult(t *testing.T) { p224 := P224() for i, e := range p224BaseMultTests { @@ -379,6 +402,19 @@ func TestP256Mult(t *testing.T) { break } } + + for i, e := range p256MultTests { + x, _ := new(big.Int).SetString(e.xIn, 16) + y, _ := new(big.Int).SetString(e.yIn, 16) + k, _ := new(big.Int).SetString(e.k, 16) + expectedX, _ := new(big.Int).SetString(e.xOut, 16) + expectedY, _ := new(big.Int).SetString(e.yOut, 16) + + xx, yy := p256.ScalarMult(x, y, k.Bytes()) + if xx.Cmp(expectedX) != 0 || yy.Cmp(expectedY) != 0 { + t.Errorf("#%d: got (%x, %x), want (%x,
[committed] Fix error-recovery on OpenMP clause with unusable copy-ctor/dtor (PR c++/81011)
Hi! If we can't copy construct, or destruct etc. a privatized variable, for error-recovery we turn it into a shared clause that doesn't need that. But starting with GCC 6 there are two OMP_CLAUSE_SHARED_* bits that mean something different on other clauses, so we need to clear them. Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk and 7.x/6.x. 2017-06-08 Jakub Jelinek PR c++/81011 * cp-gimplify.c (cxx_omp_finish_clause): When changing clause to OMP_CLAUSE_SHARED, also clear OMP_CLAUSE_SHARED_FIRSTPRIVATE and OMP_CLAUSE_SHARED_READONLY flags. * g++.dg/gomp/pr81011.C: New test. --- gcc/cp/cp-gimplify.c.jj 2017-06-07 10:45:32.0 +0200 +++ gcc/cp/cp-gimplify.c2017-06-08 13:24:48.639272627 +0200 @@ -1912,7 +1912,11 @@ cxx_omp_finish_clause (tree c, gimple_se make_shared = true; if (make_shared) -OMP_CLAUSE_CODE (c) = OMP_CLAUSE_SHARED; +{ + OMP_CLAUSE_CODE (c) = OMP_CLAUSE_SHARED; + OMP_CLAUSE_SHARED_FIRSTPRIVATE (c) = 0; + OMP_CLAUSE_SHARED_READONLY (c) = 0; +} } /* Return true if DECL's DECL_VALUE_EXPR (if any) should be --- gcc/testsuite/g++.dg/gomp/pr81011.C.jj 2017-06-08 13:33:28.226656742 +0200 +++ gcc/testsuite/g++.dg/gomp/pr81011.C 2017-06-08 13:33:07.0 +0200 @@ -0,0 +1,19 @@ +// PR c++/81011 +// { dg-do compile } + +class A { A (const A&); }; // { dg-message "declared private here" } +void foo (const A&); + +void +bar (A& a) +{ +#pragma omp task // { dg-error "is private within this context" } + foo (a); +} + +void +baz (A& a) +{ +#pragma omp task firstprivate (a) // { dg-error "is private within this context" } + foo (a); +} Jakub
Re: RFC: [PATCH] Add warn_if_not_aligned attribute
On 06/08/2017 11:00 AM, H.J. Lu wrote: On Wed, Jun 7, 2017 at 6:30 AM, H.J. Lu wrote: On Tue, Jun 6, 2017 at 5:11 PM, Martin Sebor wrote: On 06/06/2017 04:57 PM, H.J. Lu wrote: On Tue, Jun 6, 2017 at 10:34 AM, Martin Sebor wrote: On 06/06/2017 10:59 AM, H.J. Lu wrote: On Tue, Jun 6, 2017 at 9:10 AM, Martin Sebor wrote: On 06/06/2017 10:07 AM, Martin Sebor wrote: On 06/05/2017 11:45 AM, H.J. Lu wrote: On Mon, Jun 5, 2017 at 8:11 AM, Joseph Myers wrote: The new attribute needs documentation. Should the test be in c-c++-common This feature does support C++. But C++ compiler issues a slightly different warning at a different location. or does this feature not support C++? Here is the updated patch with documentation and a C++ test. This patch caused a few testsuite failures: FAIL: gcc.dg/compat/struct-align-1 c_compat_x_tst.o compile /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/compat//struct-align-1.h:169:1: warning: alignment 1 of 'struct B2_m_inner_p_outer' is less than 16 FAIL: g++.dg/torture/pr80334.C -O0 (test for excess errors) /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/torture/pr80334.C:4:8: warning: alignment 1 of 'B' is less than 16 Users often want the ability to control a warning, even when it certainly indicates a bug. I would suggest to add an option to make it possible for this warning as well. Btw., a bug related to some of those this warning is meant to detect is assigning the address of an underaligned object to a pointer of a natively aligned type. Clang has an option to detect this problem: -Waddress-of-packed-member. It might make a nice follow-on enhancement to add support for the same thing. I mention this because I think it would make sense to consider this when choosing the name of the GCC option (i.e., rather than having two distinct but closely related warnings, have one that detects both of these alignment type of bugs. A bug that has some additional context on this is pr 51628. A possible name for the new option suggested there is -Wpacked. Martin Isn't -Waddress-of-packed-member a subset of or the same as -Wpacked? In Clang it's neither. -Waddress-of-packed-member only triggers when the address of a packed member is taken but not for the cases in bug 53037 (i.e., reducing the alignment of a member). It's also enabled by default, while -Wpacked needs to be specified explicitly (i.e., it's in neither -Wall or -Wextra). FWIW, I don't really have a strong opinion about the names of the options. My input is that the proliferation of fine-grained warning options for closely related problems tends to make it tricky to get their interactions right (both in the compiler and for users). Enabling both/all such options can lead to multiple warnings for what boils down to essentially the same bug in the same expression, overwhelming the user in repetitive diagnostics. There is already -Wpacked. Should I overload it for this? I'd say yes if -Wpacked were at least in -Wall. But it's an opt-in kind of warning that's not even in -Wextra, and relaxing an explicitly specified alignment seems more like a bug than just something that might be nice to know about. I would expect warn_if_not_aligned to trigger a warning even without -Wall (i.e., as you have it in your patch, but with an option to control it). That would suggest three levels of warnings: 1) warn by default (warn_if_not_aligned violation) 2) warn with -Wall (using a type with attribute aligned to define a member of a packed struct) 3) warn if requested (current -Wpacked) So one way to deal with it would be to change -Wpacked to take an argument between 0 and 3, set the default to correspond to the (1) above, and have -Wall bump it up to (2). If the equivalent of -Waddress-of-packed-member were to be implemented in GCC it would probably be a candidate to add to the (2) above.(*) This might be more involved than you envisioned. A slightly simpler alternative would be to add a different option, say something like -Walign=N, and have it handle just (1) and (2) above, leaving -Wpacked unchanged. Since there is no agreement on -W options and changes may touch many places, I will do 1. Add -Wwarn-if-not-aligned and enable it by default. 2. Add -Wpacked-not-aligned and disable it by default. Once there is an agreement, I replace -Wpacked-not-aligned with the new option. Okay. I can't approve the patch but thank you for enhancing your patch to handle the additional case I brought up! Martin Here is the updated patch. Add warn_if_not_aligned attribute as well as command line options: -Wif-not-aligned and -Wpacked-not-aligned. __attribute__((warn_if_not_aligned(N))) causes compiler to issue a warning if the field in a struct or union is not aligned to N: typedef unsigned long long __u64 __attribute__((aligned(4),warn_if_not_aligned(8))); struct foo { int i1; int i2; __u64 x; }; __u64 is aligned to 4 b
[committed] Fix ICE in OpenMP array section handling (PR c/81006)
Hi! Apparently TYPE_MAX_VALUE (TYPE_DOMAIN (type)) on some arrays doesn't have expected sizetype type, but instead has ssizetype -1, which causes the size_binop verification to ICE. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk, 7.x, 6.x and 5.x (on the last one with additional xfail for the diagnostics - we only fix the ICE, but don't introduce something we failed to diagnose). 2017-06-08 Jakub Jelinek PR c/81006 * c-typeck.c (handle_omp_array_sections_1): Convert TYPE_MAX_VALUE to sizetype before size_binop. * semantics.c (handle_omp_array_sections_1): Convert TYPE_MAX_VALUE to sizetype before size_binop. * c-c++-common/gomp/pr81006.c: New test. --- gcc/c/c-typeck.c.jj 2017-06-02 09:01:07.0 +0200 +++ gcc/c/c-typeck.c2017-06-08 14:46:53.940730138 +0200 @@ -12362,9 +12362,9 @@ handle_omp_array_sections_1 (tree c, tre && TREE_CODE (TYPE_MAX_VALUE (TYPE_DOMAIN (type))) == INTEGER_CST) { - tree size = size_binop (PLUS_EXPR, - TYPE_MAX_VALUE (TYPE_DOMAIN (type)), - size_one_node); + tree size + = fold_convert (sizetype, TYPE_MAX_VALUE (TYPE_DOMAIN (type))); + size = size_binop (PLUS_EXPR, size, size_one_node); if (TREE_CODE (low_bound) == INTEGER_CST) { if (tree_int_cst_lt (size, low_bound)) --- gcc/cp/semantics.c.jj 2017-06-02 09:01:19.0 +0200 +++ gcc/cp/semantics.c 2017-06-08 14:40:34.141432913 +0200 @@ -4731,9 +4731,9 @@ handle_omp_array_sections_1 (tree c, tre && TREE_CODE (TYPE_MAX_VALUE (TYPE_DOMAIN (type))) == INTEGER_CST) { - tree size = size_binop (PLUS_EXPR, - TYPE_MAX_VALUE (TYPE_DOMAIN (type)), - size_one_node); + tree size + = fold_convert (sizetype, TYPE_MAX_VALUE (TYPE_DOMAIN (type))); + size = size_binop (PLUS_EXPR, size, size_one_node); if (TREE_CODE (low_bound) == INTEGER_CST) { if (tree_int_cst_lt (size, low_bound)) --- gcc/testsuite/c-c++-common/gomp/pr81006.c.jj2017-06-08 14:50:15.195219648 +0200 +++ gcc/testsuite/c-c++-common/gomp/pr81006.c 2017-06-08 14:49:50.0 +0200 @@ -0,0 +1,10 @@ +/* PR c/81006 */ +/* { dg-do compile } */ + +int a[] = {}; + +void foo() +{ + #pragma omp task depend(out: a[:]) /* { dg-error "zero length array section in .depend. clause" } */ +{} +} Jakub
[C++ PATCH] Fix sanitization ICE (PR c++/80973)
Hi! cp_genericize_r now instruments INTEGER_CSTs that have REFERENCE_TYPE, so that we can diagnose binding references to NULL in some cases, see PR79572. As the following testcase shows, there is one exception when we do not want to do that - in MEM_EXPR, the second operand is an INTEGER_CST whose value is an offset, but type is something unrelated - what should be used for aliasing purposes. So, that is something we do not want to diagnose, and it is also invalid IL, as the second argument has to be an INTEGER_CST, not some expression with side-effects. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/7.x? 2017-06-08 Jakub Jelinek PR c++/80973 * cp-gimplify.c (cp_genericize_r): Don't instrument MEM_REF second argument even if it has REFERENCE_TYPE. * g++.dg/ubsan/pr80973.C: New test. --- gcc/cp/cp-gimplify.c.jj 2017-06-08 13:24:48.0 +0200 +++ gcc/cp/cp-gimplify.c2017-06-08 17:48:13.466868875 +0200 @@ -1476,6 +1476,21 @@ cp_genericize_r (tree *stmt_p, int *walk cp_ubsan_maybe_instrument_member_call (stmt); } } + else if (TREE_CODE (stmt) == MEM_REF) + { + /* For MEM_REF, make sure not to sanitize the second operand even +if it has reference type. It is just an offset with a type +holding other information. */ + if (TREE_CODE (TREE_OPERAND (stmt, 1)) == INTEGER_CST + && (TREE_CODE (TREE_TYPE (TREE_OPERAND (stmt, 1))) + == REFERENCE_TYPE) + && (flag_sanitize & (SANITIZE_NULL | SANITIZE_ALIGNMENT))) + { + cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_genericize_r, + data, NULL); + *walk_subtrees = 0; + } + } } p_set->add (*stmt_p); --- gcc/testsuite/g++.dg/ubsan/pr80973.C.jj 2017-06-08 17:49:55.491622907 +0200 +++ gcc/testsuite/g++.dg/ubsan/pr80973.C2017-06-08 17:49:51.056677069 +0200 @@ -0,0 +1,16 @@ +// PR c++/80973 +// { dg-do compile } +// { dg-options "-fsanitize=undefined -std=c++14" } + +struct A { + A(); + A(const A &); +}; +struct B { + B(); + template auto g(Args &&... p1) { +return [=] { f(p1...); }; + } + void f(A, const char *); +}; +B::B() { g(A(), ""); } Jakub
[PATCH] Fix mpx testcases (Re: [CHKP] Fix for PR79990)
On Tue, May 09, 2017 at 03:29:40PM +0200, Alexander Ivchenko wrote: > 2017-05-09 Alexander Ivchenko > > * gcc.target/i386/mpx/hard-reg-2-lbv.c: New test. > * gcc.target/i386/mpx/hard-reg-2-nov.c: New test. > * gcc.target/i386/mpx/hard-reg-2-ubv.c: New test. These tests fail for me on i686, without -msse2 there is no "xmm0" register one can use. The following patch fixes it, tested on x86_64-linux and i686-linux, ok for trunk? 2017-06-08 Jakub Jelinek * gcc.target/i386/mpx/hard-reg-1-nov.c (mpx_test): Use "esp" instead of "rsp" for -m32. * gcc.target/i386/mpx/hard-reg-2-lbv.c: Require sse2_runtime effective target, add -msse2 to dg-options. * gcc.target/i386/mpx/hard-reg-2-nov.c: Likewise. * gcc.target/i386/mpx/hard-reg-2-ubv.c: Likewise. --- gcc/testsuite/gcc.target/i386/mpx/hard-reg-1-nov.c.jj 2015-03-10 16:56:41.0 +0100 +++ gcc/testsuite/gcc.target/i386/mpx/hard-reg-1-nov.c 2017-06-08 21:37:00.357993146 +0200 @@ -13,7 +13,11 @@ int rd (int *p, int i) int mpx_test (int argc, const char **argv) { +#ifdef __x86_64__ register int *frame __asm__("rsp"); +#else + register int *frame __asm__("esp"); +#endif rd (frame, 1); return 0; --- gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-lbv.c.jj 2017-06-08 17:53:25.0 +0200 +++ gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-lbv.c 2017-06-08 21:37:23.772718716 +0200 @@ -1,6 +1,6 @@ -/* { dg-do run } */ +/* { dg-do run { target sse2_runtime } } */ /* { dg-shouldfail "bounds violation" } */ -/* { dg-options "-fcheck-pointer-bounds -mmpx" } */ +/* { dg-options "-fcheck-pointer-bounds -mmpx -msse2" } */ #define SHOULDFAIL --- gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-nov.c.jj 2017-06-08 17:53:25.0 +0200 +++ gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-nov.c 2017-06-08 21:37:35.517581062 +0200 @@ -1,5 +1,5 @@ -/* { dg-do run } */ -/* { dg-options "-fcheck-pointer-bounds -mmpx" } */ +/* { dg-do run { target sse2_runtime } } */ +/* { dg-options "-fcheck-pointer-bounds -mmpx -msse2" } */ #include "mpx-check.h" --- gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-ubv.c.jj 2017-06-08 17:53:25.0 +0200 +++ gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-ubv.c 2017-06-08 21:37:49.910412372 +0200 @@ -1,6 +1,6 @@ -/* { dg-do run } */ +/* { dg-do run { target sse2_runtime } } */ /* { dg-shouldfail "bounds violation" } */ -/* { dg-options "-fcheck-pointer-bounds -mmpx" } */ +/* { dg-options "-fcheck-pointer-bounds -mmpx -msse2" } */ #define SHOULDFAIL Jakub
Re: [PATCH] testsuite: example plugin for spellchecking comments
On Jun 8, 2017, at 3:07 AM, David Malcolm wrote: > > Given that Enchant seems a stretch as a hard dependency, But, by using autoconf, it isn't a hard dependency. If it is there, it is built, if it isn't, it isn't. I think it should be trivial (10-20 lines) to do this. If you want to ship this, I think it makes more sense to just do that. Build and install in the usual way, conditionalizing the build as necessary. I think the autoconf code is 10-20 lines. We could drop this into the testsuite as a test of the infrastructure, if you want; but if you'd like to ship it, I'd rather just start it in the source tree. If you'd prefer to not ship it for now and just have it live in the testsuite, the patch is Ok. I'd like to see it shipped.
Re: [PR80693] drop value of parallel SETs dropped by combine
Hi! [ I missed this patch the first time around; please cc: me to prevent this ] On Thu, May 18, 2017 at 07:25:57AM -0300, Alexandre Oliva wrote: > When an insn used by combine has multiple SETs, only the non-REG_UNUSED > set is used: others will end up dropped on the floor. Sometimes, yes; not always. > We have to take > note of the dropped REG_UNUSED SETs, clearing their cached values, so > that, even if the REGs remain used (e.g. because they were referenced > in the used SET_SRC), we will not use properties of the latest value > as if they applied to the earlier one. The reg_stat stuff is no end of pain, sigh. > PR rtl-optimization/80693 > * combine.c (distribute_notes): Add IDEST parameter. Reset any > REG_UNUSED REGs that are not IDEST, if IDEST is given. Adjust > all callers. Most callers use NULL_RTX for idest. It isn't obvious to me that this is correct. > @@ -14087,6 +14090,26 @@ distribute_notes (rtx notes, rtx_insn *from_insn, > rtx_insn *i3, rtx_insn *i2, > PUT_REG_NOTE_KIND (note, REG_DEAD); > place = i3; > } > + > + /* If there were any parallel sets in FROM_INSN other than > + the one setting IDEST, it must be REG_UNUSED, otherwise > + we could not have used FROM_INSN in combine. Since this > + combine attempt succeeded, we know this unused SET was > + dropped on the floor, because the insn was either deleted > + or created from a new pattern that does not use its > + SET_DEST. We must forget whatever we knew about the > + value that was stored by that SET, since the prior value > + may still be present in IDEST's src expression or > + elsewhere, and we do not want to use properties of the > + dropped value as if they applied to the prior one when > + simplifying e.g. subsequent combine attempts. */ > + if (idest && XEXP (note, 0) != idest) Would it work to just have "else" instead if this "if"? Or hrm, we'll need to kill the recorded reg_stat value in the last case before this as well? > + { > + gcc_assert (REG_P (XEXP (note, 0))); > + record_value_for_reg (XEXP (note, 0), NULL, NULL_RTX); > + INC_REG_N_SETS (REGNO (XEXP (note, 0)), -1); > + } > + > break; Could you try that out? Or I can do it, let me know what you prefer. Thanks, Segher
Re: [PATCH] Fix mpx testcases (Re: [CHKP] Fix for PR79990)
2017-06-08 22:45 GMT+03:00 Jakub Jelinek : > On Tue, May 09, 2017 at 03:29:40PM +0200, Alexander Ivchenko wrote: >> 2017-05-09 Alexander Ivchenko >> >> * gcc.target/i386/mpx/hard-reg-2-lbv.c: New test. >> * gcc.target/i386/mpx/hard-reg-2-nov.c: New test. >> * gcc.target/i386/mpx/hard-reg-2-ubv.c: New test. > > These tests fail for me on i686, without -msse2 there is no > "xmm0" register one can use. > > The following patch fixes it, tested on x86_64-linux and i686-linux, > ok for trunk? OK. Thanks for the fix. Ilya > > 2017-06-08 Jakub Jelinek > > * gcc.target/i386/mpx/hard-reg-1-nov.c (mpx_test): Use "esp" > instead of "rsp" for -m32. > * gcc.target/i386/mpx/hard-reg-2-lbv.c: Require sse2_runtime effective > target, add -msse2 to dg-options. > * gcc.target/i386/mpx/hard-reg-2-nov.c: Likewise. > * gcc.target/i386/mpx/hard-reg-2-ubv.c: Likewise. > > --- gcc/testsuite/gcc.target/i386/mpx/hard-reg-1-nov.c.jj 2015-03-10 > 16:56:41.0 +0100 > +++ gcc/testsuite/gcc.target/i386/mpx/hard-reg-1-nov.c 2017-06-08 > 21:37:00.357993146 +0200 > @@ -13,7 +13,11 @@ int rd (int *p, int i) > > int mpx_test (int argc, const char **argv) > { > +#ifdef __x86_64__ >register int *frame __asm__("rsp"); > +#else > + register int *frame __asm__("esp"); > +#endif >rd (frame, 1); > >return 0; > --- gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-lbv.c.jj 2017-06-08 > 17:53:25.0 +0200 > +++ gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-lbv.c 2017-06-08 > 21:37:23.772718716 +0200 > @@ -1,6 +1,6 @@ > -/* { dg-do run } */ > +/* { dg-do run { target sse2_runtime } } */ > /* { dg-shouldfail "bounds violation" } */ > -/* { dg-options "-fcheck-pointer-bounds -mmpx" } */ > +/* { dg-options "-fcheck-pointer-bounds -mmpx -msse2" } */ > > > #define SHOULDFAIL > --- gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-nov.c.jj 2017-06-08 > 17:53:25.0 +0200 > +++ gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-nov.c 2017-06-08 > 21:37:35.517581062 +0200 > @@ -1,5 +1,5 @@ > -/* { dg-do run } */ > -/* { dg-options "-fcheck-pointer-bounds -mmpx" } */ > +/* { dg-do run { target sse2_runtime } } */ > +/* { dg-options "-fcheck-pointer-bounds -mmpx -msse2" } */ > > #include "mpx-check.h" > > --- gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-ubv.c.jj 2017-06-08 > 17:53:25.0 +0200 > +++ gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-ubv.c 2017-06-08 > 21:37:49.910412372 +0200 > @@ -1,6 +1,6 @@ > -/* { dg-do run } */ > +/* { dg-do run { target sse2_runtime } } */ > /* { dg-shouldfail "bounds violation" } */ > -/* { dg-options "-fcheck-pointer-bounds -mmpx" } */ > +/* { dg-options "-fcheck-pointer-bounds -mmpx -msse2" } */ > > > #define SHOULDFAIL > > > Jakub
Re: [PATCH] warn on mem calls modifying objects of non-trivial types (PR 80560)
On 06/07/2017 07:09 PM, Jason Merrill wrote: On 06/06/2017 03:24 PM, Martin Sebor wrote: + /* Iterate over copy and move assignment overloads. */ + + for (ovl_iterator oi (fns); oi; ++oi) +{ + tree f = *oi; + + bool accessible = !access || !(TREE_PRIVATE (f) || TREE_PROTECTED (f)); + + /* Skip template assignment operators and deleted functions. */ + if (TREE_CODE (f) != FUNCTION_DECL || DECL_DELETED_FN (f)) +continue; + + if (accessible) +*hasassign = true; + + if (!accessible || !trivial_fn_p (f)) +all_trivial = false; + + /* Break early when both properties have been determined. */ + if (*hasassign && !all_trivial) +break; +} This is iterating over all assignment operators, not just copy/move. I think you want copy_fn_p, here and in has_trivial_copy_p. I assumed the cp_ in cp_assignment_operator_id meant copy. Clearly, even a 1,500 line test wasn't enough to catch this mistake. I've added more test cases. FWIW, giving the macro a more descriptive name and/or adding a comment above it explaining that's any assignment would help. I also expected to see a similar thing for ctor but the closest I could find was the three xxx_ctor_identifiers (or, conversely, I expected to see an assignment_identifier). I picked one based on some of its uses, but I'm sure that it's the best choice (ctor_identifier seems to work too). It seems redundant to check access here and also check is is_trivially_xible, which takes access into account. The access check is done separately so that the right suggestion can be made. If the class is (non-trivially) copy assignable but none of the copy assignment operators is public then the warning doesn't suggest to use the copy assignment if it's not accessible. And if we're going to check access at all, it should consider the access of the current scope, not just whether the function is private or protected. Okay, I've added this. Please see the attached update. Martin PR c++/80560 - warn on undefined memory operations involving non-trivial types gcc/c-family/ChangeLog: PR c++/80560 * c.opt (-Wclass-memaccess): New option. gcc/cp/ChangeLog: PR c++/80560 * call.c (first_non_public_field, maybe_warn_class_memaccess): New functions. (has_trivial_copy_assign_p, has_trivial_copy_p): Ditto. (build_cxx_call): Call maybe_warn_class_memaccess. gcc/ChangeLog: PR c++/80560 * dumpfile.c (dump_register): Avoid calling memset to initialize a class with a default ctor. * gcc.c (struct compiler): Remove const qualification. * genattrtab.c (gen_insn_reserv): Replace memset with initialization. * hash-table.h: Ditto. * ipa-cp.c (allocate_and_init_ipcp_value): Replace memset with assignment. * ipa-prop.c (ipa_free_edge_args_substructures): Ditto. * omp-low.c (lower_omp_ordered_clauses): Replace memset with default ctor. * params.h (struct param_info): Make struct members non-const. * tree-switch-conversion.c (emit_case_bit_tests): Replace memset with default initialization. * vec.h (vec_copy_construct, vec_default_construct): New helper functions. (vec::copy, vec::splice, vec::reserve): Replace memcpy with vec_copy_construct. (vect::quick_grow_cleared): Replace memset with default ctor. (vect::vec_safe_grow_cleared, vec_safe_grow_cleared): Same. * doc/invoke.texi (-Wclass-memaccess): Document. libcpp/ChangeLog: PR c++/80560 * line-map.c (line_maps::~line_maps): Avoid calling htab_delete with a null pointer. (linemap_init): Avoid calling memset on an object of a non-trivial type. libitm/ChangeLog: PR c++/80560 * beginend.cc (GTM::gtm_thread::rollback): Avoid calling memset on an object of a non-trivial type. (GTM::gtm_transaction_cp::commit): Use assignment instead of memcpy to copy an object. * method-ml.cc (orec_iterator::reinit): Avoid -Wclass-memaccess. gcc/testsuite/ChangeLog: PR c++/80560 * g++.dg/Wclass-memaccess.C: New test. diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 37bb236..363d104 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -804,6 +804,10 @@ Wnon-template-friend C++ ObjC++ Var(warn_nontemplate_friend) Init(1) Warning Warn when non-templatized friend functions are declared within a template. +Wclass-memaccess +C++ ObjC++ Var(warn_class_memaccess) Warning LangEnabledBy(C++ ObjC++, Wall) +Warn for unsafe raw memory writes to objects of class types. + Wnon-virtual-dtor C++ ObjC++ Var(warn_nonvdtor) Warning LangEnabledBy(C++ ObjC++,Weffc++) Warn about non-virtual destructors. diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 51260f0..6e0951e 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -8146,6 +8146,402 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain) return call; } +/* Return the DECL of the first non-public data member of class TYPE + or null if none can be found. */ + +static tree +first_non_public_field (tree type) +{ + if (!CLASS_TYPE_P (type)) +return NULL_TRE
Fix genmultilib reuse rule checks for large sets of option combinations
genmultilib computes combination_space, a list of all combinations of options in MULTILIB_OPTIONS that might have multilibs built for them (some of which may end up not having multilibs built for them, and some of those may end up being mapped to other multilibs with MULTILIB_REUSE). It is then used to validate the right hand part of MULTILIB_REUSE rules, checking with expr that combination_space matches a basic regular expression derived from that right hand part. There are two problems with this approach to validation: * It requires that right hand part to have options in the same order as in MULTILIB_OPTIONS, in contradiction to the documentation of MULTILIB_REUSE saying that order does not matter there. * combination_space can be so large that the expr call fails with an E2BIG error. I have a local ARM configuration with 40 multilibs but 3840 combinations of options from MULTILIB_OPTIONS (so 3839 listed in combination_space, since it doesn't list the default multilib) and 996 MULTILIB_REUSE rules. This generates a combination_space string longer than the Linux kernel's MAX_ARG_STRLEN (PAGE_SIZE * 32, the limit on the length of a single argv string), so that expr cannot be run. This patch changes the validation approach to generate a much shorter extended regular expression for any sequence of multilib options in any order, and uses that for the validation instead. Tested with a built for arm-none-eabi --with-multilib-list=aprofile (as a configuration that uses MULTILIB_REUSE). 2017-06-08 Joseph Myers * genmultilib (combination_space): Remove variable. Validate reuse rules against regular expression for any sequence of multilib options in any order. Index: gcc/genmultilib === --- gcc/genmultilib (revision 249028) +++ gcc/genmultilib (working copy) @@ -186,8 +186,7 @@ EOF chmod +x tmpmultilib -combination_space=`initial=/ ./tmpmultilib ${options}` -combinations="$combination_space" +combinations=`initial=/ ./tmpmultilib ${options}` # If there exceptions, weed them out now if [ -n "${exceptions}" ]; then @@ -460,6 +459,15 @@ echo "NULL" echo "};" +# Generate a regular expression to validate option combinations. +options_re= +for set in ${options}; do + for opt in `echo ${set} | sed -e 's_[/|]_ _g'`; do +options_re="${options_re}${options_re:+|}${opt}" + done +done +options_re="^/((${options_re})/)*\$" + # Output rules used for multilib reuse. echo "" echo "static const char *const multilib_reuse_raw[] = {" @@ -473,7 +481,7 @@ # in this variable, it means no multilib will be built for current reuse # rule. Thus the reuse purpose specified by current rule is meaningless. if expr "${combinations} " : ".*/${combo}/.*" > /dev/null; then -if expr "${combination_space} " : ".*/${copts}/.*" > /dev/null; then +if echo "/${copts}/" | grep -E "${options_re}" > /dev/null; then combo="/${combo}/" dirout=`./tmpmultilib3 "${combo}" "${todirnames}" "${toosdirnames}" "${enable_multilib}"` copts="/${copts}/" -- Joseph S. Myers jos...@codesourcery.com
[gomp4] Properly handle allocatable scalars in acc update.
This patch fixes a bug I introduced while adding support for allocatable scalars back in April. Before, gfc_trans_oacc_executable_directive would indiscriminately promote all GOMP_MAP_FIRSTPRIVATE_POINTER data mappings to GOMP_MAP_ALWAYS_POINTER, because for allocatable scalars the runtime needs to update the on-device values of both the data being pointed to and the pointer itself. The problem with this is that the allocatable scalar bit isn't propagated across subroutine calls. Clearly this indiscriminate update behavior is wrong because the pointer of dummy arguments may not get mapped onto the accelerator. In this patch, I introduced a new acc_update bit inside the gfc_omp_clauses struct to represent when allocatable scalars need special treatment. As it stands, gfc_trans_omp_clauses_1 already has too many arguments, so I thought this approach would be better. Now gfc_trans_omp_clauses_1 can handle the allocatable scalar update directly. I've applied this patch to gomp-4_0-branch. Cesar 2017-06-08 Cesar Philippidis gcc/fortran/ * gfortran.h gfc_omp_clauses: Add update_allocatable. * trans-openmp.c (gfc_trans_omp_clauses_1): Set update_allocatable bit in clauses. Remove GOMP_MAP_FIRSTPRIVATE_POINTER data map promotions for acc update. (gfc_trans_oacc_executable_directive): Use GOMP_MAP_ALWAYS_POINTER for allocatable scalar data clauses inside acc update directives. libgomp/ * testsuite/libgomp.oacc-fortran/allocatable-array-1.f90: New test. diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 8ca9f8a..0290efe 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -1318,7 +1318,7 @@ typedef struct gfc_omp_clauses gfc_expr_list *tile_list; unsigned async:1, gang:1, worker:1, vector:1, seq:1, independent:1; unsigned wait:1, par_auto:1, gang_static:1, nohost:1, acc_collapse:1, bind:1; - unsigned if_present:1, finalize:1; + unsigned if_present:1, finalize:1, update_allocatable:1; locus loc; char bind_name[GFC_MAX_SYMBOL_LEN+1]; } diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c index d5e2250..ca41903 100644 --- a/gcc/fortran/trans-openmp.c +++ b/gcc/fortran/trans-openmp.c @@ -2133,6 +2133,10 @@ gfc_trans_omp_clauses_1 (stmtblock_t *block, gfc_omp_clauses *clauses, enum gomp_map_kind gmk = GOMP_MAP_FIRSTPRIVATE_POINTER; if (n->u.map_op == OMP_MAP_FORCE_DEVICEPTR) gmk = GOMP_MAP_POINTER; + else if (GFC_DECL_GET_SCALAR_ALLOCATABLE (decl) + && (n->sym->attr.oacc_declare_create) + && clauses->update_allocatable) + gmk = GOMP_MAP_ALWAYS_POINTER; node4 = build_omp_clause (input_location, OMP_CLAUSE_MAP); OMP_CLAUSE_SET_MAP_KIND (node4, gmk); @@ -3323,12 +3327,14 @@ gfc_trans_oacc_executable_directive (gfc_code *code) { stmtblock_t block; tree stmt, oacc_clauses; + gfc_omp_clauses *clauses = code->ext.omp_clauses; enum tree_code construct_code; switch (code->op) { case EXEC_OACC_UPDATE: construct_code = OACC_UPDATE; + clauses->update_allocatable = 1; break; case EXEC_OACC_ENTER_DATA: construct_code = OACC_ENTER_DATA; @@ -3344,20 +3350,7 @@ gfc_trans_oacc_executable_directive (gfc_code *code) } gfc_start_block (&block); - oacc_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses, - code->loc); - - /* Promote GOMP_MAP_FIRSTPRIVATE_POINTER to GOMP_MAP_ALWAYS_POINTER for - variables inside OpenACC update directives. */ - if (code->op == EXEC_OACC_UPDATE) -for (tree c = oacc_clauses; c; c = OMP_CLAUSE_CHAIN (c)) - { -if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP - && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER) - OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_ALWAYS_POINTER); - } - - + oacc_clauses = gfc_trans_omp_clauses (&block, clauses, code->loc); stmt = build1_loc (input_location, construct_code, void_type_node, oacc_clauses); gfc_add_expr_to_block (&block, stmt); diff --git a/libgomp/testsuite/libgomp.oacc-fortran/allocatable-array-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/allocatable-array-1.f90 new file mode 100644 index 000..4fc7b02 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/allocatable-array-1.f90 @@ -0,0 +1,28 @@ +! Ensure that dummy arguments of allocatable arrays don't cause +! "libgomp: [...] is not mapped" errors. + +program main + integer, parameter :: n = 40 + integer, allocatable :: ar(:,:,:) + integer :: i + + allocate (ar(1:n,0:n-1,0:n-1)) + !$acc enter data copyin (ar) + + !$acc update host (ar) + + !$acc update device (ar) + + call update_ar (ar, n) + + !$acc exit data copyout (ar) +end program main + +subroutine update_ar (ar, n) + integer :: n + integer, dimension (1:n,0:n-1,0:n-1) :: ar + + !$acc update host (ar) + + !$acc update device (ar) +end subroutine update_ar
Re: [PATCH 1/3] Come up with selftests for predict.c.
> On 06/06/2017 12:44 PM, David Malcolm wrote: > > On Tue, 2017-06-06 at 10:55 +0200, marxin wrote: > > > > Some minor nits below. > > > >> gcc/ChangeLog: > >> > >> 2017-05-26 Martin Liska > >> > >>* predict.c (struct branch_predictor): New struct. > >>(test_prediction_value_range): New test. > >>(predict_tests_c_tests): New function. > >>* selftest-run-tests.c (selftest::run_tests): Run the function. > >>* selftest.h: Declare new tests. Ok, thanks (also do David for pointing out issues) Honza
Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.
Tamar, This patch has caused a large number of new failures on AIX, including compiler ICEs. Thanks, David
[PATCH] PR arm/80986
Test case should be the same as 80986. -- Lin Zuojian diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 52890d03f9a..0ab500a9a7c 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -315,7 +315,7 @@ static const struct attribute_spec arm_attribute_table[] = { "short_call", 0, 0, false, true, true, NULL, false }, /* Specify the procedure call conventions for a function. */ { "pcs", 1, 1, false, true, true, arm_handle_pcs_attribute, -false }, +true }, /* Interrupt Service Routines have special prologue and epilogue requirements. */ { "isr", 0, 1, false, false, false, arm_handle_isr_attribute, false },
[PATCH] rs6000: Don't add an immediate to r0 (PR80966)
If there is a large stack frame the rs6000 -fstack-limit code would calculate the new stack pointer value using two insns (an addis and an addi), with r0 as temporary. Such instructions do not exist. This patch changes add3 to expand using a different strategy in such cases; to FAIL if there is no way to do it (namely, if the source is r0 and there is no way to get a temporary reg); and it changes rs6000_emit_allocate_stack to assert gen_add3_insn did in fact emit instructions. Tested on powerpc64-linux {-m32,-m64}; committing to trunk. Segher 2017-06-09 Segher Boessenkool PR target/80966 * config/rs6000/rs6000.c (rs6000_emit_allocate_stack): Assert that gen_add3_insn did not fail. * config/rs6000/rs6000.md (add3): If asked to add a constant to r0, construct that number in a temporary reg and add that reg to r0. If asked to put the result in r0 as well, fail. gcc/testsuite/ * gcc.target/powerpc/stack-limit.c: New testcase. --- gcc/config/rs6000/rs6000.c | 8 +--- gcc/config/rs6000/rs6000.md| 11 +++ gcc/testsuite/gcc.target/powerpc/stack-limit.c | 10 ++ 3 files changed, 26 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/stack-limit.c diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index ad52eee..32c84cd 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -26940,9 +26940,11 @@ rs6000_emit_allocate_stack (HOST_WIDE_INT size, rtx copy_reg, int copy_off) && REGNO (stack_limit_rtx) > 1 && REGNO (stack_limit_rtx) <= 31) { - emit_insn (gen_add3_insn (tmp_reg, stack_limit_rtx, GEN_INT (size))); - emit_insn (gen_cond_trap (LTU, stack_reg, tmp_reg, - const0_rtx)); + rtx_insn *insn + = gen_add3_insn (tmp_reg, stack_limit_rtx, GEN_INT (size)); + gcc_assert (insn); + emit_insn (insn); + emit_insn (gen_cond_trap (LTU, stack_reg, tmp_reg, const0_rtx)); } else if (GET_CODE (stack_limit_rtx) == SYMBOL_REF && TARGET_32BIT diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 08d3e1b..bb505c5 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -1629,6 +1629,17 @@ (define_expand "add3" || rtx_equal_p (operands[0], operands[1])) ? operands[0] : gen_reg_rtx (mode)); + /* Adding a constant to r0 is not a valid insn, so use a different +strategy in that case. */ + if (REGNO (operands[1]) == 0 || REGNO (tmp) == 0) + { + if (operands[0] == operands[1]) + FAIL; + rs6000_emit_move (operands[0], operands[2], mode); + emit_insn (gen_add3 (operands[0], operands[1], operands[0])); + DONE; + } + HOST_WIDE_INT val = INTVAL (operands[2]); HOST_WIDE_INT low = ((val & 0x) ^ 0x8000) - 0x8000; HOST_WIDE_INT rest = trunc_int_for_mode (val - low, mode); diff --git a/gcc/testsuite/gcc.target/powerpc/stack-limit.c b/gcc/testsuite/gcc.target/powerpc/stack-limit.c new file mode 100644 index 000..e676c96 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/stack-limit.c @@ -0,0 +1,10 @@ +/* { dg-options "-O0 -fstack-limit-register=r14" } */ + +// PR80966 + +int foo (int i) +{ + char arr[135000]; + + arr[i] = 0; +} -- 1.9.3
Go patch committed: More lvalue marking fixes
This patch to the Go frontend by Than McIntosh fixes an lvalue/rvalue context mixup Set_and_use_temporary_expression's do_get_backend() method, and enhances Mark_lvalue_varexprs to handle conversions and temporary reference expressions, since occasionally the front end emits code such as "deref(conv(tempref)) = ...". Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 249033) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -82961ce59e8bb02598d963d2a05b3acca860d9dd +d4875b19266d5f726e0e32843b903075f5c50b4c The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: gcc/go/gofrontend/expressions.cc === --- gcc/go/gofrontend/expressions.cc(revision 249029) +++ gcc/go/gofrontend/expressions.cc(working copy) @@ -962,7 +962,7 @@ Set_and_use_temporary_expression::do_get Location loc = this->location(); Gogo* gogo = context->gogo(); Bvariable* bvar = this->statement_->get_backend_variable(context); - Bexpression* lvar_ref = gogo->backend()->var_expression(bvar, VE_rvalue, loc); + Bexpression* lvar_ref = gogo->backend()->var_expression(bvar, VE_lvalue, loc); Named_object* fn = context->function(); go_assert(fn != NULL); @@ -970,7 +970,7 @@ Set_and_use_temporary_expression::do_get Bexpression* bexpr = this->expr_->get_backend(context); Bstatement* set = gogo->backend()->assignment_statement(bfn, lvar_ref, bexpr, loc); - Bexpression* var_ref = gogo->backend()->var_expression(bvar, VE_lvalue, loc); + Bexpression* var_ref = gogo->backend()->var_expression(bvar, VE_rvalue, loc); Bexpression* ret = gogo->backend()->compound_expression(set, var_ref, loc); return ret; } Index: gcc/go/gofrontend/statements.cc === --- gcc/go/gofrontend/statements.cc (revision 249028) +++ gcc/go/gofrontend/statements.cc (working copy) @@ -912,6 +912,21 @@ int Mark_lvalue_varexprs::expression(Exp if (ue && ue->op() == OPERATOR_MULT) return TRAVERSE_CONTINUE; + Type_conversion_expression* ce = e->conversion_expression(); + if (ce) +return TRAVERSE_CONTINUE; + + Temporary_reference_expression* tre = + e->temporary_reference_expression(); + if (tre) +{ + tre = new Temporary_reference_expression(tre->statement(), + tre->location()); + *ppexpr = tre; + tre->set_is_lvalue(); + return TRAVERSE_EXIT; +} + return TRAVERSE_EXIT; }
Patch RFC: disable block partitioning with split stack
This recent patch https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00518.html turned on block partitioning even when profiling is not enabled. This can break code compiled with -fsplit-stack, if the cold partition calls a function that is not compiled with -fsplit-stack (such as a C library function). The problem is that when the linker sees a split-stack function call a non-split-stack function, it adjusts the function header to request more stack space. This doesn't work if the call is in the cold partition, as the linker doesn't know how to find the header to adjust. You can see this by trying to build the Go library using the gold linker with this patch. This patch fixes the problem by disabling an implicit -freorder-blocks-and-partition when -fsplit-stack is turned on. If the user explicitly requested -freorder-blocks-and-partition we leave it alone, assuming that they know what they are doing. But if it was only turned it on implicitly because of the optimization level we turn it off. This is done in both the main option handling code and in the Go frontend option handling, because the Go frontend implicitly turns on -fsplit-stack itself. The test case in the patch fails to build when using the gold linker without the patch. Bootstrapped and ran split-stack test cases on x86_64-pc-linux-gnu. I plan to commit this tomorrow unless I hear objections. Ian 2017-06-08 Ian Lance Taylor * opts.c (finish_options): If -fsplit-stack, disable implicit -forder-blocks-and-partition. 2017-06-08 Ian Lance Taylor * go-lang.c (go_langhook_post_options): If -fsplit-stack is turned on, disable implicit -forder-blocks-and-partition. 2017-06-08 Ian Lance Taylor * gcc.dg/tree-prof/split-1.c: New test. Index: opts.c === --- opts.c (revision 249028) +++ opts.c (working copy) @@ -863,6 +863,16 @@ finish_options (struct gcc_options *opts opts->x_flag_reorder_blocks = 1; } + /* If stack splitting is turned on, and the user did not explicitly + request function partitioning, turn off partitioning, as it + confuses the linker when trying to handle partitioned split-stack + code that calls a non-split-stack functions. But if partitioning + was turned on explicitly just hope for the best. */ + if (opts->x_flag_split_stack + && opts->x_flag_reorder_blocks_and_partition + && !opts_set->x_flag_reorder_blocks_and_partition) +opts->x_flag_reorder_blocks_and_partition = 0; + if (opts->x_flag_reorder_blocks_and_partition && !opts_set->x_flag_reorder_functions) opts->x_flag_reorder_functions = 1; Index: go/go-lang.c === --- go/go-lang.c(revision 249028) +++ go/go-lang.c(working copy) @@ -304,6 +304,15 @@ go_langhook_post_options (const char **p && targetm_common.supports_split_stack (false, &global_options)) global_options.x_flag_split_stack = 1; + /* If stack splitting is turned on, and the user did not explicitly + request function partitioning, turn off partitioning, as it + confuses the linker when trying to handle partitioned split-stack + code that calls a non-split-stack function. */ + if (global_options.x_flag_split_stack + && global_options.x_flag_reorder_blocks_and_partition + && !global_options_set.x_flag_reorder_blocks_and_partition) +global_options.x_flag_reorder_blocks_and_partition = 0; + /* Returning false means that the backend should be used. */ return false; } Index: testsuite/gcc.dg/tree-prof/split-1.c === --- testsuite/gcc.dg/tree-prof/split-1.c(revision 0) +++ testsuite/gcc.dg/tree-prof/split-1.c(working copy) @@ -0,0 +1,41 @@ +/* Test case that we don't get a link-time error when using + -fsplit-stack with -freorder-blocks-and-partition. */ +/* { dg-require-effective-target freorder } */ +/* { dg-options "-O2 -fsplit-stack" } */ + +extern unsigned int sleep (unsigned int); + +#define SIZE 1 + +const char *sarr[SIZE]; +const char *buf_hot; +const char *buf_cold; + +__attribute__((noinline)) +void +foo (int path) +{ + int i; + if (path) +{ + for (i = 0; i < SIZE; i++) + sarr[i] = buf_hot; +} + else +{ + for (i = 0; i < SIZE; i++) + sarr[i] = buf_cold; + sleep (0); +} +} + +int +main (int argc, char *argv[]) +{ + int i; + buf_hot = "hello"; + buf_cold = "world"; + for (i = 0; i < 100; i++) +foo (argc); + return 0; +}
Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.
On Thu, 8 Jun 2017, David Edelsohn wrote: > Tamar, > > This patch has caused a large number of new failures on AIX, including > compiler ICEs. It also caused ICEs building glibc for all powerpc configurations (i.e., configurations using IBM long double). https://sourceware.org/ml/libc-testresults/2017-q2/msg00318.html -- Joseph S. Myers jos...@codesourcery.com
Re: Statically propagate basic blocks which are likely executed 0 times
On 8 June 2017 14:52:49 CEST, Jan Hubicka wrote: >Hi, >this patch adds static code to detect basic block with 0 execution >count. >Those are basic block either reached only by EH or those which leads to >call of >function decorated with cold attribute. > >Function decorated by noreturn is not sufficient, because exit(0) is a >call that >is executed in most of program executions. But aren't paths to exit except in main cold resp unlikely, at least exit (!0)? > >Note that internally we have cold and unlikely functions where the >first is >optimized for size but the second is known to be unlikely executed by >the >program and is offloaded to special unlikely section. >Perhaps we want to expose this to user and also distinguish between >cold and >unlikely function attributes. I guess this however can be done >incrementally. > >As a followup I will decoreate trap/abort and friends with cold >attributes. > >Bootstrapped/regtested x86_64-linux, will commit it shortly. > >+/* Return true if STMT is known to be unlikely executed. */ >+ >+static bool >+unlikely_executed_stmt_p (gimple *stmt) >+{ >+ if (!is_gimple_call (stmt)) >+return false; >+ /* NORETURN attribute enough is not strong enough: exit() may be s/attribute enough/attribute alone/ >quite >+ likely executed once during program run. */ >+ if (gimple_call_fntype (stmt) >+ && lookup_attribute ("cold", >+ TYPE_ATTRIBUTES (gimple_call_fntype (stmt))) >+ && !lookup_attribute ("cold", DECL_ATTRIBUTES >(current_function_decl))) >+return true; >+ tree decl = gimple_call_fndecl (stmt); >+ if (!decl) >+return false; Can decl ever be NULL here? >+ if (lookup_attribute ("cold", DECL_ATTRIBUTES (decl)) >+ && !lookup_attribute ("cold", DECL_ATTRIBUTES >(current_function_decl))) >+return true; >+ >+ cgraph_node *n = cgraph_node::get (decl); >+ if (!n) >+return false; >+ enum availability avail; Didn't style want us to omit enum here since its C++ now? >+ n = n->ultimate_alias_target (&avail); >+ if (avail < AVAIL_AVAILABLE) >+return NULL; false thanks, >+ if (!n->analyzed >+ || n->decl == current_function_decl) >+return false; >+ return n->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED; >+}