[PATCH] Fix PR87917
The following side-steps a possible issue with the evolution_function_is_affine_multivariate_p predicate by guarding the call to analyze_subscript_affine_affine in analyze_miv_subscript in the same way as the call from analyze_siv_subscript. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2018-11-15 Richard Biener PR middle-end/87917 * tree-data-ref.c (analyze_miv_subscript): Guard calls to analyze_subscript_affine_affine properly. * gcc.dg/tree-ssa/pr87917.c: New testcase. Index: gcc/tree-data-ref.c === --- gcc/tree-data-ref.c (revision 266145) +++ gcc/tree-data-ref.c (working copy) @@ -3994,9 +3993,9 @@ analyze_miv_subscript (tree chrec_a, dependence_stats.num_miv_independent++; } - else if (evolution_function_is_affine_multivariate_p (chrec_a, loop_nest->num) + else if (evolution_function_is_affine_in_loop (chrec_a, loop_nest->num) && !chrec_contains_symbols (chrec_a) - && evolution_function_is_affine_multivariate_p (chrec_b, loop_nest->num) + && evolution_function_is_affine_in_loop (chrec_b, loop_nest->num) && !chrec_contains_symbols (chrec_b)) { /* testsuite/.../ssa-chrec-35.c Index: gcc/testsuite/gcc.dg/tree-ssa/pr87917.c === --- gcc/testsuite/gcc.dg/tree-ssa/pr87917.c (nonexistent) +++ gcc/testsuite/gcc.dg/tree-ssa/pr87917.c (working copy) @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-loop-distribute-patterns -ftrapv -fno-tree-fre" } */ + +void foo(int x[]) +{ + int i, j; + + for (i = 0; i < 2; i++) + for (j = 0; j < 2; j++) + { + x[i] = x[i*j]; + x[i] = x[i*j]; + } +}
Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
On Wed, 14 Nov 2018, Andrew Pinski wrote: > On Fri, May 13, 2016 at 3:51 AM Richard Biener wrote: > > > > > > The following patch adds BIT_FIELD_INSERT, an operation to > > facilitate doing bitfield inserts on registers (as opposed > > to currently where we'd have a BIT_FIELD_REF store). > > > > Originally this was developed as part of bitfield lowering > > where bitfield stores were lowered into read-modify-write > > cycles and the modify part, instead of doing shifting and masking, > > be kept in a more high-level form to ease combining them. > > > > A second use case (the above is still valid) is vector element > > inserts which we currently can only do via memory or > > by extracting all components and re-building the vector using > > a CONSTRUCTOR. For this second use case I added code > > re-writing the BIT_FIELD_REF stores the C family FEs produce > > into BIT_FIELD_INSERT when update-address-taken can otherwise > > re-write a decl into SSA form (the testcase shows we miss > > a similar opportunity with the MEM_REF form of a vector insert, > > I plan to fix that for the final submission). > > > > One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF > > is that the size of the insertion is given implicitely via the > > type size/precision of the value to insert. That avoids > > introducing ways to have quaternary ops in folding and GIMPLE stmts. > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > > > > Richard. > > > > 2011-06-16 Richard Guenther > > > > PR tree-optimization/29756 > > * tree.def (BIT_FIELD_INSERT): New tcc_expression tree code. > > * expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT. > > * fold-const.c (operand_equal_p): Likewise. > > (fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT. > > * gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT. > > * tree-inline.c (estimate_operator_cost): Likewise. > > * tree-pretty-print.c (dump_generic_node): Likewise. > > * tree-ssa-operands.c (get_expr_operands): Likewise. > > * cfgexpand.c (expand_debug_expr): Likewise. > > * gimple-pretty-print.c (dump_ternary_rhs): Likewise. > > * gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT. > > * tree-cfg.c (verify_gimple_assign_ternary): Verify > > BIT_FIELD_INSERT. > > > > * tree-ssa.c (non_rewritable_lvalue_p): We can rewrite > > vector inserts using BIT_FIELD_REF on the lhs. > > (execute_update_addresses_taken): Do it. > > > > * gcc.dg/tree-ssa/vector-6.c: New testcase. > > > > Index: trunk/gcc/expr.c > > === > > *** trunk.orig/gcc/expr.c 2016-05-12 13:40:30.704262951 +0200 > > --- trunk/gcc/expr.c2016-05-12 15:40:32.481225744 +0200 > > *** expand_expr_real_2 (sepops ops, rtx targ > > *** 9358,9363 > > --- 9358,9380 > > target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, > > target); > > return target; > > > > + case BIT_FIELD_INSERT: > > + { > > + unsigned bitpos = tree_to_uhwi (treeop2); > > + unsigned bitsize; > > + if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1))) > > + bitsize = TYPE_PRECISION (TREE_TYPE (treeop1)); > > + else > > + bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1))); > > + rtx op0 = expand_normal (treeop0); > > + rtx op1 = expand_normal (treeop1); > > + rtx dst = gen_reg_rtx (mode); > > + emit_move_insn (dst, op0); > > + store_bit_field (dst, bitsize, bitpos, 0, 0, > > +TYPE_MODE (TREE_TYPE (treeop1)), op1, false); > > + return dst; > > + } > > + > > default: > > gcc_unreachable (); > > } > > Index: trunk/gcc/fold-const.c > > === > > *** trunk.orig/gcc/fold-const.c 2016-05-12 13:40:30.704262951 +0200 > > --- trunk/gcc/fold-const.c 2016-05-13 09:41:13.509812127 +0200 > > *** operand_equal_p (const_tree arg0, const_ > > *** 3163,3168 > > --- 3163,3169 > > > > case VEC_COND_EXPR: > > case DOT_PROD_EXPR: > > + case BIT_FIELD_INSERT: > > return OP_SAME (0) && OP_SAME (1) && OP_SAME (2); > > > > default: > > *** fold_ternary_loc (location_t loc, enum t > > *** 11870,11875 > > --- 11871,11916 > > } > > return NULL_TREE; > > > > + case BIT_FIELD_INSERT: > > + /* Perform (partial) constant folding of BIT_FIELD_INSERT. */ > > + if (TREE_CODE (arg0) == INTEGER_CST > > + && TREE_CODE (arg1) == INTEGER_CST) > > + { > > + unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2); > > + unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1)); > > + wide_int tem = wi::bit_and (arg0, > > + wi::shifted_mask (bit
Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
On Thu, 15 Nov 2018, Richard Biener wrote: > On Wed, 14 Nov 2018, Andrew Pinski wrote: > > > On Fri, May 13, 2016 at 3:51 AM Richard Biener wrote: > > > > > > > > > The following patch adds BIT_FIELD_INSERT, an operation to > > > facilitate doing bitfield inserts on registers (as opposed > > > to currently where we'd have a BIT_FIELD_REF store). > > > > > > Originally this was developed as part of bitfield lowering > > > where bitfield stores were lowered into read-modify-write > > > cycles and the modify part, instead of doing shifting and masking, > > > be kept in a more high-level form to ease combining them. > > > > > > A second use case (the above is still valid) is vector element > > > inserts which we currently can only do via memory or > > > by extracting all components and re-building the vector using > > > a CONSTRUCTOR. For this second use case I added code > > > re-writing the BIT_FIELD_REF stores the C family FEs produce > > > into BIT_FIELD_INSERT when update-address-taken can otherwise > > > re-write a decl into SSA form (the testcase shows we miss > > > a similar opportunity with the MEM_REF form of a vector insert, > > > I plan to fix that for the final submission). > > > > > > One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF > > > is that the size of the insertion is given implicitely via the > > > type size/precision of the value to insert. That avoids > > > introducing ways to have quaternary ops in folding and GIMPLE stmts. > > > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > > > > > > Richard. > > > > > > 2011-06-16 Richard Guenther > > > > > > PR tree-optimization/29756 > > > * tree.def (BIT_FIELD_INSERT): New tcc_expression tree code. > > > * expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT. > > > * fold-const.c (operand_equal_p): Likewise. > > > (fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT. > > > * gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT. > > > * tree-inline.c (estimate_operator_cost): Likewise. > > > * tree-pretty-print.c (dump_generic_node): Likewise. > > > * tree-ssa-operands.c (get_expr_operands): Likewise. > > > * cfgexpand.c (expand_debug_expr): Likewise. > > > * gimple-pretty-print.c (dump_ternary_rhs): Likewise. > > > * gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT. > > > * tree-cfg.c (verify_gimple_assign_ternary): Verify > > > BIT_FIELD_INSERT. > > > > > > * tree-ssa.c (non_rewritable_lvalue_p): We can rewrite > > > vector inserts using BIT_FIELD_REF on the lhs. > > > (execute_update_addresses_taken): Do it. > > > > > > * gcc.dg/tree-ssa/vector-6.c: New testcase. > > > > > > Index: trunk/gcc/expr.c > > > === > > > *** trunk.orig/gcc/expr.c 2016-05-12 13:40:30.704262951 +0200 > > > --- trunk/gcc/expr.c2016-05-12 15:40:32.481225744 +0200 > > > *** expand_expr_real_2 (sepops ops, rtx targ > > > *** 9358,9363 > > > --- 9358,9380 > > > target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, > > > target); > > > return target; > > > > > > + case BIT_FIELD_INSERT: > > > + { > > > + unsigned bitpos = tree_to_uhwi (treeop2); > > > + unsigned bitsize; > > > + if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1))) > > > + bitsize = TYPE_PRECISION (TREE_TYPE (treeop1)); > > > + else > > > + bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1))); > > > + rtx op0 = expand_normal (treeop0); > > > + rtx op1 = expand_normal (treeop1); > > > + rtx dst = gen_reg_rtx (mode); > > > + emit_move_insn (dst, op0); > > > + store_bit_field (dst, bitsize, bitpos, 0, 0, > > > +TYPE_MODE (TREE_TYPE (treeop1)), op1, false); > > > + return dst; > > > + } > > > + > > > default: > > > gcc_unreachable (); > > > } > > > Index: trunk/gcc/fold-const.c > > > === > > > *** trunk.orig/gcc/fold-const.c 2016-05-12 13:40:30.704262951 +0200 > > > --- trunk/gcc/fold-const.c 2016-05-13 09:41:13.509812127 +0200 > > > *** operand_equal_p (const_tree arg0, const_ > > > *** 3163,3168 > > > --- 3163,3169 > > > > > > case VEC_COND_EXPR: > > > case DOT_PROD_EXPR: > > > + case BIT_FIELD_INSERT: > > > return OP_SAME (0) && OP_SAME (1) && OP_SAME (2); > > > > > > default: > > > *** fold_ternary_loc (location_t loc, enum t > > > *** 11870,11875 > > > --- 11871,11916 > > > } > > > return NULL_TREE; > > > > > > + case BIT_FIELD_INSERT: > > > + /* Perform (partial) constant folding of BIT_FIELD_INSERT. */ > > > + if (TREE_CODE (arg0) == INTEGER_CST > > > + && TREE_CODE (arg1) == INTEGE
Re: [PATCH] Fix ICE in fixup_abnormal_edges (PR rtl-optimization/88018)
On Thu, 15 Nov 2018, Jakub Jelinek wrote: > Hi! > > On the following testcase, we have a call (not marked noreturn), which can > throw, followed immediately by __builtin_unreachable (), so it effectively > is noreturn in this particular call site (i.e. if it returns, it is UB). > In RTL this is represented by the corresponding bb having just EH edge > successor, no fallthrough edge. If one of the callers of > fixup_abnormal_edges (the stack pass in this case, or LRA or reload) adds > some instructions after the call and calls fixup_abnormal_edges, it ICEs, > as it tries to delete those insns and insert them on the NULL edge. > Somebody has hit that issue already but solved it for USE insns only, this > patch does that for all the cases where we don't have the fallthru edge. > If there weren't any instructions after the call before one of these 3 > passes, I'd say it must be UB to return from such a call and thus it should > be unimportant if those insns are executed or not. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Richard. > 2018-11-15 Jakub Jelinek > > PR rtl-optimization/88018 > * cfgrtl.c (fixup_abnormal_edges): Guard moving insns to fallthru edge > on the presence of fallthru edge, rather than if it is a USE or not. > > * g++.dg/tsan/pr88018.C: New test. > > --- gcc/cfgrtl.c.jj 2018-11-14 00:55:51.695333876 +0100 > +++ gcc/cfgrtl.c 2018-11-14 17:24:46.414915101 +0100 > @@ -3332,8 +3332,15 @@ fixup_abnormal_edges (void) >If it's placed after a trapping call (i.e. that >call is the last insn anyway), we have no fallthru >edge. Simply delete this use and don't try to insert > - on the non-existent edge. */ > - if (GET_CODE (PATTERN (insn)) != USE) > + on the non-existent edge. > + Similarly, sometimes a call that can throw is > + followed in the source with __builtin_unreachable (), > + meaning that there is UB if the call returns rather > + than throws. If there weren't any instructions > + following such calls before, supposedly even the ones > + we've deleted aren't significant and can be > + removed. */ > + if (e) > { > /* We're not deleting it, we're moving it. */ > insn->set_undeleted (); > --- gcc/testsuite/g++.dg/tsan/pr88018.C.jj2018-11-14 17:26:46.224944969 > +0100 > +++ gcc/testsuite/g++.dg/tsan/pr88018.C 2018-11-14 17:28:40.057073142 > +0100 > @@ -0,0 +1,6 @@ > +// PR rtl-optimization/88018 > +// { dg-do compile } > +// { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } > +// { dg-options "-fsanitize=thread -fno-ipa-pure-const -O1 > -fno-inline-functions-called-once -w" } > + > +#include "../pr69667.C" > > Jakub > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [PATCH][RFC] Come up with -flive-patching master option.
On 11/14/18 6:54 PM, Qing Zhao wrote: > Hi, > > >> On Nov 14, 2018, at 9:03 AM, Martin Liška wrote: >> >>> Yes, you are right. I added this into my patch. >>> >>> I am attaching the new patch here. >> >> Hello. >> >> Please use >> git diff HEAD~ > /tmp/patch && ~/Programming/gcc/contrib/check_GNU_style.py >> /tmp/patch >> >> in order to address many formatting issues of the patch (skip the ones >> reported in common.opt). > > will do and fix the style issues. > >> >>> >>> >>> +flive-patching >>> +Common RejectNegative Alias(flive-patching=,inline-clone) Optimization >>> + >>> +flive-patching= >>> +Common Report Joined RejectNegative Enum(live_patching_level) >>> Var(flag_live_patching) Init(LIVE_NONE) Optimization >>> +-flive-patching=[inline-only-static|inline-clone] Control ipa >>> optimizations to provide a >> >> Please use 'IPA' instead of 'ipa', similarly in documentation. > Okay. >> >>> --- a/gcc/doc/invoke.texi >>> +++ b/gcc/doc/invoke.texi >>> @@ -411,10 +411,11 @@ Objective-C and Objective-C++ Dialects}. >>> -fgcse-sm -fhoist-adjacent-loads -fif-conversion @gol >>> -fif-conversion2 -findirect-inlining @gol >>> -finline-functions -finline-functions-called-once -finline-limit=@var{n} >>> @gol >>> --finline-small-functions -fipa-cp -fipa-cp-clone @gol >>> +-finline-small-functions -fipa-cp -fipa-cp-clone @gol >> >> This changes is probably not intended. > No. will delete it. > >> >>> >>> diff --git a/gcc/flag-types.h b/gcc/flag-types.h >>> index 500f663..72e0f0f 100644 >>> --- a/gcc/flag-types.h >>> +++ b/gcc/flag-types.h >>> @@ -123,6 +123,14 @@ enum stack_reuse_level >>> SR_ALL >>> }; >>> >>> +/* The live patching level. */ >>> +enum live_patching_level >>> +{ >>> + LIVE_NONE = 0, >>> + LIVE_INLINE_ONLY_STATIC, >>> + LIVE_INLINE_CLONE >> >> Maybe better LIVE_PATCHING_INLINE_CLONE, without the 'PATCHING' the enum >> values are bit unclear. > > Okay. >> >>> >>> + /* visibility change should be excluded by !flag_whole_program >>> +&& !in_lto_p && !flag_ipa_cp_clone && !flag_ipa_sra >> >> You added sorry about LTO, maybe then !in_lto_p would be always true? > > Yes, since live-patching does not support LTO currently, !in_lto_p is always > TRUE. that’s the reason no need for a new flag to disable visibility change. Hi. Ok. >> >>> +&& !flag_partial_inlining. */ >>> + if (!opts_set->x_flag_ipa_pta) >>> +opts->x_flag_ipa_pta = 0; >>> + if (!opts_set->x_flag_ipa_reference) >>> +opts->x_flag_ipa_reference = 0; >>> + if (!opts_set->x_flag_ipa_ra) >>> +opts->x_flag_ipa_ra = 0; >>> + if (!opts_set->x_flag_ipa_icf) >>> +opts->x_flag_ipa_icf = 0; >>> + if (!opts_set->x_flag_ipa_icf_functions) >>> +opts->x_flag_ipa_icf_functions = 0; >>> + if (!opts_set->x_flag_ipa_icf_variables) >>> +opts->x_flag_ipa_icf_variables = 0; >>> + if (!opts_set->x_flag_ipa_bit_cp) >>> +opts->x_flag_ipa_bit_cp = 0; >>> + if (!opts_set->x_flag_ipa_vrp) >>> +opts->x_flag_ipa_vrp = 0; >>> + if (!opts_set->x_flag_ipa_pure_const) >> >> Can you please explain why you included: >> if (!opts_set->x_flag_ipa_bit_cp) >>opts->x_flag_ipa_bit_cp = 0; >> if (!opts_set->x_flag_ipa_vrp) >>opts->x_flag_ipa_vrp = 0; > > interprocedural bitwise constant propagation and interprocedural propagation > of value ranges does not involve creating clones, > and the bitwise constant and value ranges info extracted during ipa-cp phase > are used later by other optimizations. their effect on > impact functions are not clear at this moment. that’s the reason I think we > need to disable these two. > > Martin Jambor raised this issue during our previous discussion on 10/03/2018: > “ > I was thinking a bit more about this and recalled that not all stuff > that IPA-CP nowadays does involves creating clones, so we have to add > also: > - -fno-ipa-bit-cp, and > - -fno-ipa-vrp. > > These two just record info in the parameters of *callees* of functions > from which it extracted info, without any cloning involved. Both were > introduced in GCC 7. > > Thanks, > > Martin > “ > and I think he is right. Great, thanks for clarification! I forgot about that. And please can you mention in documentation which options are disabled with -flive-patching=*? We usually do it, e.g. take a look at '-Os' option: ``` ‘-Os’ disables the following optimization flags: -falign-functions -falign-jumps -falign-loops -falign-labels -freorder-blocks -freorder-blocks-algorithm=stc -freorder-blocks-and-partition -fprefetch-loop-arrays ``` > > > >> ? >> >>> +opts->x_flag_ipa_pure_const = 0; >>> + /* unreachable code removal. */ >>> + /* discovery of functions/variables with no address taken. */ >> >> ^^^ these 2 comments looks misaligned. > > will fix them. >> >>> >>> +++ b/gcc/testsuite/gcc.dg/live-patching-1.c >>> @@ -0,0 +1,22 @@ >>> +/* { dg-do compile } */ >>> +/* { d
Re: Bug 52869 - [DR 1207] "this" not being allowed in noexcept clauses
Thank you Marek for the inputs . >>In the future, if using diff, please also use the -p option. We are using svn diif and other comments are addressed . please let us know your take on the revised attached patch . Thank you ~Umesh On Thu, Nov 15, 2018 at 12:23 AM Marek Polacek wrote: > > On Wed, Nov 14, 2018 at 09:55:39PM +0530, Umesh Kalappa wrote: > > My bad Marek and thank you for pointing that out. > > > > Please find the attached correct one (pr52869.patch) . > > Index: gcc/cp/ChangeLog > === > --- gcc/cp/ChangeLog(revision 266026) > +++ gcc/cp/ChangeLog(working copy) > @@ -1,3 +1,9 @@ > +2018-11-14 Kamlesh Kumar > + > + PR c++/52869 > + *parser.c () : restore the old current_class_{ptr,ref} by > + inject_this_parameter(). > + > > So the correct CL entry would look like > > 2018-11-14 Kamlesh Kumar > > DR 1207 > PR c++/52869 > * parser.c (cp_parser_noexcept_specification_opt): Call > inject_this_parameter. > > or so. > > Index: gcc/cp/parser.c > === > --- gcc/cp/parser.c (revision 266026) > +++ gcc/cp/parser.c (working copy) > @@ -24615,11 +24615,24 @@ > { >tree expr; >cp_lexer_consume_token (parser->lexer); > - > + > > You're adding whitespaces where they shouldn't be. Let's avoid changes like > these. > >if (cp_lexer_peek_token (parser->lexer)->type == CPP_OPEN_PAREN) > { > matching_parens parens; > parens.consume_open (parser); > + > + if (current_class_type) > + inject_this_parameter (current_class_type, TYPE_UNQUALIFIED); > + else > +{ > + /*clear the current_class_ptr for non class type , like > + int foo() noexcept(*this) > + { > + return 1; > + } > + */ > +current_class_ptr = NULL_TREE; > +} > > I don't believe that's what Jason meant by restoring; I think you want > > tree save_ccp = current_class_ptr; > tree save_ccr = current_class_ref; > > inject_this_parameter (current_class_type, TYPE_UNQUALIFIED); > > [...] > > current_class_ptr = save_ccp; > current_class_ref = save_ccr; > > In the future, if using diff, please also use the -p option. > > Index: gcc/testsuite/ChangeLog > === > --- gcc/testsuite/ChangeLog (revision 266026) > +++ gcc/testsuite/ChangeLog (working copy) > @@ -1,3 +1,8 @@ > +2018-11-14 Kamlesh Kumar > + > + PR g++.dg/52869 > + * g++.dg/pr52869.C: New. > > Should be "PR c++/52869". > > Index: gcc/testsuite/g++.dg/pr52869.C > === > --- gcc/testsuite/g++.dg/pr52869.C (nonexistent) > +++ gcc/testsuite/g++.dg/pr52869.C (working copy) > > Maybe move the test to testsuite/g++.dg/DRs? > > @@ -0,0 +1,26 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O0 -g" } */ > > Why these options? I don't think you need -g. > > +struct S { > +void f() { } > +void g() noexcept(noexcept(f())) { } > +void h() noexcept(noexcept(this->f())) { } > +}; > + > +struct Nyan { > + constexpr Nyan &operator++() noexcept { return *this; } > + constexpr void omg() noexcept(noexcept(++*this)) {} > +}; > > I was hoping you'd add also a test with 'this' in noexcept in a class > template. > > This test doesn't compile on all dialects: > FAIL: g++.dg/pr52869.C -std=gnu++98 (test for excess errors) > FAIL: g++.dg/pr52869.C -std=gnu++11 (test for excess errors) > > You can run just the new test in all dialects using: > GXX_TESTSUITE_STDS=98,11,14,17,2a make check-c++ RUNTESTFLAGS=dg.exp=pr52869.C > > The noexcept specifier is only in C++11 and newer I think. > > +template< typename T > > +T sine( T const& a, T const& b ) noexcept > +{ > +static_assert( noexcept( T(a / sqrt(a * a + b * b)) ), "throwing expr" > ); > +return a / sqrt(a * a + b * b); > +} > + > +int foo() noexcept > +{ > + return 1; > +} > + > > I don't understand what this part of the test is testing. It compiles even > without the patch. Let's drop it. > > Marek pr52869.patch Description: Binary data
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87626
Hi All, The attached patch (pr85667.patch) fixes the subjected issue . we tested on x86_64(linux and windows both) and no regress found . ok to commit ? Thank you ~Umesh pr85667.patch Description: Binary data
Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf
On Nov 14 2018, Jozef Lawrynowicz wrote: > The timeout as set in the dejagnu configuration for msp430 > ([dejagnu.git]/baseboards/msp430-sim.exp) is 30, which is rarely > hit. I don't think it makes sense for a board file to set a smaller timeout than the default. Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different."
Re: [PATCH] avoid -Wnonnull for printf format in dead code (PR 87041)
On 02/11/18 09:54, Christophe Lyon wrote: > Hi, > > I've noticed failure on targets using newlib (aarch64-elf and arm-eabi): > FAIL: gcc.c-torture/execute/printf-2.c > FAIL: gcc.c-torture/execute/user-printf.c > > my gcc.log contains: > gcc.c-torture/execute/user-printf.c -O0 execution test (reason: TCL > LOOKUP CHANNEL exp5) > which is not very helpful > @Christophe Can I ask if your DejaGNU board setup has "needs_status_wrapper 1" for both of these boards? I believe this problem is caused by an interaction with the DejaGNU status wrapper. When the status wrapper is needed, DejaGNU looks at stdout for a line saying "*** EXIT code " indicating what the status code was. When it doesn't find that line it assumes an exit code of 2. Without the status wrapper DejaGNU takes the return code from the program executed. Because these tests use "freopen ()" on stdout, the status wrapper fails to print to the IO channel DejaGNU is listening on, hence DejaGNU fails to find it's line, uses an exit code of 2, and fails the test. @Martin Would these tests still be valid having run freopen on stderr and using fprintf instead of printf? That makes the testcases pass for me. If not we could add an { dg-require-effective-target unwrapped } directive in the testcases to stop the failure complaints.
Re: [PATCH 21/25] GCN Back-end (part 2/2).
On 14/11/2018 22:30, Jeff Law wrote: There's a particular case that has historically been problematical. If you have this kind of sequence in the epilogue restore register using FP move fp->sp (deallocates frame) return Under certain circumstances the scheduler can swap the register restore and move from fp into sp creating something like this: move fp->sp (deallocates frame) restore register using FP (reads from deallocated frame) return That would normally be OK, except if you take an interrupt between the first two instructions. If interrupt handling is done without switching stacks, then the interrupt handler may write into the just de-allocated frame destroying the values that were saved in the prologue. OK, so the barrier needs to be right before the stack pointer moves. I can do that. :-) Presumably the same is true for prologues, except that the barrier needs to be after the stack adjustment. You may not need to worry about that today on the GCN port, but you really want to fix it now so that it's never a problem. You *really* don't want to have to debug this kind of problem in the wild. Been there, done that, more than once :( I'm not exactly sure how interrupts work on this platform -- we've had no use for them yet -- but without a debugger, and with up to 1024 threads running simultaneously, you can be sure I don't want to debug it! I would hazard a guess that combine saw the one without the use as "simpler" and preferred it. I think you've made a bit of a fundamental problem with the way the EXEC register is being handled. Hopefully you can get by with some magic UNSPEC wrappers without having to do too much surgery. Exactly so. An initial experiment with combine re-enabled has not shown any errors, so it's possible the problem has gone away, but I've not been over the full testsuite yet (and you wouldn't expect actual failures anyway). In future, I'd like to have the scheduler insert real instructions into these slots, but that's very much on the to-do list. If you you can model this as a latency between the two points where you need to insert the nops, then the scheduler will fill in what it can. But it doesn't generally handle non-interlocked processors. So you'll still want your little pass to fix things up when the scheduler couldn't find useful work to schedule into those bubbles. Absolutely, the scheduler is about optimization and this md_reorg pass is about correctness. I have no idea whether the architecture has those issues or not. The guideline I would give to determine if you're vulnerable... Do you have speculation, including the ability to speculate past a memory operation, branch prediction, memory caches and high resolution timer (ie, like a cycle timer). If you've got those, then the processor is likely vulnerable to a spectre V1 style attack. Those are the basic building blocks. We have cycle timers and caches, but I'll have to ask AMD about the other details. Andrew
Re: [PATCH][rs6000] inline expansion of memcmp using vsx
On Wed, Nov 14, 2018 at 5:43 PM Aaron Sawdey wrote: > > This patch generalizes some the functions added earlier to do vsx expansion > of strncmp > so that the can also generate the code needed for memcmp. I reorganized > expand_block_compare() a little to be able to make use of this there. The vsx > code is more > compact so I've changed the default block compare inline limit to 63 bytes. > The vsx > code is only used if there is at least 16 bytes to compare as this means we > don't have to > do complex code to compare less than one chunk. If vsx is not available the > limit is cut > in half. The performance is good, vsx memcmp is considerably faster than the > gpr inline code > if the strings are equal and is comparable if the strings have a 10% chance > of being > equal (spread across the string). How is performance affected if there are close earlier char-size stores to one of the string/memory? Can power still do store forwarding in this case? > Currently regtesting, ok for trunk if tests pass? > > Thanks! >Aaron > > 2018-11-14 Aaron Sawdey > > * config/rs6000/rs6000-string.c (emit_vsx_zero_reg): New function. > (expand_cmp_vec_sequence): Rename and modify > expand_strncmp_vec_sequence. > (emit_final_compare_vec): Rename and modify > emit_final_str_compare_vec. > (generate_6432_conversion): New function. > (expand_block_compare): Add support for vsx. > (expand_block_compare_gpr): New function. > * config/rs6000/rs6000.opt (rs6000_block_compare_inline_limit): > Increase > default limit to 63 because of more compact vsx code. > > > > > Index: gcc/config/rs6000/rs6000-string.c > === > --- gcc/config/rs6000/rs6000-string.c (revision 266034) > +++ gcc/config/rs6000/rs6000-string.c (working copy) > @@ -615,6 +615,283 @@ > } > } > > +static rtx > +emit_vsx_zero_reg() > +{ > + unsigned int i; > + rtx zr[16]; > + for (i = 0; i < 16; i++) > +zr[i] = GEN_INT (0); > + rtvec zv = gen_rtvec_v (16, zr); > + rtx zero_reg = gen_reg_rtx (V16QImode); > + rs6000_expand_vector_init (zero_reg, gen_rtx_PARALLEL (V16QImode, zv)); > + return zero_reg; > +} > + > +/* Generate the sequence of compares for strcmp/strncmp using vec/vsx > + instructions. > + > + BYTES_TO_COMPARE is the number of bytes to be compared. > + ORIG_SRC1 is the unmodified rtx for the first string. > + ORIG_SRC2 is the unmodified rtx for the second string. > + S1ADDR is the register to use for the base address of the first string. > + S2ADDR is the register to use for the base address of the second string. > + OFF_REG is the register to use for the string offset for loads. > + S1DATA is the register for loading the first string. > + S2DATA is the register for loading the second string. > + VEC_RESULT is the rtx for the vector result indicating the byte > difference. > + EQUALITY_COMPARE_REST is a flag to indicate we need to make a cleanup call > + to strcmp/strncmp if we have equality at the end of the inline comparison. > + P_CLEANUP_LABEL is a pointer to rtx for a label we generate if we need > code > + to clean up and generate the final comparison result. > + FINAL_MOVE_LABEL is rtx for a label we can branch to when we can just > + set the final result. > + CHECKZERO indicates whether the sequence should check for zero bytes > + for use doing strncmp, or not (for use doing memcmp). */ > +static void > +expand_cmp_vec_sequence (unsigned HOST_WIDE_INT bytes_to_compare, > +rtx orig_src1, rtx orig_src2, > +rtx s1addr, rtx s2addr, rtx off_reg, > +rtx s1data, rtx s2data, rtx vec_result, > +bool equality_compare_rest, rtx *p_cleanup_label, > +rtx final_move_label, bool checkzero) > +{ > + machine_mode load_mode; > + unsigned int load_mode_size; > + unsigned HOST_WIDE_INT cmp_bytes = 0; > + unsigned HOST_WIDE_INT offset = 0; > + rtx zero_reg = NULL; > + > + gcc_assert (p_cleanup_label != NULL); > + rtx cleanup_label = *p_cleanup_label; > + > + emit_move_insn (s1addr, force_reg (Pmode, XEXP (orig_src1, 0))); > + emit_move_insn (s2addr, force_reg (Pmode, XEXP (orig_src2, 0))); > + > + if (checkzero && !TARGET_P9_VECTOR) > +zero_reg = emit_vsx_zero_reg(); > + > + while (bytes_to_compare > 0) > +{ > + /* VEC/VSX compare sequence for P8: > +check each 16B with: > +lxvd2x 32,28,8 > +lxvd2x 33,29,8 > +vcmpequb 2,0,1 # compare strings > +vcmpequb 4,0,3 # compare w/ 0 > +xxlorc 37,36,34 # first FF byte is either mismatch or end of > string > +vcmpequb. 7,5,3 # reg 7 contains 0 > +bnl 6,.Lmismatch > + > +For the P8 LE case, we use lxvd2x and compare full 16 bytes > +but then use use vgbbd and a shift to get two bytes with the > +
Re: Don't use %z printf format length specified
On Wed, Nov 14, 2018 at 6:24 PM Michael Matz wrote: > > Hi, > > On Wed, 14 Nov 2018, Alexander Monakov wrote: > > > On Wed, 14 Nov 2018, Michael Matz wrote: > > > > > Hi, > > > > > > it's not c++98 conforming and I get 1 million warnings when compiling. > > > Initially I had also casts to long at the various arguments, but I get no > > > warning with this variant either, so I removed them again. I'm currently > > > regstrapping this, okay for trunk if successfull? > > > > Surely this will break mingw-w64 where size_t is 64-bit yet long is 32-bit? > > Okay, probably. Then consider the same patch with sprinkling casts to > long for all these arguments (the actual numbers will fit that type in > reality). I could of course also use PRIu64 but that makes my eyes bleed. Hmm. Can you use PRIu64 please and cast to uint64_t? OK with that change. Thanks, Richard. > > Ciao, > Michael.
Re: [PATCH] diagnose built-in declarations without prototype (PR 83656)
On Wed, Nov 14, 2018 at 7:08 PM Jeff Law wrote: > > On 11/6/18 4:56 PM, Martin Sebor wrote: > > In response to Joseph's comment I've removed the interaction > > with -Wpedantic from the updated patch. > > > > In addition, to help detect bugs like the one in the test case > > for pr87886, I have also enhanced the detection of incompatible > > calls to include integer/real type mismatches so that calls like > > the one below are diagnosed: > > > > extern double sqrt (); > > int f (int x) > > { > > return sqrt (x); // passing int where double is expected > > } > > > > With the removal of the -Wpedantic interaction declaring abort() > > without a prototype is no longer diagnosed and so the test suite > > changes to add the prototype are not necessary. I decided not > > to back them out because Jeff indicated a preference for making > > these kinds of improvements in general in an unrelated > > discussion. > > > > > > > > gcc-83656.diff > > > > PR c/83656 - missing -Wbuiltin-declaration-mismatch on declaration without > > prototype > > > > gcc/c/ChangeLog: > > > > PR c/83656 > > * c-decl.c (header_for_builtin_fn): Declare. > > (diagnose_mismatched_decls): Diagnose declarations of built-in > > functions without a prototype. > > * c-typeck.c (maybe_warn_builtin_no_proto_arg): New function. > > (convert_argument): Same. > > (convert_arguments): Factor code out into convert_argument. > > Detect mismatches between built-in formal arguments in calls > > to built-in without prototype. > > (build_conditional_expr): Same. > > (type_or_builtin_type): New function. > > (convert_for_assignment): Add argument. Conditionally issue > > warnings instead of errors for mismatches. > > > > gcc/testsuite/ChangeLog: > > > > PR c/83656 > > * gcc.dg/20021006-1.c > > * gcc.dg/Wbuiltin-declaration-mismatch.c: New test. > > * gcc.dg/Wbuiltin-declaration-mismatch-2.c: New test. > > * gcc.dg/Wbuiltin-declaration-mismatch-3.c: New test. > > * gcc.dg/Wbuiltin-declaration-mismatch-4.c: New test. > > * gcc.dg/Walloca-16.c: Adjust. > > * gcc.dg/Wrestrict-4.c: Adjust. > > * gcc.dg/Wrestrict-5.c: Adjust. > > * gcc.dg/atomic/stdatomic-generic.c: Adjust. > > * gcc.dg/atomic/stdatomic-lockfree.c: Adjust. > > * gcc.dg/initpri1.c: Adjust. > > * gcc.dg/pr15698-1.c: Adjust. > > * gcc.dg/pr69156.c: Adjust. > > * gcc.dg/pr83463.c: Adjust. > > * gcc.dg/redecl-4.c: Adjust. > > * gcc.dg/tls/thr-init-2.c: Adjust. > > * gcc.dg/torture/pr55890-2.c: Adjust. > > * gcc.dg/torture/pr55890-3.c: Adjust. > > * gcc.dg/torture/pr67741.c: Adjust. > > * gcc.dg/torture/stackalign/sibcall-1.c: Adjust. > > * gcc.dg/torture/tls/thr-init-1.c: Adjust. > > * gcc.dg/tree-ssa/builtins-folding-gimple-ub.c: Adjust. > > > > > > @@ -3547,8 +3598,24 @@ convert_arguments (location_t loc, vec > > arg_loc, tree typelist, > >if (parmval == error_mark_node) > > error_args = true; > > > > + if (!type && builtin_type && TREE_CODE (builtin_type) != VOID_TYPE) > > + { > > + /* For a call to a built-in function declared without a prototype, > > + perform the coversions from the argument to the expected type > > + but issue warnings rather than errors for any mismatches. > > + Ignore the converted argument and use the PARMVAL obtained > > + above by applying default coversions instead. */ > s/coversions/conversions/ > > Two of 'em in that comment. OK with that nit fixed. I once had -Wunprototyped-calls... (attached). Because the issue doesn't exist only for builtins... (originally inspired by a Xorg bug) Richard. > jeff > > Index: gcc/c-family/c.opt === --- gcc/c-family/c.opt.orig 2014-11-11 09:38:31.286867405 +0100 +++ gcc/c-family/c.opt 2014-11-17 16:38:44.438078729 +0100 @@ -868,6 +868,10 @@ Wunused-local-typedefs C ObjC C++ ObjC++ Var(warn_unused_local_typedefs) Warning EnabledBy(Wunused) Warn when typedefs locally defined in a function are not used +Wunprototyped-calls +C Var(warn_unprototyped_calls) Warning LangEnabledBy(C,Wall) +Warn about calls to unprototyped functions with at least one argument + Wunused-macros C ObjC C++ ObjC++ CppReason(CPP_W_UNUSED_MACROS) Var(cpp_warn_unused_macros) Warning Warn about macros defined in the main file that are not used Index: gcc/c/c-typeck.c === --- gcc/c/c-typeck.c.orig 2014-11-11 09:38:32.068867371 +0100 +++ gcc/c/c-typeck.c 2014-11-17 16:38:44.442078729 +0100 @@ -2957,6 +2957,19 @@ build_function_call_vec (location_t loc, && !check_builtin_function_arguments (fundecl, nargs, argarray)) return error_mark_node; + /* If we cannot check function arguments because a prototype is + missing for the ca
Re: [PATCH] avoid -Wnonnull for printf format in dead code (PR 87041)
On Thu, 15 Nov 2018 at 10:39, Matthew Malcomson wrote: > > On 02/11/18 09:54, Christophe Lyon wrote: > > Hi, > > > > I've noticed failure on targets using newlib (aarch64-elf and arm-eabi): > > FAIL: gcc.c-torture/execute/printf-2.c > > FAIL: gcc.c-torture/execute/user-printf.c > > > > my gcc.log contains: > > gcc.c-torture/execute/user-printf.c -O0 execution test (reason: TCL > > LOOKUP CHANNEL exp5) > > which is not very helpful > > > > @Christophe > Can I ask if your DejaGNU board setup has "needs_status_wrapper 1" for > both of these boards? > Yes, I do use this. > I believe this problem is caused by an interaction with the DejaGNU > status wrapper. > > When the status wrapper is needed, DejaGNU looks at stdout for a line > saying "*** EXIT code " indicating what the status code was. > When it doesn't find that line it assumes an exit code of 2. > Without the status wrapper DejaGNU takes the return code from the > program executed. > > Because these tests use "freopen ()" on stdout, the status wrapper fails > to print to the IO channel DejaGNU is listening on, hence DejaGNU fails > to find it's line, uses an exit code of 2, and fails the test. > > > @Martin > Would these tests still be valid having run freopen on stderr and using > fprintf instead of printf? > That makes the testcases pass for me. > > If not we could add an > { dg-require-effective-target unwrapped } > directive in the testcases to stop the failure complaints.
Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85667
Edited the subjected for the proper PR no. ~Umesh On Thu, Nov 15, 2018 at 2:32 PM Umesh Kalappa wrote: > > Hi All, > > The attached patch (pr85667.patch) fixes the subjected issue . > we tested on x86_64(linux and windows both) and no regress found . > > ok to commit ? > > Thank you > ~Umesh
Re: [PATCH, csky] Update dynamic linker'name
On Thu, Nov 15, 2018 at 7:02 AM 瞿仙淼 wrote: > > Hi, > I have submitted a patch to update dynamic linker'name > > > Index: gcc/ChangeLog > === > --- gcc/ChangeLog (revision 266171) > +++ gcc/ChangeLog (working copy) > @@ -1,3 +1,9 @@ > +2018-11-15 Xianmiao Qu > + > + * config/csky/csky-linux-elf.h (LINUX_DYNAMIC_LINKER): Remove. > + (GLIBC_DYNAMIC_LINKER): Define. > + (LINUX_TARGET_LINK_SPEC): Update the dynamic linker's name. > + > 2018-11-15 Bin Cheng > > PR tree-optimization/84648 > Index: gcc/config/csky/csky-linux-elf.h > === > --- gcc/config/csky/csky-linux-elf.h(revision 266171) > +++ gcc/config/csky/csky-linux-elf.h(working copy) > @@ -61,7 +61,7 @@ >%{mvdsp:-mvdsp} \ >" > > -#define LINUX_DYNAMIC_LINKER "/lib/ld.so.1" > +#define GLIBC_DYNAMIC_LINKER > "/lib/ld-linux-cskyv2%{mhard-float:-hf}%{mbig-endian:-be}.so.1" > > #define LINUX_TARGET_LINK_SPEC "%{h*} %{version:-v}\ > %{b}\ > @@ -70,7 +70,7 @@ > %{symbolic:-Bsymbolic} \ > %{!static: \ > %{rdynamic:-export-dynamic} \ > - %{!shared:-dynamic-linker " LINUX_DYNAMIC_LINKER "}} \ > + %{!shared:-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}} \ ^^^ GNU_USER_DYNAMIC_LINKER vs. GLIBC_DYNAMIC_LINKER > -X \ > %{mbig-endian:-EB} %{mlittle-endian:-EL}\ > %{EB:-EB} %{EL:-EL}" > Index: libgcc/ChangeLog > === > --- libgcc/ChangeLog(revision 266171) > +++ libgcc/ChangeLog(working copy) > @@ -1,3 +1,7 @@ > +2018-11-15 Xianmiao Qu > + > + * config/csky/linux-unwind.h: Fix coding style. > + > 2018-11-13 Xianmiao Qu > > * config/csky/linux-unwind.h (_sig_ucontext_t): Remove. > Index: libgcc/config/csky/linux-unwind.h > === > --- libgcc/config/csky/linux-unwind.h (revision 266171) > +++ libgcc/config/csky/linux-unwind.h (working copy) > @@ -25,10 +25,8 @@ > > #ifndef inhibit_libc > > -/* > - * Do code reading to identify a signal frame, and set the frame state data > - * appropriately. See unwind-dw2.c for the structs. > - */ > +/* Do code reading to identify a signal frame, and set the frame state data > + appropriately. See unwind-dw2.c for the structs. */ > > #include > #include > >
Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87626
On Thu, Nov 15, 2018 at 10:02 AM Umesh Kalappa wrote: > > Hi All, > > The attached patch (pr85667.patch) fixes the subjected issue . > we tested on x86_64(linux and windows both) and no regress found . > > ok to commit ? I wonder if you can turn the testcase into a dg-run one, making the functions noinline/noipa and check the correct values are returned. Richard. > Thank you > ~Umesh
[PATCH] Fix PR88030
The following fixes PR88030. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2018-11-15 Richard Biener PR tree-optimization/88030 * tree-complex.c (need_eh_cleanup): New global. (update_complex_assignment): Mark blocks that need EH update. (expand_complex_comparison): Likewise. (tree_lower_complex): Allocate and deallocate need_eh_cleanup, perform EH cleanup and schedule CFG cleanup if that did anything. * gcc.dg/tsan/pr88030.c: New testcase. >From 1021924614c39cee9e4241d7f4318100a1590267 Mon Sep 17 00:00:00 2001 From: Richard Guenther Date: Thu, 15 Nov 2018 09:57:49 +0100 Subject: [PATCH] fix-pr88030 diff --git a/gcc/testsuite/gcc.dg/tsan/pr88030.c b/gcc/testsuite/gcc.dg/tsan/pr88030.c new file mode 100644 index 000..0c94c7c53f9 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tsan/pr88030.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-fsanitize=thread -fnon-call-exceptions -fexceptions" } */ + +typedef __complex__ float Value; +typedef struct { + Value a[16 / sizeof (Value)]; +} A; + +A sum(A a,A b) +{ + a.a[0]+=b.a[0]; + a.a[1]+=b.a[1]; + return a; +} diff --git a/gcc/tree-complex.c b/gcc/tree-complex.c index 4bf644f9473..2e104c9defb 100644 --- a/gcc/tree-complex.c +++ b/gcc/tree-complex.c @@ -80,6 +80,9 @@ static vec complex_ssa_name_components; non-SSA_NAME/non-invariant args that need to be replaced by SSA_NAMEs. */ static vec phis_to_revisit; +/* BBs that need EH cleanup. */ +static bitmap need_eh_cleanup; + /* Lookup UID in the complex_variable_components hashtable and return the associated tree. */ static tree @@ -701,7 +704,7 @@ update_complex_assignment (gimple_stmt_iterator *gsi, tree r, tree i) stmt = gsi_stmt (*gsi); update_stmt (stmt); if (maybe_clean_eh_stmt (stmt)) -gimple_purge_dead_eh_edges (gimple_bb (stmt)); +bitmap_set_bit (need_eh_cleanup, gimple_bb (stmt)->index); update_complex_components (gsi, gsi_stmt (*gsi), r, i); } @@ -1559,7 +1562,7 @@ expand_complex_comparison (gimple_stmt_iterator *gsi, tree ar, tree ai, update_stmt (stmt); if (maybe_clean_eh_stmt (stmt)) -gimple_purge_dead_eh_edges (gimple_bb (stmt)); +bitmap_set_bit (need_eh_cleanup, gimple_bb (stmt)->index); } /* Expand inline asm that sets some complex SSA_NAMEs. */ @@ -1773,6 +1776,8 @@ tree_lower_complex (void) class complex_propagate complex_propagate; complex_propagate.ssa_propagate (); + need_eh_cleanup = BITMAP_ALLOC (NULL); + complex_variable_components = new int_tree_htab_type (10); complex_ssa_name_components.create (2 * num_ssa_names); @@ -1818,11 +1823,15 @@ tree_lower_complex (void) gsi_commit_edge_inserts (); + unsigned todo += gimple_purge_all_dead_eh_edges (need_eh_cleanup) ? TODO_cleanup_cfg : 0; + BITMAP_FREE (need_eh_cleanup); + delete complex_variable_components; complex_variable_components = NULL; complex_ssa_name_components.release (); complex_lattice_values.release (); - return 0; + return todo; } namespace {
[PATCH] Fix PR88029, change how gimple_call_flags works
The following makes sure we preserve a 'const' attribute that was in effect on an indirect call even after turning it into a direct call of a non-const (pure for the testcase) function. This mimics how we handle nothrow/noreturn. By unioning ECF_ flags from fntype and decl we get the best of both worlds (for optimization as well as IL consistency). This seemed easier (and better) than introducing pure/const flags on the gimple stmt (as opposed to nothrow which can come in via -fno-exceptions on a function as well). I skimmed over flags_from_decl_or_type users to see which might need updating to simply use gimple_call_flags but only found bitrotting TM code (which I updated). Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2018-11-15 Richard Biener PR middle-end/ * gimple.c (gimple_call_flags): Union flags from decl, type and call fntype. * trans-mem.c (is_tm_pure_call): Simplify. * gcc.dg/tree-ssa/pr88029.c: New testcase. diff --git a/gcc/gimple.c b/gcc/gimple.c index 41d9f677c4f..23ccbae6bcb 100644 --- a/gcc/gimple.c +++ b/gcc/gimple.c @@ -1446,15 +1446,17 @@ gimple_call_same_target_p (const gimple *c1, const gimple *c2) int gimple_call_flags (const gimple *stmt) { - int flags; - tree decl = gimple_call_fndecl (stmt); + int flags = 0; - if (decl) -flags = flags_from_decl_or_type (decl); - else if (gimple_call_internal_p (stmt)) + if (gimple_call_internal_p (stmt)) flags = internal_fn_flags (gimple_call_internal_fn (stmt)); else -flags = flags_from_decl_or_type (gimple_call_fntype (stmt)); +{ + tree decl = gimple_call_fndecl (stmt); + if (decl) + flags = flags_from_decl_or_type (decl); + flags |= flags_from_decl_or_type (gimple_call_fntype (stmt)); +} if (stmt->subcode & GF_CALL_NOTHROW) flags |= ECF_NOTHROW; diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr88029.c b/gcc/testsuite/gcc.dg/tree-ssa/pr88029.c new file mode 100644 index 000..c169dd543cd --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88029.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O -w -fdump-tree-fre1-vops" } */ + +double foo (double) __attribute__ ((pure)); +double (*fp) (double) __attribute__ ((const)); +double f(double a) +{ + fp = foo; + /* Verify when propagating foo to the call we preserve its constness. */ + return fp (a); +} + +/* { dg-final { scan-tree-dump "foo \\(a" "fre1" } } */ +/* { dg-final { scan-tree-dump-times "VUSE" 1 "fre1" } } */ diff --git a/gcc/trans-mem.c b/gcc/trans-mem.c index 938f4e28b90..bb7146bc153 100644 --- a/gcc/trans-mem.c +++ b/gcc/trans-mem.c @@ -265,20 +265,7 @@ is_tm_safe (const_tree x) static bool is_tm_pure_call (gimple *call) { - if (gimple_call_internal_p (call)) -return (gimple_call_flags (call) & (ECF_CONST | ECF_TM_PURE)) != 0; - - tree fn = gimple_call_fn (call); - - if (TREE_CODE (fn) == ADDR_EXPR) -{ - fn = TREE_OPERAND (fn, 0); - gcc_assert (TREE_CODE (fn) == FUNCTION_DECL); -} - else -fn = TREE_TYPE (fn); - - return is_tm_pure (fn); + return (gimple_call_flags (call) & (ECF_CONST | ECF_TM_PURE)) != 0; } /* Return true if X has been marked TM_CALLABLE. */
Re: RFC (branch prediction): PATCH to implement P0479R5, [[likely]] and [[unlikely]].
On 11/13/18 8:42 PM, Jason Merrill wrote: > On Tue, Nov 13, 2018 at 9:20 AM Martin Liška wrote: >> >> On 11/13/18 5:43 AM, Jason Merrill wrote: >>> [[likely]] and [[unlikely]] are equivalent to the GNU hot/cold attributes, >>> except that they can be applied to arbitrary statements as well as labels; >>> this is most likely to be useful for marking if/else branches as likely or >>> unlikely. Conveniently, PREDICT_EXPR fits the bill nicely as a >>> representation. >>> >>> I also had to fix marking case labels as hot/cold, which didn't work before. >>> Which then required me to force __attribute ((fallthrough)) to apply to the >>> statement rather than the label. >>> >>> Tested x86_64-pc-linux-gnu. Does this seem like a sane implementation >>> approach to people with more experience with PREDICT_EXPR? >> >> Hi. >> >> In general it makes sense to implement it the same way. Question is how much >> should the hold/cold attribute should be close to __builtin_expect. >> >> Let me present few examples and differences that I see: >> >> 1) ./xgcc -B. -O2 -fdump-tree-profile_estimate=/dev/stdout /tmp/test1.C >> >> ;; Function foo (_Z3foov, funcdef_no=0, decl_uid=2301, cgraph_uid=1, >> symbol_order=3) >> >> Predictions for bb 2 >> first match heuristics: 90.00% >> combined heuristics: 90.00% >> __builtin_expect heuristics of edge 2->3: 90.00% >> >> As seen here __builtin_expect is stronger as it's first match heuristics and >> has probability == 90%. >> >> ;; Function bar (_Z3barv, funcdef_no=1, decl_uid=2303, cgraph_uid=2, >> symbol_order=4) >> >> Predictions for bb 2 >> DS theory heuristics: 74.48% >> combined heuristics: 74.48% >> opcode values nonequal (on trees) heuristics of edge 2->3: 34.00% >> hot label heuristics of edge 2->3: 85.00% >> >> Here we combine hot label prediction with the opcode one, resulting in quite >> poor result 75%. >> So maybe cold/hot prediction cal also happen first match. > > Makes sense. Good, let me prepare patch for it. > >> 2) ./xgcc -B. -O2 -fdump-tree-profile_estimate=/dev/stdout /tmp/test2.C >> ... >> foo () >> { >> ... >> switch (_3) [3.33%], case 3: [3.33%], case 42: >> [3.33%], case 333: [90.00%]> >> >> while: >> >> bar () >> { >> switch (a.1_1) [25.00%], case 3: [25.00%], case 42: >> [25.00%], case 333: [25.00%]> >> ... >> >> Note that support for __builtin_expect was enhanced in this stage1. I can >> definitely cover also situations when one uses >> hot/cold for labels. So definitely place for improvement. > > Hmm, the gimplifier should be adding a PREDICT_EXPR for the case label > now, is it just ignored currently? Yes, for switch multi-branches yes. I'll prepare patch for it, should be quite straightforward. > >> 3) last example where one can use the attribute for function decl, resulting >> in: >> __attribute__((hot, noinline)) >> foo () >> { >> .. >> >> Hope it's desired? If so I would cover that with a test-case in test-suite. > > [[likely]] and [[unlikely]] don't apply to functions; I suppose I > should diagnose that. Yes. > >> Jason can you please point to C++ specification of the attributes? > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0479r5.html > >> Would you please consider an error diagnostics for situations written in >> test4.C? > > A warning seems appropriate. You think the front end is the right > place for that? Probably yes. Note that middle-end can optimize about dead branches and so that theoretically one can end up with a branching where e.g. both branches are [[likely]]. I wouldn't bother users with these. Martin > > Jason >
Re: [PATCH 2/6] ifcvt: Allow constants operands in noce_convert_multiple_sets.
> This may ultimately be too simplistic. There are targets where some > constants are OK, but others may not be. By checking the predicate > like this I think you can cause over-aggressive if-conversion if the > target allows a range of integers in the expander's operand predicate, > but allows a narrower range in the actual define_insn (presumably the > expander loads them into a pseudo or somesuch in that case). > > We know that over-aggressive if-conversion into conditional moves hurts > some targets. > > Ideally you'd create the actual insn with the constants you want to use > and see if that's recognized as well as compute its cost. Is that just > too painful at this point for some reason? Conditional moves in noce_convert_multiple_sets are processed via noce_emit_cmove which itself performs quite some preprocessing and calls optab's emit_conditional_move in the end, so the insn will already be emitted before being able to decide whether to decline it due to costs. In addition, every noce_emit_cmove will emit the condition check again as well as possible temporaries. Comparing the costs of the whole sequence will therefore still prove difficult as all the additionally generated insns will not get removed until reload and make a fair comparison nigh impossible. I was reluctant to factor out all the preprocessing stuff, separate it from the condition check and actual emitting part but that's properly the "right way" to do it, including emitting the condition only once in the beginning. However, for now, I could imagine changing only the conversion_profitable_p hook in our backend to only count the cmovs and ignore everything else in the sequence. This would be somewhat hacky though and still wouldn't allow constants :) Regards Robin
Re: [PATCH 5/6] ifcvt: Only created temporaries as needed.
> This looks pretty reasonable. ISTM it ought to be able to go forward if > it's tested independently. The test suite already passes, any other tests you have in mind? To be honest I suppose noce_convert_multiple_sets will currently never successfully return (due to the costing problems I described before), so a specific test case might be difficult. Regards Robin
[v3 PATCH] PR libstdc++/87855
Tested on Linux-PPC64. Ok for trunk? Backports? 2018-11-15 Ville Voutilainen PR libstdc++/87855 Also implement P0602R4 (variant and optional should propagate copy/move triviality) for std::optional. * include/std/optional (_Optional_payload): Change the main constraints to check constructibility in addition to assignability. (operator=): Make constexpr. (_M_reset): Likewise. (_M_construct): Likewise. (operator->): Likewise. * testsuite/20_util/optional/assignment/8.cc: Adjust. * testsuite/20_util/optional/assignment/9.cc: New. diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional index d0257c0..fefd9a3 100644 --- a/libstdc++-v3/include/std/optional +++ b/libstdc++-v3/include/std/optional @@ -103,10 +103,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION template , - bool /*_HasTrivialCopyAssignment*/ = - is_trivially_copy_assignable_v<_Tp>, - bool /*_HasTrivialMoveAssignment*/ = - is_trivially_move_assignable_v<_Tp>> + bool /*_HasTrivialCopy */ = + is_trivially_copy_assignable_v<_Tp> + && is_trivially_copy_constructible_v<_Tp>, + bool /*_HasTrivialMove */ = + is_trivially_move_assignable_v<_Tp> + && is_trivially_move_constructible_v<_Tp>> struct _Optional_payload { constexpr _Optional_payload() noexcept : _M_empty() { } @@ -148,6 +150,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION this->_M_construct(std::move(__other._M_payload)); } + constexpr _Optional_payload& operator=(const _Optional_payload& __other) { @@ -163,6 +166,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return *this; } + constexpr _Optional_payload& operator=(_Optional_payload&& __other) noexcept(__and_v, @@ -216,6 +220,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { return this->_M_payload; } // _M_reset is a 'safe' operation with no precondition. + constexpr void _M_reset() noexcept { @@ -346,6 +351,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _Optional_payload(const _Optional_payload&) = default; _Optional_payload(_Optional_payload&&) = default; + constexpr _Optional_payload& operator=(const _Optional_payload& __other) { @@ -394,6 +400,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { return this->_M_payload; } // _M_reset is a 'safe' operation with no precondition. + constexpr void _M_reset() noexcept { @@ -466,6 +473,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _Optional_payload& operator=(const _Optional_payload& __other) = default; + constexpr _Optional_payload& operator=(_Optional_payload&& __other) noexcept(__and_v, @@ -513,6 +521,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { return this->_M_payload; } // _M_reset is a 'safe' operation with no precondition. + constexpr void _M_reset() noexcept { @@ -581,6 +590,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _Optional_payload(const _Optional_payload&) = default; _Optional_payload(_Optional_payload&&) = default; + constexpr _Optional_payload& operator=(const _Optional_payload& __other) { @@ -596,6 +606,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return *this; } + constexpr _Optional_payload& operator=(_Optional_payload&& __other) noexcept(__and_v, @@ -624,6 +635,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION bool _M_engaged; template +constexpr void _M_construct(_Args&&... __args) noexcept(is_nothrow_constructible_v<_Stored_type, _Args...>) @@ -643,6 +655,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { return this->_M_payload; } // _M_reset is a 'safe' operation with no precondition. + constexpr void _M_reset() noexcept { @@ -681,6 +694,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } // _M_reset is a 'safe' operation with no precondition. + constexpr void _M_reset() noexcept { @@ -1217,6 +1231,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION operator->() const { return std::__addressof(this->_M_get()); } + constexpr _Tp* operator->() { return std::__addressof(this->_M_get()); } diff --git a/libstdc++-v3/testsuite/20_util/optional/assignment/8.cc b/libstdc++-v3/testsuite/20_util/optional/assignment/8.cc index d573137..7241b92 100644 --- a/libstdc++-v3/testsuite/20_util/optional/assignment/8.cc +++ b/libstdc++-v3/testsuite/20_util/optional/assignment/8.cc @@ -91,6 +91,24 @@ struct S2 { }; static_assert(std::is_trivially_move_assignable_v); +struct S3 { + S3(const S3&); + S3& operator=(const S3&) = default; +}; +static_assert(std::is_trivially_copy_assignable_v); +static_assert(std::is_copy_assignable_v); +static_assert(!std::is_trivially_copy_assignable_v>); +static_assert(std::is_copy_assignable_v>); + +stru
Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf
On Thu, 15 Nov 2018 10:36:57 +0100 Andreas Schwab wrote: > On Nov 14 2018, Jozef Lawrynowicz wrote: > > > The timeout as set in the dejagnu configuration for msp430 > > ([dejagnu.git]/baseboards/msp430-sim.exp) is 30, which is rarely > > hit. > > I don't think it makes sense for a board file to set a smaller timeout > than the default. > > Andreas. > Hmm, yes I see that some other DejaGNU board files increase the timeout for GCC but no others are decreasing it. I suspect this was just an oversight with the initial port of board file. I'll go ahead and remove the timeout from the board file and rerun the tests, with those dg-timeout directives also removed. Thanks, Jozef
Re: RFC (branch prediction): PATCH to implement P0479R5, [[likely]] and [[unlikely]].
> > A warning seems appropriate. You think the front end is the right > > place for that? > > Probably yes. Note that middle-end can optimize about dead branches and so > that > theoretically one can end up with a branching where e.g. both branches are > [[likely]]. > I wouldn't bother users with these. Note that what really happens in this case is that if conditional is constant propagated and it has predict_expr, the predict_expr stays and will get assigned to the random control dependence edge which controls execution of the original statement. This is not very intuitive behaviour. Does C++ say what should happen in this case? One option would be to deal with this gratefully at high level gimple and turn predict_exprs into edge probabilities eariler than we do normal branch prediction (which is intended to be later so profile does not end up unnecesarily inconsistent) Honza > > Martin > > > > > Jason > > >
Re: RFA: vectorizer patches 1/2 : WIDEN_MULT_PLUS support
On Wed, Nov 14, 2018 at 4:21 PM Joern Wolfgang Rennecke wrote: > > > On 14/11/18 09:53, Richard Biener wrote: > >> WIDEN_MULT_PLUS is special on our target in that it creates double-sized > >> vectors. > > Are there really double-size vectors or does the target simply produce > > the output in two vectors? Usually targets have WIDEN_MULT_PLUS_LOW/HIGH > > or _EVEN/ODD split operations. Or, like - what I now remember - for the > > DOT_PROD_EXPR optab, the target already reduces element pairs of the > > result vector (unspecified which ones) so the result vector is of the same > > size as the inputs. > The output of widening multiply and widening multiply-add is stored > in two consecutive registers. So, they can be used as separate > vectors, but you can't choose the register numbers indepenndently. > OTOH, you can treat them together as a double-sized vector, but > without any extra alignment requirements over a single-sized vector. > > > > That is, if your target produces two vectors you may instead want to hide > > that fact by claiming you support DOT_PROD_EXPR and expanding > > it to the widen-mult-plus plus reducing (adding) the two result vectors to > > get a single one. > > Doing a part of the reduction in the loop is a bit pointless. The advantage is you can work with lower vectorization factor and thus less unrolling (smaller code). > I have tried another approach, to advertize the WIDEN_MULT_PLUS > and WIDEN_MULT operations as LO/HI part operations of the double > vector size, and also add fake double-vector patterns for move, widening > and add (they get expanded or splitted to single-vector patterns). > That seems to work for the dot product, it's like the code is unrolled by > a factor of two. Yeah, that would work as well. As said, the vectorizer cannot really handle the doubled vector size (means: you sooner or later will run into issues). > There are a few drawbacks though: > - the tree optimizer creates separate WIDEN_MULT and PLUS expressions, > and it is left to the combiner to clean that up. That combination and > register allocation > might be a bit fragile. > - if the input isn't known to be aligned to the doubled vector size, a > run-time > check is inserted to use an unvectorized loop if there is no excess > alignment. > - auto-increment for the loads is lost. I can probably fix this by > keeping double-sized > loads around for longer or with some special-purpose pass, but both > might have > some other drawbacks. But there's actually a configuration option for > an instruction > to load multiple vector registers with register-indirect or > auto-increment, so there is > some merit to have a pattern for it. > - the code size is larger. > - vectorization will fail if any other code is mixed in for which no > double-vector patterns are provided. > - this approach uses SUBREGs in ways that are not safe according to the > documentation. > But then, other ports like i386 and aarch64-little endian do that too. I think they do it if they really have such instructions in the ISA (they also have those that do the in-loop reduction to half of the result vector size -- DOT_PROD_EXPR). > I think it is now (since we have > SUBREG_BYTE) safe to have subregs of registers with hard reg sizes > larger than UNITS_PER_WORD, > as long as you refer to entire hard registers. Maybe we could change > the documentation? > AFAICT, there are also only four places that need to be patched to make > a lowpart access with a SUBREG of such a hard register safe. I'm trying > this at the moment, it was justa few hours late for the > phase 1->3 deadline. > > I suppose for WIDEN_SUM_EXPR, I'd have to have one double-vector-sized > pattern that > adds the products of the two input vectors into the double output > vector, and leave > the rtl loop optimizer to get the constant pool load of the all-ones > vector out of > the loop. But again, there'll be issues with excess alignment > requirements and code size. I think going the DOT_PROD_EXPR way is a lot easier. You simply expand the additional (in-loop) sum. The only drawback I see is that this might be slower code. So yes the a _LO/_HI way maps better to hardware but you rely on CSE to remove the redundant instruction if you implement _LO/_HI as doing the full operation and just taking one of the result vectors. > > > > The vectorizer cannot really deal with multiple sizes, thus for example > > a V4SI * V4SI + V4DI operation and that all those tree codes are exposed > > as "scalar" is sth that continues to confuse me but is mainly done because > > at pattern recognition time there's only the scalars. > Well, the vectorizer makes an exception for reductions as it'll allow to > maintain > either a vector or a scalar during the loop, so why not allow other > sizes for that > value as well? It's not implemented ;) > It's all hidden in the final reduction emitted by the > epilogue. > > For vectorization I would advise to provide expansion patterns fo
Re: [PATCH 8/9][GCC][Arm] Add autovectorization support for complex multiplication and addition
On Wed, Nov 14, 2018 at 4:47 PM Tamar Christina wrote: > > Hi Richard, > > > > Ok for trunk? > > > > +;; The complex mla operations always need to expand to two instructions. > > +;; The first operation does half the computation and the second does > > +the ;; remainder. Because of this, expand early. > > +(define_expand "fcmla4" > > + [(set (match_operand:VF 0 "register_operand") > > + (plus:VF (match_operand:VF 1 "register_operand") > > +(unspec:VF [(match_operand:VF 2 "register_operand") > > +(match_operand:VF 3 "register_operand")] > > +VCMLA)))] > > + "TARGET_COMPLEX" > > +{ > > + emit_insn (gen_neon_vcmla (operands[0], > > operands[1], > > + operands[2], > > +operands[3])); > > + emit_insn (gen_neon_vcmla (operands[0], > > operands[0], > > + operands[2], > > +operands[3])); > > + DONE; > > +}) > > > > What's the two halves? Why hide this from the vectorizer if you go down all > > to the detail and expose the rotation to it? > > > > The two halves are an implementation detail of the instruction in Armv8.3-a. > As far as the > Vectorizer is concerned all you want to do, is an FMA rotating one of the > operands by 0 or 180 degrees. > > Also note that the "rotations" in these instructions aren't exactly the same > as what would fall under rotation of a complex number, > as each instruction can only do half of the final computation you want. > > In the ISA these instructions have to be used in a pair, where rotations > determine > the operation you want to perform. E.g. a rotation of #0 followed by #90 > makes it a multiply and accumulate. > > A rotation of #180 followed by #90 makes this a vector complex subtract, > which is intended to be used by the first call > using a register cleared with 0 (It becomes an "FMS" essentially if you don't > clear the register). > Each "rotation" determine what operation is done and using which parts of the > complex number. You change the > "rotations" and the grouping of the instructions to get different operations. > > I did not expose this to the vectorizer as It seems very ISA specific. > > > +;; The vcadd and vcmla patterns are made UNSPEC for the explicitly due > > +to the ;; fact that their usage need to guarantee that the source > > +vectors are ;; contiguous. It would be wrong to describe the operation > > +without being able ;; to describe the permute that is also required, > > +but even if that is done ;; the permute would have been created as a > > +LOAD_LANES which means the values ;; in the registers are in the wrong > > order. > > > > Hmm, it's totally non-obvious to me how this relates to loads or what a > > "non- > > contiguous" > > register would be? That is, once you make this an unspec combine will never > > be able to synthesize this from intrinsics code that doesn't use this form. > > > > +(define_insn "neon_vcadd" > > + [(set (match_operand:VF 0 "register_operand" "=w") > > + (unspec:VF [(match_operand:VF 1 "register_operand" "w") > > + (match_operand:VF 2 "register_operand" "w")] > > + VCADD))] > > > > Yes that's my goal, as if operand1 and operand2 are loaded by instructions > that > would have permuted the values in the register then the instruction doesn't > work. > > The instruction does the permute itself, so it expects the values to have > been loaded > using a simple load and not a LOAD_LANES. So I am intended to prevent combine > from > recognizing the operation for that reason. But LOAD_LANES is used differently and the ISA probably doesn't really care how you set up the register inputs. You of course have to put in the correct values but how they get there doesn't matter. So I don't see how combine can mess things up here. > For the ADD combine can be used but then you'd > have to match the load and store since you have to change these, for the rest > you'll run far afoul > of combine's 5 instruction limit. Why do you need to change these? You assume the vectorizer vectorizes using interleaving - yes, in that case all hope is lost. I assume the vectorizer will end up doing SLP with the existing TWO_OPERATORS support, thus for complex subtraction you'll see (A and B being complex vectors) add = A + B; sub = A - B; resultAcomplex_minusB = vec_merge (add, sub, 1) basically the vectorizer will perform operations twice and then blend the two results. The add/sub + blend needs to be recognized by combine (x86 does this for the vaddsub instructions which were designed to handle complex subtraction and parts of the multiply). For complex multiplication you'll see the pieces your ISA supports. mul1 = A * B mul2 = A * rot(B) (rotation will be a shuffle) add = mul1 + mul2 sub = mul1 - mul2 result = blend (add, sub, ...) as usual the combiner is helped by intermediate combiner patterns
Re: [Committed][AArch64] Fix PR62178 testcase failures
Hi Segher, > On Wed, Nov 14, 2018 at 12:37:05PM +, Wilco Dijkstra wrote: >> +/* { dg-final { scan-assembler-not { dup } } } */ >> +/* { dg-final { scan-assembler-not { fmov } } } */ > > { dup } is the same as " dup " , that is, with spaces and all. > I don't think you want that (there usually is a tab character before > mnemonics, so \s would work better, or use \m and \M. Or nothing, > just delete the spaces, if you are sure nothing in the generated > assembler code says "dup"). Thanks for spotting that! I removed the spaces (below). The syntax is arcane to say the least but it seems {} has fewer bugs in regular expressions. I can never figure out how many \ to use so it works exactly identically across different regexp/tcl versions. Wilco --- gcc/testsuite/gcc.target/aarch64/pr62178.c (revision 266178) +++ gcc/testsuite/gcc.target/aarch64/pr62178.c (working copy) @@ -18,5 +18,5 @@ /* { dg-final { scan-assembler "ldr\\tq\[0-9\]+, \\\[x\[0-9\]+\\\], \[0-9\]+" } } */ /* { dg-final { scan-assembler "mla\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+" } } */ -/* { dg-final { scan-assembler-not { dup } } } */ -/* { dg-final { scan-assembler-not { fmov } } } */ +/* { dg-final { scan-assembler-not {dup} } } */ +/* { dg-final { scan-assembler-not {fmov} } } */
Re: [PATCH 8/9][GCC][Arm] Add autovectorization support for complex multiplication and addition
On Thu, Nov 15, 2018 at 1:42 PM Richard Biener wrote: > > On Wed, Nov 14, 2018 at 4:47 PM Tamar Christina > wrote: > > > > Hi Richard, > > > > > > Ok for trunk? > > > > > > +;; The complex mla operations always need to expand to two instructions. > > > +;; The first operation does half the computation and the second does > > > +the ;; remainder. Because of this, expand early. > > > +(define_expand "fcmla4" > > > + [(set (match_operand:VF 0 "register_operand") > > > + (plus:VF (match_operand:VF 1 "register_operand") > > > +(unspec:VF [(match_operand:VF 2 "register_operand") > > > +(match_operand:VF 3 "register_operand")] > > > +VCMLA)))] > > > + "TARGET_COMPLEX" > > > +{ > > > + emit_insn (gen_neon_vcmla (operands[0], > > > operands[1], > > > + operands[2], > > > +operands[3])); > > > + emit_insn (gen_neon_vcmla (operands[0], > > > operands[0], > > > + operands[2], > > > +operands[3])); > > > + DONE; > > > +}) > > > > > > What's the two halves? Why hide this from the vectorizer if you go down > > > all > > > to the detail and expose the rotation to it? > > > > > > > The two halves are an implementation detail of the instruction in > > Armv8.3-a. As far as the > > Vectorizer is concerned all you want to do, is an FMA rotating one of the > > operands by 0 or 180 degrees. > > > > Also note that the "rotations" in these instructions aren't exactly the > > same as what would fall under rotation of a complex number, > > as each instruction can only do half of the final computation you want. > > > > In the ISA these instructions have to be used in a pair, where rotations > > determine > > the operation you want to perform. E.g. a rotation of #0 followed by #90 > > makes it a multiply and accumulate. > > > > A rotation of #180 followed by #90 makes this a vector complex subtract, > > which is intended to be used by the first call > > using a register cleared with 0 (It becomes an "FMS" essentially if you > > don't clear the register). > > Each "rotation" determine what operation is done and using which parts of > > the complex number. You change the > > "rotations" and the grouping of the instructions to get different > > operations. > > > > I did not expose this to the vectorizer as It seems very ISA specific. > > > > > +;; The vcadd and vcmla patterns are made UNSPEC for the explicitly due > > > +to the ;; fact that their usage need to guarantee that the source > > > +vectors are ;; contiguous. It would be wrong to describe the operation > > > +without being able ;; to describe the permute that is also required, > > > +but even if that is done ;; the permute would have been created as a > > > +LOAD_LANES which means the values ;; in the registers are in the wrong > > > order. > > > > > > Hmm, it's totally non-obvious to me how this relates to loads or what a > > > "non- > > > contiguous" > > > register would be? That is, once you make this an unspec combine will > > > never > > > be able to synthesize this from intrinsics code that doesn't use this > > > form. > > > > > > +(define_insn "neon_vcadd" > > > + [(set (match_operand:VF 0 "register_operand" "=w") > > > + (unspec:VF [(match_operand:VF 1 "register_operand" "w") > > > + (match_operand:VF 2 "register_operand" "w")] > > > + VCADD))] > > > > > > > Yes that's my goal, as if operand1 and operand2 are loaded by instructions > > that > > would have permuted the values in the register then the instruction doesn't > > work. > > > > The instruction does the permute itself, so it expects the values to have > > been loaded > > using a simple load and not a LOAD_LANES. So I am intended to prevent > > combine from > > recognizing the operation for that reason. > > But LOAD_LANES is used differently and the ISA probably doesn't really care > how > you set up the register inputs. You of course have to put in the > correct values but > how they get there doesn't matter. So I don't see how combine can > mess things up here. > > > For the ADD combine can be used but then you'd > > have to match the load and store since you have to change these, for the > > rest you'll run far afoul > > of combine's 5 instruction limit. > > Why do you need to change these? You assume the vectorizer vectorizes using > interleaving - yes, in that case all hope is lost. I assume the > vectorizer will end up > doing SLP with the existing TWO_OPERATORS support You might be bitten by the fact that you tuned the vectorizer to always prefer load/store-lanes over SLP when there are permutations. You could lift that a bit allowing rotation/projection permutes as they occur with complex arithmetic. > , thus for complex subtraction > you'll see (A and B being complex vectors) > >add = A + B; >sub = A - B; >resultAcomplex_minusB = vec_merge (add, sub
RE: [PATCH 1/9][GCC][AArch64][middle-end] Implement SLP recognizer for Complex addition with rotate and complex MLA with rotation
On Wed, 14 Nov 2018, Tamar Christina wrote: > Hi Richard, > > Thanks for the feedback, I've replied inline below. > I'll wait for your answers before making changes. I have commented on the other mail so will leave out redunant parts here. > > -Original Message- > > From: Richard Biener > > Sent: Wednesday, November 14, 2018 12:21 > > To: Tamar Christina > > Cc: GCC Patches ; nd ; Richard > > Guenther ; Zdenek Dvorak ; Richard > > Earnshaw ; James Greenhalgh > > ; Marcus Shawcroft > > > > Subject: Re: [PATCH 1/9][GCC][AArch64][middle-end] Implement SLP > > recognizer for Complex addition with rotate and complex MLA with rotation > > > > On Sun, Nov 11, 2018 at 11:26 AM Tamar Christina > > wrote: > > > > > > Hi All, > > > > > > This patch adds support for SLP vectorization of Complex number > > > arithmetic with rotations along with Argand plane. > > > > > > For this to work it has to recognize two statements in parallel as it > > > needs to match against operations towards both the real and imaginary > > > numbers. The instructions generated by this change is also only > > > available in their vector forms. As such I add them as pattern > > > statements to > > the stmt. The BB is left untouched and so the scalar loop is untouched. > > > > > > The instructions also require the loads to be contiguous and so when a > > > match is made, and the code decides it is able to do the replacement > > > it re-organizes the SLP tree such that the loads are contiguous. > > > Since this operation cannot be undone it only does this if it's sure that > > > the > > resulting loads can be made continguous. > > > > > > It gets this guarantee by only allowing the replacement if between the > > > matched expression and the loads there are no other expressions it > > doesn't know aside from type casts. > > > > > > When a match occurs over multiple expressions, the dead statements are > > > immediately removed from the tree to prevent verification failures later. > > > > > > Because the pattern matching is done after SLP analysis has analyzed > > > the usage of the instruction it also marks the instructions as used and > > > the > > old ones as unusued. > > > > > > When a replacement is done a new internal function is generated which > > > the back-end has to expand to the proper instruction sequences. > > > > > > For now, this patch only adds support for Complex addition with rotate > > > and Complex FMLA with rotation of 0 and 180. However it is the > > > intention to in the future add support for Complex subtraction and > > Complex multiplication. > > > > > > Concretely, this generates > > > > > > ldr q1, [x1, x3] > > > ldr q2, [x0, x3] > > > ldr q0, [x2, x3] > > > fcmla v0.2d, v1.2d, v2.2d, #180 > > > fcmla v0.2d, v1.2d, v2.2d, #90 > > > str q0, [x2, x3] > > > add x3, x3, 16 > > > cmp x3, 3200 > > > bne .L2 > > > ret > > > > > > now instead of > > > > > > add x3, x2, 31 > > > sub x4, x3, x1 > > > sub x3, x3, x0 > > > cmp x4, 62 > > > mov x4, 62 > > > ccmpx3, x4, 0, hi > > > bls .L5 > > > mov x3, x0 > > > mov x0, x1 > > > add x1, x2, 3200 > > > .p2align 3,,7 > > > .L3: > > > ld2 {v16.2d - v17.2d}, [x2] > > > ld2 {v2.2d - v3.2d}, [x3], 32 > > > ld2 {v0.2d - v1.2d}, [x0], 32 > > > mov v7.16b, v17.16b > > > fmulv6.2d, v0.2d, v3.2d > > > fmlav7.2d, v1.2d, v3.2d > > > fmlav6.2d, v1.2d, v2.2d > > > fmlsv7.2d, v2.2d, v0.2d > > > faddv4.2d, v6.2d, v16.2d > > > mov v5.16b, v7.16b > > > st2 {v4.2d - v5.2d}, [x2], 32 > > > cmp x2, x1 > > > bne .L3 > > > ret > > > .L5: > > > add x4, x2, 8 > > > add x6, x0, 8 > > > add x5, x1, 8 > > > mov x3, 0 > > > .p2align 3,,7 > > > .L2: > > > ldr d1, [x6, x3] > > > ldr d4, [x1, x3] > > > ldr d5, [x5, x3] > > > ldr d3, [x0, x3] > > > fmuld2, d4, d1 > > > ldr d0, [x4, x3] > > > fmadd d0, d5, d1, d0 > > > ldr d1, [x2, x3] > > > fmadd d2, d5, d3, d2 > > > fmsub d0, d4, d3, d0 > > > faddd1, d1, d2 > > > str d1, [x2, x3] > > > str d0, [x4, x3] > > > add x3, x3, 16 > > > cmp x3, 3200 > > > bne .L2 > > > ret > > > > > > Bootstrap and Regtests on aarch64-none-linux-gnu, arm-none-gnueabihf > > > and x86_64-pc-linux-gnu are still on going but previous patch showed no > > regressions. > > > > > > The instructions have also been tested on aarch64-none-elf and > > > arm-none-eabi on a Armv8.3-a model and -march=Armv8.3-a+fp16 and all > > tests pass. >
[C++ DR 2336] Clean up synth walkers first
This patch cleans up walk_field_subobs, synthesized_method_base_walk & synthesized_method_walk. They use a mixture of old-style partially-overlapping booleans /and/ new style special_function_kind enumeration. synthesized_method_walk was determining the booleans from sfk and then passing the booleans to synthesized_method_base_walk, but both the bools and sfk to walk_field_subobs. Plus for extra confusion walk_field_subobs & synthesized_method_base_walk wanted the same args, but in different orders. All this confusion gets in the way of resolving DR 2336. This patch removes that confusion. I reorder SFK members to be useful for these functions, add a set of new predicates and then adjust the 3 functions to use those predicates, drop the bools and consistently order their parameters. Applying to trunk after an x6_64-linux bootstrap. nathan -- Nathan Sidwell 2018-11-15 Nathan Sidwell * cp-tree.h (enum special_function_kind): Reorder and comment. * method.c (SFK_CTOR_P, SFK_DTOR_P, SFK_ASSIGN_P, SFK_COPY_P) (SFK_MOVE_P): New predicates. (walk_field_subobs, synthesized_method_base_walk): Drop copy_arg_p, move_p, assign_p args. Use new SFK predicates. Order parameters consistently. (synthesized_method_walk): Drop ctor_p, copy_arg_p, move_p, assign_p calculations. Use new SFK predicates. Adjust calls to worker functions. Index: cp-tree.h === --- cp-tree.h (revision 266161) +++ cp-tree.h (working copy) @@ -5084,20 +5084,22 @@ enum special_function_kind { sfk_none = 0, /* Not a special function. This enumeral must have value zero; see special_function_p. */ + /* The following are ordered, for use by member synthesis fns. */ + sfk_destructor, /* A destructor. */ sfk_constructor, /* A constructor. */ + sfk_inheriting_constructor, /* An inheriting constructor */ sfk_copy_constructor,/* A copy constructor. */ sfk_move_constructor,/* A move constructor. */ sfk_copy_assignment, /* A copy assignment operator. */ sfk_move_assignment, /* A move assignment operator. */ - sfk_destructor, /* A destructor. */ + /* The following are unordered. */ sfk_complete_destructor, /* A destructor for complete objects. */ sfk_base_destructor, /* A destructor for base subobjects. */ sfk_deleting_destructor, /* A destructor for complete objects that deletes the object after it has been destroyed. */ sfk_conversion, /* A conversion operator. */ - sfk_deduction_guide, /* A class template deduction guide. */ - sfk_inheriting_constructor /* An inheriting constructor */ + sfk_deduction_guide /* A class template deduction guide. */ }; /* The various kinds of linkage. From [basic.link], Index: method.c === --- method.c (revision 266161) +++ method.c (working copy) @@ -1283,15 +1283,26 @@ process_subob_fn (tree fn, tree *spec_p, } } +/* Categorize various special_function_kinds. */ +#define SFK_CTOR_P(sfk) \ + ((sfk) >= sfk_constructor && (sfk) <= sfk_move_constructor) +#define SFK_DTOR_P(sfk) \ + ((sfk) == sfk_destructor) +#define SFK_ASSIGN_P(sfk) \ + ((sfk) == sfk_copy_assignment || (sfk) == sfk_move_assignment) +#define SFK_COPY_P(sfk) \ + ((sfk) == sfk_copy_constructor || (sfk) == sfk_copy_assignment) +#define SFK_MOVE_P(sfk) \ + ((sfk) == sfk_move_constructor || (sfk) == sfk_move_assignment) + /* Subroutine of synthesized_method_walk to allow recursion into anonymous aggregates. If DTOR_FROM_CTOR is true, we're walking subobject destructors called from a synthesized constructor, in which case we don't consider the triviality of the subobject destructor. */ static void -walk_field_subobs (tree fields, tree fnname, special_function_kind sfk, - int quals, bool copy_arg_p, bool move_p, - bool assign_p, tree *spec_p, bool *trivial_p, +walk_field_subobs (tree fields, special_function_kind sfk, tree fnname, + int quals, tree *spec_p, bool *trivial_p, bool *deleted_p, bool *constexpr_p, bool diag, int flags, tsubst_flags_t complain, bool dtor_from_ctor) @@ -1315,7 +1326,7 @@ walk_field_subobs (tree fields, tree fnn break; mem_type = strip_array_types (TREE_TYPE (field)); - if (assign_p) + if (SFK_ASSIGN_P (sfk)) { bool bad = true; if (CP_TYPE_CONST_P (mem_type) && !CLASS_TYPE_P (mem_type)) @@ -1419,19 +1430,18 @@ walk_field_subobs (tree fields, tree fnn if (ANON_AGGR_TYPE_P (mem_type)) { - walk_field_subobs (TYPE_FIELDS (mem_type), fnname, sfk, quals, - copy_arg_p, move_p, assign_p, spec_p, trivial_p, - deleted_p, constexpr_p, + walk_field_subobs (TYPE_FIELDS (mem_type), sfk, fnname, quals, + spec_p, trivial_p, deleted_p, constexpr_p, diag, flags, complain, dtor_from_ctor); continue; } - if (copy_a
Re: Don't use %z printf format length specified
Hi, On Thu, 15 Nov 2018, Richard Biener wrote: > > Okay, probably. Then consider the same patch with sprinkling casts to > > long for all these arguments (the actual numbers will fit that type in > > reality). I could of course also use PRIu64 but that makes my eyes > > bleed. > > Hmm. Can you use PRIu64 please and cast to uint64_t? OK with that > change. Instead I'd like to check in this. It adds a new macro expanding to "%" #n PRIu64 "%c" (with n being the width) corresponding to one SIZE_AMOUNT macro "argument" (which are actually two arguments), and hides the cast in the latter macro itself. At least the number of format string args and actual args corresponds again then and my eyes bleed a little less. Still okay? Ciao, Michael. * system.h (PRsa): New macro. (SIZE_AMOUNT): Cast number to uint64_t. * alloc-pool.h (pool_usage::dump): Don't use %zu but PRsa. (pool_usage::dump_footer): Likewise and also use PRIu64. * bitmap.h (bitmap_usage::dump): Likewise. * ggc-common.c (ggc_usage::dump): Likewise. * ggc-page.c (ggc_print_statistics): Likewise. * mem-stats.h (mem_usage::dump): Likewise. (mem_usage::dump_footer): Likewise. * rtl.c (dump_rtx_statistics): Likewise. * vec.c (vec_usage::dump): Likewise. (vec_usage::dump_footer): Likewise. diff --git a/gcc/alloc-pool.h b/gcc/alloc-pool.h index d17a05ca4fb1..81cb69e227ab 100644 --- a/gcc/alloc-pool.h +++ b/gcc/alloc-pool.h @@ -63,8 +63,8 @@ struct pool_usage: public mem_usage { char *location_string = loc->to_string (); -fprintf (stderr, "%-32s%-48s %5zu%c%9zu%c:%5.1f%%%9zu" -"%c%9zu%c:%5.1f%%%12zu\n", +fprintf (stderr, "%-32s%-48s " PRsa(5) PRsa(9) ":%5.1f%%" +PRsa(9) PRsa(9) ":%5.1f%%%12" PRIu64 "\n", m_pool_name, location_string, SIZE_AMOUNT (m_instances), SIZE_AMOUNT (m_allocated), @@ -72,7 +72,7 @@ struct pool_usage: public mem_usage SIZE_AMOUNT (m_peak), SIZE_AMOUNT (m_times), get_percent (m_times, total.m_times), -m_element_size); +(uint64_t)m_element_size); free (location_string); } @@ -91,7 +91,7 @@ struct pool_usage: public mem_usage dump_footer () { print_dash_line (); -fprintf (stderr, "%s%82zu%c%10zu%c\n", "Total", +fprintf (stderr, "%s" PRsa(82) PRsa(10) "\n", "Total", SIZE_AMOUNT (m_instances), SIZE_AMOUNT (m_allocated)); print_dash_line (); } diff --git a/gcc/bitmap.h b/gcc/bitmap.h index 973ea846baf1..9a180daa7454 100644 --- a/gcc/bitmap.h +++ b/gcc/bitmap.h @@ -239,9 +239,9 @@ struct bitmap_usage: public mem_usage { char *location_string = loc->to_string (); -fprintf (stderr, "%-48s %9zu%c:%5.1f%%" -"%9zu%c%9zu%c:%5.1f%%" -"%11" PRIu64 "%c%11" PRIu64 "%c%10s\n", +fprintf (stderr, "%-48s " PRsa (9) ":%5.1f%%" +PRsa (9) PRsa (9) ":%5.1f%%" +PRsa (11) PRsa (11) "%10s\n", location_string, SIZE_AMOUNT (m_allocated), get_percent (m_allocated, total.m_allocated), SIZE_AMOUNT (m_peak), SIZE_AMOUNT (m_times), diff --git a/gcc/ggc-common.c b/gcc/ggc-common.c index 9fdba23ce4c2..c989fb01e669 100644 --- a/gcc/ggc-common.c +++ b/gcc/ggc-common.c @@ -884,8 +884,8 @@ struct ggc_usage: public mem_usage { size_t balance = get_balance (); fprintf (stderr, -"%-48s %9zu%c:%5.1f%%%9zu%c:%5.1f%%" -"%9zu%c:%5.1f%%%9zu%c:%5.1f%%%9zu%c\n", +"%-48s " PRsa (9) ":%5.1f%%" PRsa (9) ":%5.1f%%" +PRsa (9) ":%5.1f%%" PRsa (9) ":%5.1f%%" PRsa (9) "\n", prefix, SIZE_AMOUNT (m_collected), get_percent (m_collected, total.m_collected), SIZE_AMOUNT (m_freed), get_percent (m_freed, total.m_freed), diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c index 00c2864711f0..f04b22ca8cca 100644 --- a/gcc/ggc-page.c +++ b/gcc/ggc-page.c @@ -2288,14 +2288,15 @@ ggc_print_statistics (void) overhead += (sizeof (page_entry) - sizeof (long) + BITMAP_SIZE (OBJECTS_IN_PAGE (p) + 1)); } - fprintf (stderr, "%-8zu %10zu%c %10zu%c %10zu%c\n", - OBJECT_SIZE (i), + fprintf (stderr, "%-8" PRIu64 " " PRsa (10) " " PRsa (10) " " + PRsa (10) "\n", + (uint64_t)OBJECT_SIZE (i), SIZE_AMOUNT (allocated), SIZE_AMOUNT (in_use), SIZE_AMOUNT (overhead)); total_overhead += overhead; } - fprintf (stderr, "%-8s %10zu%c %10zu%c %10zu%c\n", + fprintf (stderr, "%-8s " PRsa (10) " " PRsa (10) " " PRsa (10) "\n", "Total", SIZE_AMOUNT (G.bytes_mapped), SIZE_AMOUNT (G.allocated), @@ -2306,42 +2307,42 @@ ggc_print_statistics (void) fprintf (stderr, "\nTotal allocations and overheads during " "the compilation process\n");
[C++ Patch] Fix ensure_literal_type_for_constexpr_object locations
Hi, this one should be straightforward. Tested x86_64-linux. Thanks, Paolo. / /cp 2018-11-15 Paolo Carlini * constexpr.c (ensure_literal_type_for_constexpr_object): Use DECL_SOURCE_LOCATION in error_at calls. /testsuite 2018-11-15 Paolo Carlini * g++.dg/cpp0x/constexpr-diag3.C: Check locations too. * g++.dg/cpp0x/constexpr-ice19.C: Likewise. * g++.dg/cpp0x/constexpr-nonlit2.C: Likewise. * g++.dg/cpp1z/constexpr-lambda15.C: Likewise. * g++.dg/ext/constexpr-vla5.C: Likewise. * g++.dg/gomp/pr85134.C: Likewise. Index: cp/constexpr.c === --- cp/constexpr.c (revision 266174) +++ cp/constexpr.c (working copy) @@ -98,8 +98,9 @@ ensure_literal_type_for_constexpr_object (tree dec if (DECL_DECLARED_CONSTEXPR_P (decl)) { auto_diagnostic_group d; - error ("the type %qT of % variable %qD " -"is not literal", type, decl); + error_at (DECL_SOURCE_LOCATION (decl), + "the type %qT of % variable %qD " + "is not literal", type, decl); explain_non_literal_class (type); decl = error_mark_node; } @@ -108,8 +109,9 @@ ensure_literal_type_for_constexpr_object (tree dec if (!is_instantiation_of_constexpr (current_function_decl)) { auto_diagnostic_group d; - error ("variable %qD of non-literal type %qT in % " -"function", decl, type); + error_at (DECL_SOURCE_LOCATION (decl), + "variable %qD of non-literal type %qT in " + "% function", decl, type); explain_non_literal_class (type); decl = error_mark_node; } @@ -119,8 +121,9 @@ ensure_literal_type_for_constexpr_object (tree dec else if (DECL_DECLARED_CONSTEXPR_P (decl) && variably_modified_type_p (type, NULL_TREE)) { - error ("% variable %qD has variably-modified type %qT", -decl, type); + error_at (DECL_SOURCE_LOCATION (decl), + "% variable %qD has variably-modified " + "type %qT", decl, type); decl = error_mark_node; } } Index: testsuite/g++.dg/cpp0x/constexpr-diag3.C === --- testsuite/g++.dg/cpp0x/constexpr-diag3.C(revision 266174) +++ testsuite/g++.dg/cpp0x/constexpr-diag3.C(working copy) @@ -24,7 +24,7 @@ struct complex// { dg-message "no .constexpr. double im; }; -constexpr complex co1(0, 1); // { dg-error "not literal" } +constexpr complex co1(0, 1); // { dg-error "19:the type .const complex. of .constexpr. variable .co1. is not literal" } constexpr double dd2 = co1.real(); // { dg-error "|in .constexpr. expansion of " } // Index: testsuite/g++.dg/cpp0x/constexpr-ice19.C === --- testsuite/g++.dg/cpp0x/constexpr-ice19.C(revision 266174) +++ testsuite/g++.dg/cpp0x/constexpr-ice19.C(working copy) @@ -9,5 +9,6 @@ struct A struct B { - static constexpr A a {}; // { dg-error "not literal|in-class initialization" } + static constexpr A a {}; // { dg-error "22:the type .const A. of .constexpr. variable .B::a. is not literal" } +// { dg-error "in-class initialization" "" { target c++11 } .-1 } }; Index: testsuite/g++.dg/cpp0x/constexpr-nonlit2.C === --- testsuite/g++.dg/cpp0x/constexpr-nonlit2.C (revision 266174) +++ testsuite/g++.dg/cpp0x/constexpr-nonlit2.C (working copy) @@ -16,4 +16,4 @@ template constexpr W make_w(T& w) { return W(w); } A a; -constexpr auto w = make_w(a); // { dg-error "" } +constexpr auto w = make_w(a); // { dg-error "16:the type .const W. of .constexpr. variable .w. is not literal" } Index: testsuite/g++.dg/cpp1z/constexpr-lambda15.C === --- testsuite/g++.dg/cpp1z/constexpr-lambda15.C (revision 266174) +++ testsuite/g++.dg/cpp1z/constexpr-lambda15.C (working copy) @@ -3,7 +3,7 @@ struct S { constexpr S(int i) { -auto f = [i]{};// { dg-error "literal" "" { target c++14_only } } +auto f = [i]{};// { dg-error "10:variable .f. of non-literal type" "" { target c++14_only } } } }; int main() {} Index: testsuite/g++.dg/ext/constexpr-vla5.C === --- testsuite/g++.dg/ext/constexpr-vla5.C (revision 266174) +++ testsuite/g++.dg/ext/constexpr-vla5.C (working copy) @@ -3,5 +3,6 @@ void foo(int i) { - constexpr char x[i] = "";// { dg-error
[PATCH] Fix PR88031
With one of my last changes we regressed here so this goes all the way cleaning up things so we only have a single flag to vectorizable_condition whetehr we are called from reduction context. In theory the !multiple-types restriction could be easily lifted now (just remove the check). Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2018-11-15 Richard Biener PR tree-optimization/88031 * tree-vect-loop.c (vectorizable_reduction): Move check for multiple types earlier so we get the expected dump. Simplify calls to vectorizable_condition. * tree-vect-stmts.h (vectorizable_condition): Update prototype. * tree-vect-stmts.c (vectorizable_condition): Instead of reduc_def and reduc_index take just a flag. Simplify code-generation now that we can rely on the defs being set up. (vectorizable_comparison): Remove unused argument. * gcc.dg/pr88031.c: New testcase. diff --git a/gcc/testsuite/gcc.dg/pr88031.c b/gcc/testsuite/gcc.dg/pr88031.c new file mode 100644 index 000..2c1d1c1391d --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr88031.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O3" } */ + +int a[512]; +int b; +void d() +{ + unsigned char c; + for (; b; b++) { + c = 1; + for (; c; c <<= 1) { + a[b] <<= 8; + if (b & c) + a[b] = 1; + } + } +} diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index eb01acdf717..88b980bb9d7 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -6531,14 +6531,24 @@ vectorizable_reduction (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, double_reduc = true; } + vect_reduction_type reduction_type += STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info); + if ((double_reduc || reduction_type != TREE_CODE_REDUCTION) + && ncopies > 1) +{ + if (dump_enabled_p ()) + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, +"multiple types in double reduction or condition " +"reduction.\n"); + return false; +} + if (code == COND_EXPR) { /* Only call during the analysis stage, otherwise we'll lose -STMT_VINFO_TYPE. We'll pass ops[0] as reduc_op, it's only -used as a flag during analysis. */ +STMT_VINFO_TYPE. */ if (!vec_stmt && !vectorizable_condition (stmt_info, gsi, NULL, - ops[0], 0, NULL, - cost_vec)) + true, NULL, cost_vec)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6638,8 +6648,6 @@ vectorizable_reduction (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, (and also the same tree-code) when generating the epilog code and when generating the code inside the loop. */ - vect_reduction_type reduction_type -= STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info); if (orig_stmt_info && (reduction_type == TREE_CODE_REDUCTION || reduction_type == FOLD_LEFT_REDUCTION)) @@ -6729,16 +6737,6 @@ vectorizable_reduction (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, return false; } - if ((double_reduc || reduction_type != TREE_CODE_REDUCTION) - && ncopies > 1) -{ - if (dump_enabled_p ()) - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, -"multiple types in double reduction or condition " -"reduction.\n"); - return false; -} - /* For SLP reductions, see if there is a neutral value we can use. */ tree neutral_op = NULL_TREE; if (slp_node) @@ -7003,7 +7001,7 @@ vectorizable_reduction (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, { gcc_assert (!slp_node); return vectorizable_condition (stmt_info, gsi, vec_stmt, -NULL, reduc_index, NULL, NULL); +true, NULL, NULL); } /* Create the destination vector */ @@ -7035,9 +7033,7 @@ vectorizable_reduction (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, { gcc_assert (!slp_node); vectorizable_condition (stmt_info, gsi, vec_stmt, - PHI_RESULT (phis[0]->stmt), - reduc_index, NULL, NULL); - /* Multiple types are not supported for condition. */ + true, NULL, NULL); break; } if (code == LSHIFT_EXPR diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index 74646570e2a..a67a7f4e348 100644 --- a/gcc/tree-vect-stmts.c +++ b/gcc/tree-vect-stmts.c @@ -8675,17 +8675,14 @@ vect_is_simple_cond (tree cond, vec_info *vinfo, stmt using VEC_COND_EXPR to replace it, put it in VEC_STMT, and insert it at GSI.
[committed] graphite: add missing dump_enabled_p checks (PR tree-optimization/88015)
As of r266080, calls to dump_print* that aren't guarded by if (dump_enabled_p ()) lead to an assertion failure. This patch fixes a few calls that weren't guarded, avoiding a call to find_loop_location for the no-dumping case. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Committed to trunk as r266184, under the "obvious" rule. gcc/ChangeLog: PR tree-optimization/88015 * graphite-isl-ast-to-gimple.c (translate_isl_ast_to_gimple::scop_to_isl_ast): Add missing check for dump_enabled_p. * graphite-sese-to-poly.c (build_poly_scop): Likewise. --- gcc/graphite-isl-ast-to-gimple.c | 25 ++--- gcc/graphite-sese-to-poly.c | 3 ++- 2 files changed, 16 insertions(+), 12 deletions(-) diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c index e2dbf6f..cc93fb9 100644 --- a/gcc/graphite-isl-ast-to-gimple.c +++ b/gcc/graphite-isl-ast-to-gimple.c @@ -1411,17 +1411,20 @@ scop_to_isl_ast (scop_p scop) isl_ctx_set_max_operations (scop->isl_context, old_max_operations); if (isl_ctx_last_error (scop->isl_context) != isl_error_none) { - dump_user_location_t loc = find_loop_location - (scop->scop_info->region.entry->dest->loop_father); - if (isl_ctx_last_error (scop->isl_context) == isl_error_quota) - dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc, -"loop nest not optimized, AST generation timed out " -"after %d operations [--param max-isl-operations]\n", -max_operations); - else - dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc, -"loop nest not optimized, ISL AST generation " -"signalled an error\n"); + if (dump_enabled_p ()) + { + dump_user_location_t loc = find_loop_location + (scop->scop_info->region.entry->dest->loop_father); + if (isl_ctx_last_error (scop->isl_context) == isl_error_quota) + dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc, +"loop nest not optimized, AST generation timed out " +"after %d operations [--param max-isl-operations]\n", +max_operations); + else + dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc, +"loop nest not optimized, ISL AST generation " +"signalled an error\n"); + } isl_ast_node_free (ast_isl); return NULL; } diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c index 69898d4..1d41cff 100644 --- a/gcc/graphite-sese-to-poly.c +++ b/gcc/graphite-sese-to-poly.c @@ -1218,7 +1218,8 @@ build_poly_scop (scop_p scop) enum isl_error err = isl_ctx_last_error (scop->isl_context); isl_ctx_reset_error (scop->isl_context); isl_options_set_on_error (scop->isl_context, old_err); - if (err != isl_error_none) + if (err != isl_error_none + && dump_enabled_p ()) dump_printf (MSG_MISSED_OPTIMIZATION, "ISL error while building poly scop\n"); -- 1.8.5.3
Re: [C++ Patch] Fix ensure_literal_type_for_constexpr_object locations
On Thu, Nov 15, 2018 at 02:39:14PM +0100, Paolo Carlini wrote: > Hi, > > this one should be straightforward. Tested x86_64-linux. > > Thanks, Paolo. > > / > > /cp > 2018-11-15 Paolo Carlini > > * constexpr.c (ensure_literal_type_for_constexpr_object): Use > DECL_SOURCE_LOCATION in error_at calls. > > /testsuite > 2018-11-15 Paolo Carlini > > * g++.dg/cpp0x/constexpr-diag3.C: Check locations too. > * g++.dg/cpp0x/constexpr-ice19.C: Likewise. > * g++.dg/cpp0x/constexpr-nonlit2.C: Likewise. > * g++.dg/cpp1z/constexpr-lambda15.C: Likewise. > * g++.dg/ext/constexpr-vla5.C: Likewise. > * g++.dg/gomp/pr85134.C: Likewise. > Index: cp/constexpr.c > === > --- cp/constexpr.c(revision 266174) > +++ cp/constexpr.c(working copy) > @@ -98,8 +98,9 @@ ensure_literal_type_for_constexpr_object (tree dec > if (DECL_DECLARED_CONSTEXPR_P (decl)) > { > auto_diagnostic_group d; > - error ("the type %qT of % variable %qD " > - "is not literal", type, decl); > + error_at (DECL_SOURCE_LOCATION (decl), > + "the type %qT of % variable %qD " > + "is not literal", type, decl); > explain_non_literal_class (type); > decl = error_mark_node; > } > @@ -108,8 +109,9 @@ ensure_literal_type_for_constexpr_object (tree dec > if (!is_instantiation_of_constexpr (current_function_decl)) > { > auto_diagnostic_group d; > - error ("variable %qD of non-literal type %qT in % > " > - "function", decl, type); > + error_at (DECL_SOURCE_LOCATION (decl), > + "variable %qD of non-literal type %qT in " > + "% function", decl, type); > explain_non_literal_class (type); > decl = error_mark_node; > } > @@ -119,8 +121,9 @@ ensure_literal_type_for_constexpr_object (tree dec >else if (DECL_DECLARED_CONSTEXPR_P (decl) > && variably_modified_type_p (type, NULL_TREE)) > { > - error ("% variable %qD has variably-modified type %qT", > - decl, type); > + error_at (DECL_SOURCE_LOCATION (decl), > + "% variable %qD has variably-modified " > + "type %qT", decl, type); > decl = error_mark_node; How about using location_of instead of DECL_SOURCE_LOCATION? Marek
[committed] v2: Machine-readable diagnostic output (PR other/19165)
Changed in v2: * added test coverage * improved docs * pass orig_diag_kind to the finalizer callback Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Committed to trunk as r266186. Blurb from v1: This patch implements a -fdiagnostics-format=json option which converts the diagnostics to be output to stderr in a JSON format; see the documentation in invoke.texi. Logically-related diagnostics are nested at the JSON level, using the auto_diagnostic_group mechanism. gcc/ChangeLog: PR other/19165 * Makefile.in (OBJS): Move json.o to... (OBJS-libcommon): ...here and add diagnostic-format-json.o. * common.opt (fdiagnostics-format=): New option. (diagnostics_output_format): New enum. * diagnostic-format-json.cc: New file. * diagnostic.c (default_diagnostic_final_cb): New function, taken from start of diagnostic_finish. (diagnostic_initialize): Initialize final_cb to default_diagnostic_final_cb. (diagnostic_finish): Move "being treated as errors" messages to default_diagnostic_final_cb. Call any final_cb. (default_diagnostic_finalizer): Add diagnostic_t param. (diagnostic_report_diagnostic): Pass "orig_diag_kind" to diagnostic_finalizer callback. * diagnostic.h (enum diagnostics_output_format): New enum. (diagnostic_finalizer_fn): Reimplement, adding diagnostic_t param. (struct diagnostic_context): Add "final_cb". (default_diagnostic_finalizer): Add diagnostic_t param. (diagnostic_output_format_init): New decl. * doc/invoke.texi (-fdiagnostics-format): New option. * dwarf2out.c (gen_producer_string): Ignore OPT_fdiagnostics_format_. * gcc.c (driver_handle_option): Handle OPT_fdiagnostics_format_. * lto-wrapper.c (append_diag_options): Ignore it. * opts.c (common_handle_option): Handle it. gcc/c-family/ChangeLog: PR other/19165 * c-opts.c (c_diagnostic_finalizer): Add diagnostic_t param. gcc/fortran/ChangeLog: PR other/19165 * error.c (gfc_diagnostic_finalizer): Add diagnostic_t param. gcc/jit/ChangeLog: PR other/19165 * dummy-frontend.c (jit_begin_diagnostic): Add diagnostic_t param. gcc/testsuite/ChangeLog: PR other/19165 * c-c++-common/diagnostic-format-json-1.c: New test. * c-c++-common/diagnostic-format-json-2.c: New test. * c-c++-common/diagnostic-format-json-3.c: New test. * c-c++-common/diagnostic-format-json-4.c: New test. * c-c++-common/diagnostic-format-json-5.c: New test. * gcc.dg/plugin/diagnostic_plugin_test_show_locus.c (custom_diagnostic_finalizer): Add diagnostic_t param. * gcc.dg/plugin/location_overflow_plugin.c (verify_unpacked_ranges): Likewise. (verify_no_columns): Likewise. * gfortran.dg/diagnostic-format-json-1.F90: New test. * gfortran.dg/diagnostic-format-json-2.F90: New test. * gfortran.dg/diagnostic-format-json-3.F90: New test. --- gcc/Makefile.in| 2 +- gcc/c-family/c-opts.c | 3 +- gcc/common.opt | 17 ++ gcc/diagnostic-format-json.cc | 264 + gcc/diagnostic.c | 45 ++-- gcc/diagnostic.h | 23 +- gcc/doc/invoke.texi| 188 +++ gcc/dwarf2out.c| 1 + gcc/fortran/error.c| 3 +- gcc/gcc.c | 5 + gcc/jit/dummy-frontend.c | 3 +- gcc/lto-wrapper.c | 1 + gcc/opts.c | 5 + .../c-c++-common/diagnostic-format-json-1.c| 25 ++ .../c-c++-common/diagnostic-format-json-2.c| 26 ++ .../c-c++-common/diagnostic-format-json-3.c| 26 ++ .../c-c++-common/diagnostic-format-json-4.c| 55 + .../c-c++-common/diagnostic-format-json-5.c| 46 .../plugin/diagnostic_plugin_test_show_locus.c | 3 +- .../gcc.dg/plugin/location_overflow_plugin.c | 10 +- .../gfortran.dg/diagnostic-format-json-1.F90 | 25 ++ .../gfortran.dg/diagnostic-format-json-2.F90 | 26 ++ .../gfortran.dg/diagnostic-format-json-3.F90 | 26 ++ 23 files changed, 800 insertions(+), 28 deletions(-) create mode 100644 gcc/diagnostic-format-json.cc create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-json-1.c create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-json-2.c create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-json-3.c create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-json-4.c create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-json-5.
Re: [C++ Patch] Fix ensure_literal_type_for_constexpr_object locations
Hi, On 15/11/18 15:34, Marek Polacek wrote: How about using location_of instead of DECL_SOURCE_LOCATION? Well, we know that VAR_P is true of that DECL tree node, thus I don't see how location_of would be better. Certainly is more complex. Paolo.
Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf
> On Nov 14, 2018, at 5:19 PM, Jozef Lawrynowicz > wrote: > > On Wed, 14 Nov 2018 11:30:39 -0500 > Paul Koning wrote: > >>> On Nov 14, 2018, at 10:44 AM, Jozef Lawrynowicz >>> wrote: >>> >>> Patch 1 tweaks dg directives in tests specifically for msp430. Many of >>> these are extensions to existing target selectors in dg directives. >>> >>> <0001-TESTSUITE-MSP430-Tweak-dg-directives-for-msp430-elf.patch> >> >> For pr41779.c, you have >> >> +/* { dg-skip-if "int is smaller than float" { msp430-*-* } } */ >> >> I take it that means: sizeof(int) < sizeof(float). That property also holds >> for pdp11 and perhaps other targets. Would it make sense to introduce a new >> effective-target flag for that check instead? >> >> paul >> > > Paul, > > Yes you are correct the comment implies sizeof(int) < sizeof(float). > > I believe this was the only test where this property affected the test > results, so a new effective-target flag is probably only worth adding if it > affects at least a couple of tests. > On the other hand, I suppose there is no harm in adding another > check-effective-target and it at least means we'll catch failures across more > targets. > > I'd be curious if the line I added the xfail to in c-c++-common/pr57371-2.c > also fails for pdp11. > > The conversion to float might be getting optimized out whenever > sizeof(int) < sizeof(float). > > Thanks, > Jozef Yes, that test on pr57371-2.c also fails on pdp11. paul
Re: Don't use %z printf format length specified
On Thu, Nov 15, 2018 at 2:31 PM Michael Matz wrote: > > Hi, > > On Thu, 15 Nov 2018, Richard Biener wrote: > > > > Okay, probably. Then consider the same patch with sprinkling casts to > > > long for all these arguments (the actual numbers will fit that type in > > > reality). I could of course also use PRIu64 but that makes my eyes > > > bleed. > > > > Hmm. Can you use PRIu64 please and cast to uint64_t? OK with that > > change. > > Instead I'd like to check in this. It adds a new macro expanding to > "%" #n PRIu64 "%c" > (with n being the width) corresponding to one SIZE_AMOUNT macro "argument" > (which are actually two arguments), and hides the cast in the latter macro > itself. At least the number of format string args and actual args > corresponds again then and my eyes bleed a little less. Still okay? Works for me. Richard. > > Ciao, > Michael. > > * system.h (PRsa): New macro. > (SIZE_AMOUNT): Cast number to uint64_t. > * alloc-pool.h (pool_usage::dump): Don't use %zu but PRsa. > (pool_usage::dump_footer): Likewise and also use PRIu64. > * bitmap.h (bitmap_usage::dump): Likewise. > * ggc-common.c (ggc_usage::dump): Likewise. > * ggc-page.c (ggc_print_statistics): Likewise. > * mem-stats.h (mem_usage::dump): Likewise. > (mem_usage::dump_footer): Likewise. > * rtl.c (dump_rtx_statistics): Likewise. > * vec.c (vec_usage::dump): Likewise. > (vec_usage::dump_footer): Likewise. > > diff --git a/gcc/alloc-pool.h b/gcc/alloc-pool.h > index d17a05ca4fb1..81cb69e227ab 100644 > --- a/gcc/alloc-pool.h > +++ b/gcc/alloc-pool.h > @@ -63,8 +63,8 @@ struct pool_usage: public mem_usage >{ > char *location_string = loc->to_string (); > > -fprintf (stderr, "%-32s%-48s %5zu%c%9zu%c:%5.1f%%%9zu" > -"%c%9zu%c:%5.1f%%%12zu\n", > +fprintf (stderr, "%-32s%-48s " PRsa(5) PRsa(9) ":%5.1f%%" > +PRsa(9) PRsa(9) ":%5.1f%%%12" PRIu64 "\n", > m_pool_name, location_string, > SIZE_AMOUNT (m_instances), > SIZE_AMOUNT (m_allocated), > @@ -72,7 +72,7 @@ struct pool_usage: public mem_usage > SIZE_AMOUNT (m_peak), > SIZE_AMOUNT (m_times), > get_percent (m_times, total.m_times), > -m_element_size); > +(uint64_t)m_element_size); > > free (location_string); >} > @@ -91,7 +91,7 @@ struct pool_usage: public mem_usage >dump_footer () >{ > print_dash_line (); > -fprintf (stderr, "%s%82zu%c%10zu%c\n", "Total", > +fprintf (stderr, "%s" PRsa(82) PRsa(10) "\n", "Total", > SIZE_AMOUNT (m_instances), SIZE_AMOUNT (m_allocated)); > print_dash_line (); >} > diff --git a/gcc/bitmap.h b/gcc/bitmap.h > index 973ea846baf1..9a180daa7454 100644 > --- a/gcc/bitmap.h > +++ b/gcc/bitmap.h > @@ -239,9 +239,9 @@ struct bitmap_usage: public mem_usage >{ > char *location_string = loc->to_string (); > > -fprintf (stderr, "%-48s %9zu%c:%5.1f%%" > -"%9zu%c%9zu%c:%5.1f%%" > -"%11" PRIu64 "%c%11" PRIu64 "%c%10s\n", > +fprintf (stderr, "%-48s " PRsa (9) ":%5.1f%%" > +PRsa (9) PRsa (9) ":%5.1f%%" > +PRsa (11) PRsa (11) "%10s\n", > location_string, SIZE_AMOUNT (m_allocated), > get_percent (m_allocated, total.m_allocated), > SIZE_AMOUNT (m_peak), SIZE_AMOUNT (m_times), > diff --git a/gcc/ggc-common.c b/gcc/ggc-common.c > index 9fdba23ce4c2..c989fb01e669 100644 > --- a/gcc/ggc-common.c > +++ b/gcc/ggc-common.c > @@ -884,8 +884,8 @@ struct ggc_usage: public mem_usage >{ > size_t balance = get_balance (); > fprintf (stderr, > -"%-48s %9zu%c:%5.1f%%%9zu%c:%5.1f%%" > -"%9zu%c:%5.1f%%%9zu%c:%5.1f%%%9zu%c\n", > +"%-48s " PRsa (9) ":%5.1f%%" PRsa (9) ":%5.1f%%" > +PRsa (9) ":%5.1f%%" PRsa (9) ":%5.1f%%" PRsa (9) "\n", > prefix, SIZE_AMOUNT (m_collected), > get_percent (m_collected, total.m_collected), > SIZE_AMOUNT (m_freed), get_percent (m_freed, total.m_freed), > diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c > index 00c2864711f0..f04b22ca8cca 100644 > --- a/gcc/ggc-page.c > +++ b/gcc/ggc-page.c > @@ -2288,14 +2288,15 @@ ggc_print_statistics (void) > overhead += (sizeof (page_entry) - sizeof (long) >+ BITMAP_SIZE (OBJECTS_IN_PAGE (p) + 1)); > } > - fprintf (stderr, "%-8zu %10zu%c %10zu%c %10zu%c\n", > - OBJECT_SIZE (i), > + fprintf (stderr, "%-8" PRIu64 " " PRsa (10) " " PRsa (10) " " > + PRsa (10) "\n", > + (uint64_t)OBJECT_SIZE (i), >SIZE_AMOUNT (allocated), >SIZE_AMOUNT (in_use), >SIZE_AMOUNT (overhead)); >total_overhead += overhead; > } > - fprintf (stderr, "%-8s %10zu%c %10zu%c %10zu%c\n", > + fprintf (stderr, "%-8s " PRsa (10)
[committed] doc/ux.texi: auto_diagnostic_group now does something
As of r266186, auto_diagnostic_group is now user-visible (via -fdiagnostics-format=json), so update the diagnostic guidelines accordingly. Committed to trunk as r266187. gcc/ChangeLog: * doc/ux.texi (Group logically-related diagnostics): Move discussion of auto_diagnostic_group into this new subsection. Give an example of where this grouping is used. --- gcc/doc/ux.texi | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/gcc/doc/ux.texi b/gcc/doc/ux.texi index 3b395f2..47b4492 100644 --- a/gcc/doc/ux.texi +++ b/gcc/doc/ux.texi @@ -376,12 +376,15 @@ where the @code{double} and @code{int} are colorized to highlight them. @c %H and %I were added in r248698. +@subsection Group logically-related diagnostics + Use @code{auto_diagnostic_group} when issuing multiple related diagnostics (seen in various examples on this page). This informs the diagnostic subsystem that all diagnostics issued within the lifetime -of the @code{auto_diagnostic_group} are related. (Currently it doesn't -do anything with this information, but we may implement that in the -future). +of the @code{auto_diagnostic_group} are related. For example, +@option{-fdiagnostics-format=json} will treat the first diagnostic +emitted within the group as a top-level diagnostic, and all subsequent +diagnostics within the group as its children. @subsection Quoting Text should be quoted by either using the @samp{q} modifier in a directive -- 1.8.5.3
Re: [PATCH, csky] Update dynamic linker'name
Hi, GNU_USER_DYNAMIC_LINKER is defined in linux.h, it will expand to GLIBC_DYNAMIC_LINKER when configured with glibc > 在 2018年11月15日,下午6:30,Richard Biener 写道: > > On Thu, Nov 15, 2018 at 7:02 AM 瞿仙淼 wrote: >> >> Hi, >>I have submitted a patch to update dynamic linker'name >> >> >> Index: gcc/ChangeLog >> === >> --- gcc/ChangeLog (revision 266171) >> +++ gcc/ChangeLog (working copy) >> @@ -1,3 +1,9 @@ >> +2018-11-15 Xianmiao Qu >> + >> + * config/csky/csky-linux-elf.h (LINUX_DYNAMIC_LINKER): Remove. >> + (GLIBC_DYNAMIC_LINKER): Define. >> + (LINUX_TARGET_LINK_SPEC): Update the dynamic linker's name. >> + >> 2018-11-15 Bin Cheng >> >>PR tree-optimization/84648 >> Index: gcc/config/csky/csky-linux-elf.h >> === >> --- gcc/config/csky/csky-linux-elf.h(revision 266171) >> +++ gcc/config/csky/csky-linux-elf.h(working copy) >> @@ -61,7 +61,7 @@ >> %{mvdsp:-mvdsp} \ >> " >> >> -#define LINUX_DYNAMIC_LINKER "/lib/ld.so.1" >> +#define GLIBC_DYNAMIC_LINKER >> "/lib/ld-linux-cskyv2%{mhard-float:-hf}%{mbig-endian:-be}.so.1" >> >> #define LINUX_TARGET_LINK_SPEC "%{h*} %{version:-v}\ >>%{b}\ >> @@ -70,7 +70,7 @@ >>%{symbolic:-Bsymbolic} \ >>%{!static: \ >> %{rdynamic:-export-dynamic} \ >> - %{!shared:-dynamic-linker " LINUX_DYNAMIC_LINKER "}} \ >> + %{!shared:-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}} \ > > ^^^ GNU_USER_DYNAMIC_LINKER vs. GLIBC_DYNAMIC_LINKER > >>-X \ >>%{mbig-endian:-EB} %{mlittle-endian:-EL}\ >>%{EB:-EB} %{EL:-EL}" >> Index: libgcc/ChangeLog >> === >> --- libgcc/ChangeLog(revision 266171) >> +++ libgcc/ChangeLog(working copy) >> @@ -1,3 +1,7 @@ >> +2018-11-15 Xianmiao Qu >> + >> + * config/csky/linux-unwind.h: Fix coding style. >> + >> 2018-11-13 Xianmiao Qu >> >>* config/csky/linux-unwind.h (_sig_ucontext_t): Remove. >> Index: libgcc/config/csky/linux-unwind.h >> === >> --- libgcc/config/csky/linux-unwind.h (revision 266171) >> +++ libgcc/config/csky/linux-unwind.h (working copy) >> @@ -25,10 +25,8 @@ >> >> #ifndef inhibit_libc >> >> -/* >> - * Do code reading to identify a signal frame, and set the frame state data >> - * appropriately. See unwind-dw2.c for the structs. >> - */ >> +/* Do code reading to identify a signal frame, and set the frame state data >> + appropriately. See unwind-dw2.c for the structs. */ >> >> #include >> #include
[C++ DR 2336] virtual dtors, exception specs & abstract classes
This patch implements dr2336, a fix to dr1658 dr1658 made virtual bases of abstract classes ignored for synthesize cdtors -- there are no complete objects of such type, so the vbases will never be cdtored by such cdtors. Except when virtual dtors are in play, such a dtor could override a virtual dtor in a virtual base, and thus must not have a stricter exception specification. DR2336 addresses that. I had nearly anticipated this resolution, but I was also looking into virtual bases in other cases, and ignoring access in different other cases. This implements the DR2336 wording (it does not distinguish between virtual and non-virtual dtors of virtual bases). booted on x6_64-linux, committing to trunk. nathan -- Nathan Sidwell 2018-11-15 Nathan Sidwell DR 2336 * cp-tree.h (enum special_function_kind): Add sfk_virtual_destructor. * method.c (type_has_trivial_fn): Add it. (SFK_DTOR_P): Likewise. (synthesized_method_base_walk): Don't check access of vbases of abstract classes when sfk_virtual_destructor. (synthesized_method_walk): Skip vbases of abstract classes except when sfk_virtual_destructor. (get_defaulted_eh_spec): Set sfk_virtual_destructor as needed. * g++.dg/cpp1y/pr79393-3.C: New. Index: gcc/cp/cp-tree.h === --- gcc/cp/cp-tree.h (revision 266180) +++ gcc/cp/cp-tree.h (working copy) @@ -5099,7 +5099,8 @@ enum special_function_kind { deletes the object after it has been destroyed. */ sfk_conversion, /* A conversion operator. */ - sfk_deduction_guide /* A class template deduction guide. */ + sfk_deduction_guide, /* A class template deduction guide. */ + sfk_virtual_destructor /* Used by member synthesis fns. */ }; /* The various kinds of linkage. From [basic.link], Index: gcc/cp/method.c === --- gcc/cp/method.c (revision 266180) +++ gcc/cp/method.c (working copy) @@ -402,6 +402,7 @@ type_has_trivial_fn (tree ctype, special case sfk_move_assignment: return !TYPE_HAS_COMPLEX_MOVE_ASSIGN (ctype); case sfk_destructor: +case sfk_virtual_destructor: return !TYPE_HAS_NONTRIVIAL_DESTRUCTOR (ctype); case sfk_inheriting_constructor: return false; @@ -1287,7 +1288,7 @@ process_subob_fn (tree fn, tree *spec_p, #define SFK_CTOR_P(sfk) \ ((sfk) >= sfk_constructor && (sfk) <= sfk_move_constructor) #define SFK_DTOR_P(sfk) \ - ((sfk) == sfk_destructor) + ((sfk) == sfk_destructor || (sfk) == sfk_virtual_destructor) #define SFK_ASSIGN_P(sfk) \ ((sfk) == sfk_copy_assignment || (sfk) == sfk_move_assignment) #define SFK_COPY_P(sfk) \ @@ -1481,12 +1482,11 @@ synthesized_method_base_walk (tree binfo if (flag_new_inheriting_ctors) defer = dk_deferred; } - /* To be conservative, ignore access to the base dtor that - DR1658 instructs us to ignore. See the comment in - synthesized_method_walk. */ - else if (cxx_dialect >= cxx14 && fnname == complete_dtor_identifier + else if (cxx_dialect >= cxx14 && sfk == sfk_virtual_destructor && BINFO_VIRTUAL_P (base_binfo) && ABSTRACT_CLASS_TYPE_P (BINFO_TYPE (binfo))) +/* Don't check access when looking at vbases of abstract class's + virtual destructor. */ defer = dk_no_check; if (defer != dk_no_deferred) @@ -1572,7 +1572,7 @@ synthesized_method_walk (tree ctype, spe bool check_vdtor = false; tree fnname; -if (SFK_DTOR_P (sfk)) + if (SFK_DTOR_P (sfk)) { check_vdtor = true; /* The synthesized method will call base dtors, but check complete @@ -1696,12 +1696,11 @@ if (SFK_DTOR_P (sfk)) else if (vec_safe_is_empty (vbases)) /* No virtual bases to worry about. */; else if (ABSTRACT_CLASS_TYPE_P (ctype) && cxx_dialect >= cxx14 - /* DR 1658 specifies that vbases of abstract classes are - ignored for both ctors and dtors. However, that breaks - virtual dtor overriding when the ignored base has a - throwing destructor. So, ignore that piece of 1658. A - defect has been filed (no number yet). */ - && sfk != sfk_destructor) + /* DR 1658 specifis that vbases of abstract classes are + ignored for both ctors and dtors. Except DR 2338 + overrides that skipping when determing the eh-spec of a + virtual destructor. */ + && sfk != sfk_virtual_destructor) /* Vbase cdtors are not relevant. */; else { @@ -1748,6 +1747,9 @@ get_defaulted_eh_spec (tree decl, tsubst tree spec = empty_except_spec; bool diag = !DECL_DELETED_FN (decl) && (complain & tf_error); tree inh = DECL_INHERITED_CTOR (decl); + if (SFK_DTOR_P (sfk) && DECL_VIRTUAL_P (decl)) +/* We have to examine virtual bases even if abstract. */ +sfk = sfk_virtual_destructor; synthesized_method_walk (ctype, sfk, const_p, &spec, NULL, NULL, NULL, diag, &inh, parms); return spec; Index: gcc/testsuite/g++.
Re: [PATCH] diagnose unsupported uses of hardware register variables (PR 88000)
Hi, On Wed, 14 Nov 2018, Alexander Monakov wrote: > On Wed, 14 Nov 2018, Segher Boessenkool wrote: > > Yeah, using local register vars to access global registers only works > > by accident. It does work currently though, and people apparently use > > it, so we shouldn't break it :-/ > > In the proposed approach (copying from/to pseudos just before/after the > asm) we can emulate historic behavior by making uninitialized pseudos > take values of the corresponding hardregs. If we decide that doing that > just for must-uninit pseudos is enough, it's a simple extension to the > existing init-regs pass. Does this sound reasonable? Or we can just decide that nothing needs fixing. In particular about this from Jakub and Segher: > > What doesn't work as the reporter expect is assumption that local hard > > register variables that live across function calls must have their > > values preserved; they can be modified by the callees. > > It would be really nice if we could fix that :-) I disagree that there's something to fix. My mental model for local reg vars has always been that such vars are actually an alias for that register, not comparable to normal automatic variables (so, much like global reg vars, except that they don't reserve away the register from regalloc). I.e. like volatile they can arbitrarily change their value. I don't know if other peoples mind model is similar, but it certainly is the model that is implemented in the GIMPLE pipeline (and if memory serves well of the RTL pipeline as well). Copying outof/into pseudos around asms serves no purpose with that model, it rather makes regvars somewhat "safer" to use, without actually making them safer (there _is_ nothing safe about them). Ciao, Michael.
Re: Bug 52869 - [DR 1207] "this" not being allowed in noexcept clauses
On Thu, Nov 15, 2018 at 02:26:24PM +0530, Umesh Kalappa wrote: > Thank you Marek for the inputs . > >>In the future, if using diff, please also use the -p option. > We are using svn diif and other comments are addressed . Thanks, but it doesn't seem like the -p option was used. > please let us know your take on the revised attached patch . Index: cp/ChangeLog === --- cp/ChangeLog(revision 266026) +++ cp/ChangeLog(working copy) @@ -1,3 +1,9 @@ +2018-11-14 Kamlesh Kumar + + PR c++/52869 + *parser.c () : restore the old current_class_{ptr,ref} by + inject_this_parameter(). + This is still the same; can you adjust it according to my last suggestion? Index: cp/parser.c === --- cp/parser.c (revision 266026) +++ cp/parser.c (working copy) @@ -24620,6 +24620,12 @@ { matching_parens parens; parens.consume_open (parser); + + tree save_ccp = current_class_ptr; + tree save_ccr = current_class_ref; + Watch out for trailing whitespace in the blank lines. + if (current_class_type) + inject_this_parameter (current_class_type, TYPE_UNQUALIFIED); I think you can remove the if here. if (require_constexpr) { @@ -24640,6 +24646,9 @@ } parens.require_close (parser); + + save_ccp = current_class_ptr = save_ccp; + save_ccr = current_class_ref = save_ccr; You don't need to set save_cc[pr] to itself here. Index: testsuite/ChangeLog === --- testsuite/ChangeLog (revision 266026) +++ testsuite/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2018-11-14 Kamlesh Kumar + + PR c++/52869 + * g++.dg//DRs/dr52869.C: New. + So DR != PR. Please name the test dr1207.C Index: testsuite/g++.dg/DRs/dr52869.C === --- testsuite/g++.dg/DRs/dr52869.C (nonexistent) +++ testsuite/g++.dg/DRs/dr52869.C (working copy) @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-O0 -std=c++11" } */ Instead of this, do: // { dg-do compile { target c++11 } } Also, I wrote a test that fails if current_class_{ptr,ref} aren't properly restored: // DR 1207 // PR c++/52869 // { dg-do compile { target c++11 } } void fn () { struct S { bool operator!() noexcept(false); } s; S t = s; } So you can add that one too, e.g. testsuite/g++.dg/DRs/dr1207-2.C. Thanks, Marek
Re: Bug 52869 - [DR 1207] "this" not being allowed in noexcept clauses
On Thu, Nov 15, 2018 at 10:56:15AM -0500, Marek Polacek wrote: > + if (current_class_type) > + inject_this_parameter (current_class_type, TYPE_UNQUALIFIED); > > I think you can remove the if here. Actually you probably need it after all. Marek
[PATCH][libbacktrace] Factor out read_string
Hi, This patch factors out new function read_string in dwarf.c. Bootstrapped and reg-tested on x86_64. OK for trunk (or, for stage1)? Thanks, - Tom [libbacktrace] Factor out read_string 2018-11-15 Tom de Vries * dwarf.c (read_string): Factor out of ... (read_attribute, read_line_header, read_line_program): ... here. --- libbacktrace/dwarf.c | 39 --- 1 file changed, 28 insertions(+), 11 deletions(-) diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c index 4566d37cf2f..c4f8732c7eb 100644 --- a/libbacktrace/dwarf.c +++ b/libbacktrace/dwarf.c @@ -411,6 +411,25 @@ advance (struct dwarf_buf *buf, size_t count) return 1; } +/* Read one zero-terminated string from BUF and advance past the string. */ + +static const char * +read_string (struct dwarf_buf *buf) +{ + const char *p = (const char *)buf->buf; + size_t len = strnlen (p, buf->left); + + /* - If len == left, we ran out of buffer before finding the zero terminator. + Generate an error by advancing len + 1. + - If len < left, advance by len + 1 to skip past the zero terminator. */ + size_t count = len + 1; + + if (!advance (buf, count)) +return NULL; + + return p; +} + /* Read one byte from BUF and advance 1 byte. */ static unsigned char @@ -694,8 +713,8 @@ read_attribute (enum dwarf_form form, struct dwarf_buf *buf, return 1; case DW_FORM_string: val->encoding = ATTR_VAL_STRING; - val->u.string = (const char *) buf->buf; - return advance (buf, strnlen ((const char *) buf->buf, buf->left) + 1); + val->u.string = read_string (buf); + return val->u.string == NULL ? 0 : 1; case DW_FORM_block: val->encoding = ATTR_VAL_BLOCK; return advance (buf, read_uleb128 (buf)); @@ -1649,11 +1668,10 @@ read_line_header (struct backtrace_state *state, struct unit *u, if (hdr_buf.reported_underflow) return 0; - hdr->dirs[i] = (const char *) hdr_buf.buf; - ++i; - if (!advance (&hdr_buf, - strnlen ((const char *) hdr_buf.buf, hdr_buf.left) + 1)) + hdr->dirs[i] = read_string (&hdr_buf); + if (hdr->dirs[i] == NULL) return 0; + ++i; } if (!advance (&hdr_buf, 1)) return 0; @@ -1687,9 +1705,8 @@ read_line_header (struct backtrace_state *state, struct unit *u, if (hdr_buf.reported_underflow) return 0; - filename = (const char *) hdr_buf.buf; - if (!advance (&hdr_buf, - strnlen ((const char *) hdr_buf.buf, hdr_buf.left) + 1)) + filename = read_string (&hdr_buf); + if (filename == NULL) return 0; dir_index = read_uleb128 (&hdr_buf); if (IS_ABSOLUTE_PATH (filename) @@ -1808,8 +1825,8 @@ read_line_program (struct backtrace_state *state, struct dwarf_data *ddata, const char *f; unsigned int dir_index; - f = (const char *) line_buf->buf; - if (!advance (line_buf, strnlen (f, line_buf->left) + 1)) + f = read_string (line_buf); + if (f == NULL) return 0; dir_index = read_uleb128 (line_buf); /* Ignore that time and length. */
Re: [PATCH] diagnose unsupported uses of hardware register variables (PR 88000)
On Thu, 15 Nov 2018, Michael Matz wrote: > I disagree that there's something to fix. My mental model for local reg > vars has always been that such vars are actually an alias for that > register, not comparable to normal automatic variables (so, much like > global reg vars, except that they don't reserve away the register from > regalloc). I.e. like volatile they can arbitrarily change their value. > I don't know if other peoples mind model is similar, but it certainly is > the model that is implemented in the GIMPLE pipeline (and if memory serves > well of the RTL pipeline as well). Reading the documentation certainly does not make that impression to me. In any case, can you elaborate a bit further please: 1. Regarding the comparison to 'volatile' qualifier. Suppose you have an automatic variable 'int v;' in a correct program. The variable is only used for some arithmetic, never passed to asms, does not have its address taken. Thus, changing its declaration to 'volatile int v;' would not make the program invalid. Now, can declaring it as 'register int v asm("%rbx");' (with a callee-saved register) make the program invalid? My reading of the documentation is that it would provide a regalloc hint and have no ill effects. 2. Are testcases given in PR 87984 valid? Quoting the latest example: int f(void) { int o=0, i; for (i=0; i<3; i++) { register int a asm("eax"); a = 1; asm("add %1, %0" : "+r"(o) : "r"(a)); asm("xor %%eax, %%eax" ::: "eax"); } return o; } This follows both your model and the documentation to the letter, and yet will return 1 rather than 3. I disagree that it is practical to implement your model on GIMPLE. Thanks. Alexander
Re: [C++ Patch] Fix ensure_literal_type_for_constexpr_object locations
OK. On Thu, Nov 15, 2018 at 8:39 AM Paolo Carlini wrote: > > Hi, > > this one should be straightforward. Tested x86_64-linux. > > Thanks, Paolo. > > / >
Re: [PATCH 01/25] Handle vectors that don't fit in an integer.
On 14/09/2018 16:37, Richard Sandiford wrote: diff --git a/gcc/combine.c b/gcc/combine.c index a2649b6..cbf9dae 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -8621,7 +8621,13 @@ gen_lowpart_or_truncate (machine_mode mode, rtx x) { /* Bit-cast X into an integer mode. */ if (!SCALAR_INT_MODE_P (GET_MODE (x))) - x = gen_lowpart (int_mode_for_mode (GET_MODE (x)).require (), x); + { + enum machine_mode imode = + int_mode_for_mode (GET_MODE (x)).require (); + if (imode == BLKmode) + return gen_rtx_CLOBBER (mode, const0_rtx); + x = gen_lowpart (imode, x); require () will ICE if there isn't an integer mode and always returns a scalar_int_mode, so this looks like a no-op. I think you want something like: This is a patch that I inherited from Honza and Martin and didn't know what testcase it fixed. I think that it being broken shows that it's no longer necessary, and reverting the patch and retesting confirms this suspicion. I've removed it from the patch. @@ -11698,6 +11704,11 @@ gen_lowpart_for_combine (machine_mode omode, rtx x) if (omode == imode) return x; + /* This can happen when there is no integer mode corresponding + to a size of vector mode. */ + if (omode == BLKmode) +goto fail; + /* We can only support MODE being wider than a word if X is a constant integer or has a mode the same size. */ if (maybe_gt (GET_MODE_SIZE (omode), UNITS_PER_WORD) This seems like it's working around a bug in ther caller. Again, I inherited this hunk. Removing it and retesting shows no regressions, so I'm dropping it. diff --git a/gcc/expr.c b/gcc/expr.c index cd5cf12..776254a 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -10569,6 +10569,14 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, || maybe_gt (bitpos + bitsize, GET_MODE_BITSIZE (mode2))); + /* If the result is in BLKmode and the underlying object is a + vector in a register, and the size of the vector is larger than + the largest integer mode, then we must force OP0 to be in memory + as this is assumed in later code. */ + if (REG_P (op0) && VECTOR_MODE_P (mode2) && mode == BLKmode + && maybe_gt (bitsize, MAX_FIXED_MODE_SIZE)) + must_force_mem = 1; + /* Handle CONCAT first. */ if (GET_CODE (op0) == CONCAT && !must_force_mem) { Are you sure this is still needed after: 2018-06-04 Richard Sandiford * expr.c (expand_expr_real_1): Force the operand into memory if its TYPE_MODE is BLKmode and if there is no integer mode for the number of bits being extracted. Apparently you're right about this. Hunk dropped. diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index 8d94fca..607a2bd 100644 --- a/gcc/tree-vect-stmts.c +++ b/gcc/tree-vect-stmts.c @@ -6702,12 +6702,12 @@ vectorizable_store (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, supported. */ unsigned lsize = group_size * GET_MODE_BITSIZE (elmode); - elmode = int_mode_for_size (lsize, 0).require (); unsigned int lnunits = const_nunits / group_size; /* If we can't construct such a vector fall back to element extracts from the original vector type and element size stores. */ - if (mode_for_vector (elmode, lnunits).exists (&vmode) + if (int_mode_for_size (lsize, 0).exists (&elmode) + && mode_for_vector (elmode, lnunits).exists (&vmode) && VECTOR_MODE_P (vmode) && targetm.vector_mode_supported_p (vmode) && (convert_optab_handler (vec_extract_optab, @@ -7839,11 +7839,11 @@ vectorizable_load (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, to a larger load. */ unsigned lsize = group_size * TYPE_PRECISION (TREE_TYPE (vectype)); - elmode = int_mode_for_size (lsize, 0).require (); unsigned int lnunits = const_nunits / group_size; /* If we can't construct such a vector fall back to element loads of the original vector type. */ - if (mode_for_vector (elmode, lnunits).exists (&vmode) + if (int_mode_for_size (lsize, 0).exists (&elmode) + && mode_for_vector (elmode, lnunits).exists (&vmode) && VECTOR_MODE_P (vmode) && targetm.vector_mode_supported_p (vmode) && (convert_optab_handler (vec_init_optab, vmode, elmode) These two are OK independently of the rest (if that's convenient). Thanks, I've committed the attached. These are the only parts of the patch that re
Re: [PATCH] RFC: elide repeated source locations (PR other/84889)
On 11/14/2018 02:12 PM, David Malcolm wrote: On Mon, 2018-11-12 at 13:37 -0700, Martin Sebor wrote: On 11/11/2018 07:43 PM, David Malcolm wrote: We often emit more than one diagnostic at the same source location. For example, the C++ frontend can emit many diagnostics at the same source location when suggesting overload candidates. For example: ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C: In function 'int test_3(s, t)': ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18: error: no match for 'operator&&' (operand types are 's' and 't') 38 | return param_s && param_t; | ^~ ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18: note: candidate: 'operator&&(bool, bool)' ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18: note: no known conversion for argument 2 from 't' to 'bool' This is overly verbose. Note how the same location has been printed three times, obscuring the pertinent messages. This patch add a new "elide" value to -fdiagnostics-show-location= and makes it the default (previously it was "once"). With elision the above is printed as: ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C: In function 'int test_3(s, t)': ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18: error: no match for 'operator&&' (operand types are 's' and 't') 38 | return param_s && param_t; | ^~ = note: candidate: 'operator&&(bool, bool)' = note: no known conversion for argument 2 from 't' to 'bool' where the followup notes are printed with a '=' lined up with the source code margin. Thoughts? I agree the long pathname in the notes is at first glance redundant but I'm not sure about using '=' as a shorthand for it. I have written many scripts to parse GCC output to extract all diagnostics (including notes) and publish those on a Web page somewhere, as I'm sure must have others. All those scripts would stop working with this change and require changes to the build system to work again. Making those changes can be a substantial undertaking in some organizations. Have you considered printing just the file name instead? Or any other alternatives? "-fdiagnostics-show-location=once" will restore the old behavior. Alternatively, if you want to parse GCC output, I'm adding a JSON output format; see: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01038.html (I'm testing an updated version of that patch) I understand the change can be disabled by an option. I also like making the repetitive pathnames shorter, but the concern I have is that the change to use the '=' notation by default, besides being rather unusual and not necessarily intuitive, will break scripts that search for the pattern: "[^:]*:[1-9][0-9]*:[1-9][0-9]*: (error|note|warning): " and adding a new option to a large build system can be quite difficult. I suspect it would also make the notes difficult or even impossible to associate with the corresponding errors or warnings in parallel builds (where they may be interleaved with diagnostics from different compilations). I think it's possible to make the output shorter/less repetitive without these side-effects by keeping the current format and instead abbreviating the pathname, e.g. by printing just the file name (or ".../filename.c:" with the ellipsis standing in for the long pathname, though the ellipsis would require adjusting the scripts that sort diagnostics by the pathname). Martin PS As an aside, if we wanted to shorten all pathnames this same idea could be extended to errors and warnings as well by printing the pathname in the first one and just the filename in subsequent ones in the same file.
Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf
On Thu, 15 Nov 2018 09:48:05 -0500 Paul Koning wrote: > > On Nov 14, 2018, at 5:19 PM, Jozef Lawrynowicz > > wrote: > > > > > > I'd be curious if the line I added the xfail to in c-c++-common/pr57371-2.c > > also fails for pdp11. > > > > The conversion to float might be getting optimized out whenever > > sizeof(int) < sizeof(float). > > > > Thanks, > > Jozef > > Yes, that test on pr57371-2.c also fails on pdp11. > > paul > Thanks for checking, in that case I'll go ahead an add an effective target for "int_lt_float". I'll make a note to investigate that test failure as well. The test comments: > We can not get rid of comparison in tests below because of > potential inexact exception. If I'm understanding the test correctly, then if the cast to float has been optimized out, users expecting the inexact float exception to be raised will have unexpected behaviour. Jozef
Re: [PATCH] diagnose unsupported uses of hardware register variables (PR 88000)
On 11/14/2018 02:47 AM, Jakub Jelinek wrote: On Tue, Nov 13, 2018 at 09:11:41PM -0700, Martin Sebor wrote: --- gcc/c-family/c-warn.c (revision 266086) +++ gcc/c-family/c-warn.c (working copy) @@ -2609,3 +2609,39 @@ warn_for_multistatement_macros (location_t body_lo inform (guard_loc, "some parts of macro expansion are not guarded by " "this %qs clause", guard_tinfo_to_string (keyword)); } + + +/* Diagnose unsuported use of explicit hardware register variable ARG + as an argument ARGNO to function FNDECL. */ + +void +warn_hw_reg_arg (tree fndecl, int argno, tree arg) +{ + if (!fndecl) +return; + + /* Avoid diagnosing GCC intrinsics with no library fallbacks. */ + if (fndecl_built_in_p (fndecl) + && DECL_IS_BUILTIN (fndecl) + && !c_decl_implicit (fndecl) + && !DECL_ASSEMBLER_NAME_SET_P (fndecl)) +return; + + /* Also avoid diagnosing always_inline functions since those are + often used to implement vectorization intrinsics that make use + of hardware register variables. */ + if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (fndecl))) +return; + + /* Diagnose uses of local variables declared asm register. */ + STRIP_ANY_LOCATION_WRAPPER (arg); + if (VAR_P (arg) + && !TREE_STATIC (arg) + && DECL_HARD_REGISTER (arg) + && warning_at (input_location, OPT_Wasm_register_var, +"unsupported use of hardware register %qD as " +"argument %d of %qD", +arg, argno, fndecl)) +inform (DECL_SOURCE_LOCATION (arg), + "%qD declared here", arg); +} This makes no sense to me. There is nothing unsupported in passing a local hard register variable to a function, that is well defined, PR 88000 was resolved as invalid quoting the following sentence from the manual as the rationale: The only supported use for this feature is to specify registers for input and output operands when calling Extended asm. If these variables are meant to be supported in other contexts (such as in calls to built-ins or even user-defined functions) the manual needs to be clarified. and as your testcase changes show, you broke quite some completely valid testcases with that. Let's not be melodramatic. Nothing was "broken" by a proposed warning. But I did say that "if using explicit register vars in those functions is meant to be supported despite what the manual says the patch will need tweaking (as will the manual)." What doesn't work as the reporter expect is assumption that local hard register variables that live across function calls must have their values preserved; they can be modified by the callees. I'm not sure I understand how what you described as supported is different from what the test case in 88000 does: i.e., pass the value of a register variable to a function and expect its value to be unchanged. If the value happens to be preserved in some cases/for some registers but not for others, unless the two sets can be reliably distinguished it seems like a good candidate for a warning, to help users fall into the same trap. Users who know what they're doing can easily suppress the warning and others will fix their code. Is there a way to do distinguish the two sets of cases? Martin
Re: [PING^2] Re: [PATCH 1/3] Support instrumenting returns of instrumented functions
Andi Kleen writes: Ping!^2 > Andi Kleen writes: > > Ping! > >> From: Andi Kleen >> >> When instrumenting programs using __fentry__ it is often useful >> to instrument the function return too. Traditionally this >> has been done by patching the return address on the stack >> frame on entry. However this is fairly complicated (trace >> function has to emulate a stack) and also slow because >> it causes a branch misprediction on every return. >> >> Add an option to generate call or nop instrumentation for >> every return instead, including patch sections. >> >> This will increase the program size slightly, but can be a >> lot faster and simpler. >> >> This version only instruments true returns, not sibling >> calls or tail recursion. This matches the semantics of the >> original stack. >> >> gcc/: >> >> 2018-11-04 Andi Kleen >> >> * config/i386/i386-opts.h (enum instrument_return): Add. >> * config/i386/i386.c (output_return_instrumentation): Add. >> (ix86_output_function_return): Call output_return_instrumentation. >> (ix86_output_call_insn): Call output_return_instrumentation. >> * config/i386/i386.opt: Add -minstrument-return=. >> * doc/invoke.texi (-minstrument-return): Document. >> >> gcc/testsuite/: >> >> 2018-11-04 Andi Kleen >> >> * gcc.target/i386/returninst1.c: New test. >> * gcc.target/i386/returninst2.c: New test. >> * gcc.target/i386/returninst3.c: New test. >> --- >> gcc/config/i386/i386-opts.h | 6 >> gcc/config/i386/i386.c | 36 + >> gcc/config/i386/i386.opt| 21 >> gcc/doc/invoke.texi | 14 >> gcc/testsuite/gcc.target/i386/returninst1.c | 14 >> gcc/testsuite/gcc.target/i386/returninst2.c | 21 >> gcc/testsuite/gcc.target/i386/returninst3.c | 9 ++ >> 7 files changed, 121 insertions(+) >> create mode 100644 gcc/testsuite/gcc.target/i386/returninst1.c >> create mode 100644 gcc/testsuite/gcc.target/i386/returninst2.c >> create mode 100644 gcc/testsuite/gcc.target/i386/returninst3.c >> >> diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h >> index 46366cbfa72..35e9413100e 100644 >> --- a/gcc/config/i386/i386-opts.h >> +++ b/gcc/config/i386/i386-opts.h >> @@ -119,4 +119,10 @@ enum indirect_branch { >>indirect_branch_thunk_extern >> }; >> >> +enum instrument_return { >> + instrument_return_none = 0, >> + instrument_return_call, >> + instrument_return_nop5 >> +}; >> + >> #endif >> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c >> index f9ef0b4445b..f7cd94a8139 100644 >> --- a/gcc/config/i386/i386.c >> +++ b/gcc/config/i386/i386.c >> @@ -28336,12 +28336,47 @@ ix86_output_indirect_jmp (rtx call_op) >> return "%!jmp\t%A0"; >> } >> >> +/* Output return instrumentation for current function if needed. */ >> + >> +static void >> +output_return_instrumentation (void) >> +{ >> + if (ix86_instrument_return != instrument_return_none >> + && flag_fentry >> + && !DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT (cfun->decl)) >> +{ >> + if (ix86_flag_record_return) >> +fprintf (asm_out_file, "1:\n"); >> + switch (ix86_instrument_return) >> +{ >> +case instrument_return_call: >> + fprintf (asm_out_file, "\tcall\t__return__\n"); >> + break; >> +case instrument_return_nop5: >> + /* 5 byte nop: nopl 0(%[re]ax,%[re]ax,1) */ >> + fprintf (asm_out_file, ASM_BYTE "0x0f, 0x1f, 0x44, 0x00, 0x00\n"); >> + break; >> +case instrument_return_none: >> + break; >> +} >> + >> + if (ix86_flag_record_return) >> +{ >> + fprintf (asm_out_file, "\t.section __return_loc, \"a\",@progbits\n"); >> + fprintf (asm_out_file, "\t.%s 1b\n", TARGET_64BIT ? "quad" : "long"); >> + fprintf (asm_out_file, "\t.previous\n"); >> +} >> +} >> +} >> + >> /* Output function return. CALL_OP is the jump target. Add a REP >> prefix to RET if LONG_P is true and function return is kept. */ >> >> const char * >> ix86_output_function_return (bool long_p) >> { >> + output_return_instrumentation (); >> + >>if (cfun->machine->function_return_type != indirect_branch_keep) >> { >>char thunk_name[32]; >> @@ -28454,6 +28489,7 @@ ix86_output_call_insn (rtx_insn *insn, rtx call_op) >> >>if (SIBLING_CALL_P (insn)) >> { >> + output_return_instrumentation (); >>if (direct_p) >> { >>if (ix86_nopic_noplt_attribute_p (call_op)) >> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt >> index e7fbf9b6f99..5925b75244f 100644 >> --- a/gcc/config/i386/i386.opt >> +++ b/gcc/config/i386/i386.opt >> @@ -1063,3 +1063,24 @@ Support WAITPKG built-in functions and code >> generation. >> mcldemote >> Target Report Mask(ISA_CLDEMOTE) Var(ix86_isa_flags2) Save >> Support CLDEMOTE built-in functions and code generation. >> + >> +minstrument-re
Re: [PATCH] avoid -Wnonnull for printf format in dead code (PR 87041)
On 11/15/2018 02:39 AM, Matthew Malcomson wrote: On 02/11/18 09:54, Christophe Lyon wrote: Hi, I've noticed failure on targets using newlib (aarch64-elf and arm-eabi): FAIL: gcc.c-torture/execute/printf-2.c FAIL: gcc.c-torture/execute/user-printf.c my gcc.log contains: gcc.c-torture/execute/user-printf.c -O0 execution test (reason: TCL LOOKUP CHANNEL exp5) which is not very helpful @Christophe Can I ask if your DejaGNU board setup has "needs_status_wrapper 1" for both of these boards? I believe this problem is caused by an interaction with the DejaGNU status wrapper. When the status wrapper is needed, DejaGNU looks at stdout for a line saying "*** EXIT code " indicating what the status code was. When it doesn't find that line it assumes an exit code of 2. Without the status wrapper DejaGNU takes the return code from the program executed. Because these tests use "freopen ()" on stdout, the status wrapper fails to print to the IO channel DejaGNU is listening on, hence DejaGNU fails to find it's line, uses an exit code of 2, and fails the test. @Martin Would these tests still be valid having run freopen on stderr and using fprintf instead of printf? That makes the testcases pass for me. The printf-2.c test specifically exercises the printf built-in, i.e., not fprintf, so changing it to use fprintf would defeat its purpose. user-printf.c does something similar but for a user-defined function with attribute format (printf). The purpose of the tests is to verify that what they read from the temporary file matches what they wrote (i.e,, that none of the printf() or user_print() calls was incorrectly eliminated). If not we could add an { dg-require-effective-target unwrapped } directive in the testcases to stop the failure complaints. I'm not familiar with this directive or really know what a status wrapper is but as long as it doesn't change the I/O the test does I think it should be fine. Martin
Re: [PATCH] diagnose built-in declarations without prototype (PR 83656)
On 11/15/2018 03:10 AM, Richard Biener wrote: On Wed, Nov 14, 2018 at 7:08 PM Jeff Law wrote: On 11/6/18 4:56 PM, Martin Sebor wrote: In response to Joseph's comment I've removed the interaction with -Wpedantic from the updated patch. In addition, to help detect bugs like the one in the test case for pr87886, I have also enhanced the detection of incompatible calls to include integer/real type mismatches so that calls like the one below are diagnosed: extern double sqrt (); int f (int x) { return sqrt (x); // passing int where double is expected } With the removal of the -Wpedantic interaction declaring abort() without a prototype is no longer diagnosed and so the test suite changes to add the prototype are not necessary. I decided not to back them out because Jeff indicated a preference for making these kinds of improvements in general in an unrelated discussion. gcc-83656.diff PR c/83656 - missing -Wbuiltin-declaration-mismatch on declaration without prototype gcc/c/ChangeLog: PR c/83656 * c-decl.c (header_for_builtin_fn): Declare. (diagnose_mismatched_decls): Diagnose declarations of built-in functions without a prototype. * c-typeck.c (maybe_warn_builtin_no_proto_arg): New function. (convert_argument): Same. (convert_arguments): Factor code out into convert_argument. Detect mismatches between built-in formal arguments in calls to built-in without prototype. (build_conditional_expr): Same. (type_or_builtin_type): New function. (convert_for_assignment): Add argument. Conditionally issue warnings instead of errors for mismatches. gcc/testsuite/ChangeLog: PR c/83656 * gcc.dg/20021006-1.c * gcc.dg/Wbuiltin-declaration-mismatch.c: New test. * gcc.dg/Wbuiltin-declaration-mismatch-2.c: New test. * gcc.dg/Wbuiltin-declaration-mismatch-3.c: New test. * gcc.dg/Wbuiltin-declaration-mismatch-4.c: New test. * gcc.dg/Walloca-16.c: Adjust. * gcc.dg/Wrestrict-4.c: Adjust. * gcc.dg/Wrestrict-5.c: Adjust. * gcc.dg/atomic/stdatomic-generic.c: Adjust. * gcc.dg/atomic/stdatomic-lockfree.c: Adjust. * gcc.dg/initpri1.c: Adjust. * gcc.dg/pr15698-1.c: Adjust. * gcc.dg/pr69156.c: Adjust. * gcc.dg/pr83463.c: Adjust. * gcc.dg/redecl-4.c: Adjust. * gcc.dg/tls/thr-init-2.c: Adjust. * gcc.dg/torture/pr55890-2.c: Adjust. * gcc.dg/torture/pr55890-3.c: Adjust. * gcc.dg/torture/pr67741.c: Adjust. * gcc.dg/torture/stackalign/sibcall-1.c: Adjust. * gcc.dg/torture/tls/thr-init-1.c: Adjust. * gcc.dg/tree-ssa/builtins-folding-gimple-ub.c: Adjust. @@ -3547,8 +3598,24 @@ convert_arguments (location_t loc, vec arg_loc, tree typelist, if (parmval == error_mark_node) error_args = true; + if (!type && builtin_type && TREE_CODE (builtin_type) != VOID_TYPE) + { + /* For a call to a built-in function declared without a prototype, + perform the coversions from the argument to the expected type + but issue warnings rather than errors for any mismatches. + Ignore the converted argument and use the PARMVAL obtained + above by applying default coversions instead. */ s/coversions/conversions/ Two of 'em in that comment. OK with that nit fixed. I once had -Wunprototyped-calls... (attached). Because the issue doesn't exist only for builtins... (originally inspired by a Xorg bug) You're right, thanks for the suggestion. I'm still planning to resubmit the patch I posted last June to enable -Wstrict-prototypes (PR 82922). Detecting this problem there would be a good enhancement. If it's too late to do this for GCC 9 I will submit the patch for GCC 10. Martin
Re: [PATCH] RFC: elide repeated source locations (PR other/84889)
On 11/15/18, Martin Sebor wrote: > On 11/14/2018 02:12 PM, David Malcolm wrote: >> On Mon, 2018-11-12 at 13:37 -0700, Martin Sebor wrote: >>> On 11/11/2018 07:43 PM, David Malcolm wrote: We often emit more than one diagnostic at the same source location. For example, the C++ frontend can emit many diagnostics at the same source location when suggesting overload candidates. For example: ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C: In function 'int test_3(s, t)': ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18: error: no match for 'operator&&' (operand types are 's' and 't') 38 | return param_s && param_t; | ^~ ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18: note: candidate: 'operator&&(bool, bool)' ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18: note: no known conversion for argument 2 from 't' to 'bool' This is overly verbose. Note how the same location has been printed three times, obscuring the pertinent messages. This patch add a new "elide" value to -fdiagnostics-show-location= and makes it the default (previously it was "once"). With elision the above is printed as: ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C: In function 'int test_3(s, t)': ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18: error: no match for 'operator&&' (operand types are 's' and 't') 38 | return param_s && param_t; | ^~ = note: candidate: 'operator&&(bool, bool)' = note: no known conversion for argument 2 from 't' to 'bool' where the followup notes are printed with a '=' lined up with the source code margin. Thoughts? >>> >>> I agree the long pathname in the notes is at first glance redundant >>> but I'm not sure about using '=' as a shorthand for it. I like the -fdiagnostics-show-location=elide option but I also agree about the '=' looking weird. However, I don't think the '=' is that big a problem that needs to provoke huge rethinking, just some minor bikeshedding about which symbol to use instead. How about '+'? >>> I have >>> written many scripts to parse GCC output to extract all diagnostics >>> (including notes) and publish those on a Web page somewhere, as I'm >>> sure must have others. All those scripts would stop working with >>> this change and require changes to the build system to work again. >>> Making those changes can be a substantial undertaking in some >>> organizations. >>> >>> Have you considered printing just the file name instead? Or any >>> other alternatives? >> >> "-fdiagnostics-show-location=once" will restore the old behavior. >> Alternatively, if you want to parse GCC output, I'm adding a JSON >> output format; see: >> https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01038.html >> (I'm testing an updated version of that patch) > > I understand the change can be disabled by an option. I also > like making the repetitive pathnames shorter, but the concern > I have is that the change to use the '=' notation by default, > besides being rather unusual and not necessarily intuitive, will > break scripts that search for the pattern: > >"[^:]*:[1-9][0-9]*:[1-9][0-9]*: (error|note|warning): " > > and adding a new option to a large build system can be quite > difficult. I suspect it would also make the notes difficult > or even impossible to associate with the corresponding errors > or warnings in parallel builds (where they may be interleaved > with diagnostics from different compilations). > > I think it's possible to make the output shorter/less repetitive > without these side-effects by keeping the current format and > instead abbreviating the pathname, e.g. by printing just the file > name (or ".../filename.c:" with the ellipsis standing in for the > long pathname, though the ellipsis would require adjusting > the scripts that sort diagnostics by the pathname). > > Martin > > PS As an aside, if we wanted to shorten all pathnames this same > idea could be extended to errors and warnings as well by printing > the pathname in the first one and just the filename in subsequent > ones in the same file. >
Re: [PATCH] Support simd function declarations via a pre-include. (was: [PATCH][RFC]Overloading intrinsics)
On 14 November 2018 12:35:27 CET, Jakub Jelinek wrote: >> --- a/gcc/config/gnu-user.h >> +++ b/gcc/config/gnu-user.h >> @@ -170,3 +170,6 @@ see the files COPYING3 and COPYING.RUNTIME >respectively. If not, see >>LD_STATIC_OPTION " --whole-archive -llsan --no-whole-archive " \ >>LD_DYNAMIC_OPTION "}}%{!static-liblsan:-llsan}" >> #endif >> + >> +#undef TARGET_F951_NOSTDINC_OPTIONS >> +#define TARGET_F951_NOSTDINC_OPTIONS >"%:fortran-header-file(-fpre-include= math-vector-fortran.h)" > >Too long line, use some \s to split it up. > Can we use plain -include like in C? This has an established meaning and is already documented. Thanks,
[PR c++/86246] ICE tsubst explicit operator call
We weren't noticing that 'expr.operator T()' is dependent if T is dependent. Therefore we looked up 'operator T' at template definition time and went down hill from there. 86246 and 87989 had different failure modes (one iced, one generated wrong code), so adding both. booted & tested on x86_64-linux, applying to trunk. nathan -- Nathan Sidwell 2018-11-15 Nathan Sidwell PR c++/86246 PR c++/87989 * typeck.c (finish_class_member_access_expr): Conversion operator to dependent type is dependent. * g++.dg/template/pr86246.C: New. * g++.dg/template/pr87989.C: New. Index: cp/typeck.c === --- cp/typeck.c (revision 266191) +++ cp/typeck.c (working copy) @@ -2884,7 +2884,12 @@ finish_class_member_access_expr (cp_expr expression is dependent. */ || (TREE_CODE (name) == SCOPE_REF && TYPE_P (TREE_OPERAND (name, 0)) - && dependent_scope_p (TREE_OPERAND (name, 0 + && dependent_scope_p (TREE_OPERAND (name, 0))) + /* If NAME is operator T where "T" is dependent, we can't + lookup until we instantiate the T. */ + || (TREE_CODE (name) == IDENTIFIER_NODE + && IDENTIFIER_CONV_OP_P (name) + && dependent_type_p (TREE_TYPE (name { dependent: return build_min_nt_loc (UNKNOWN_LOCATION, COMPONENT_REF, Index: testsuite/g++.dg/template/pr86246.C === --- testsuite/g++.dg/template/pr86246.C (revision 0) +++ testsuite/g++.dg/template/pr86246.C (working copy) @@ -0,0 +1,38 @@ +// { dg-do compile { target c++11 } } +// PR c++/86246 ICE in tsubst + +namespace std { + template struct is_class { +static constexpr bool value = true; + }; + template<> struct is_class { +static constexpr bool value = false; + }; +} + +class MyClass { + public: + operator double() const { +return 1; + } + template + operator T() const { +static_assert(std::is_class::value, "problem"); +return T(); + } +}; + +template +void SetValue(const MyClass& obj, T* value) { + // erroneously dispatched to operator T when T is double + *value = obj.operator T(); +} + +int main() { + MyClass obj; + // works fine + obj.operator double (); + double x; + // error, when operator T is called in SetValue + SetValue(obj, &x); +} Index: testsuite/g++.dg/template/pr87989.C === --- testsuite/g++.dg/template/pr87989.C (revision 0) +++ testsuite/g++.dg/template/pr87989.C (working copy) @@ -0,0 +1,20 @@ +// PR c++/87989 +// { dg-do link } +// Resolved to template instantiation rather than non-template fn. + +struct X { + template operator T() const; // no definition + operator float() const {return 0.f;} +}; + +template +T f(const X &x) { + // Resoved in error to X::operator float() const` + // instead of correct `X::operator float() const + return x.operator T(); +} + +int main () +{ + return f(X ()); +}
Re: [PATCH][rs6000] inline expansion of memcmp using vsx
On 11/15/18 4:02 AM, Richard Biener wrote: > On Wed, Nov 14, 2018 at 5:43 PM Aaron Sawdey wrote: >> >> This patch generalizes some the functions added earlier to do vsx expansion >> of strncmp >> so that the can also generate the code needed for memcmp. I reorganized >> expand_block_compare() a little to be able to make use of this there. The >> vsx code is more >> compact so I've changed the default block compare inline limit to 63 bytes. >> The vsx >> code is only used if there is at least 16 bytes to compare as this means we >> don't have to >> do complex code to compare less than one chunk. If vsx is not available the >> limit is cut >> in half. The performance is good, vsx memcmp is considerably faster than the >> gpr inline code >> if the strings are equal and is comparable if the strings have a 10% chance >> of being >> equal (spread across the string). > > How is performance affected if there are close earlier char-size > stores to one of the string/memory? > Can power still do store forwarding in this case? Store forwarding between scalar and vector is not great, but it's better than having to make a plt call to memcmp() which may well use vsx anyway. I had set the crossover between scalar and vsx at 16 bytes because the vsx code is more compact. The performance is similar for 16-32 byte sizes. But you could make an argument for switching at 33 bytes. This way builtin memcmp of 33-64 bytes would now use inline vsx code instead of memcmp() call. At 33 bytes the vsx inline code is 3x faster than a memcmp() call so would likely remain faster even if there was an ugly vector-load-hit-scalar-store. Also small structures 32 bytes and less being compared would use scalar code and the same as gcc 8 and would avoid this issue. Aaron > >> Currently regtesting, ok for trunk if tests pass? >> >> Thanks! >>Aaron >> >> 2018-11-14 Aaron Sawdey >> >> * config/rs6000/rs6000-string.c (emit_vsx_zero_reg): New function. >> (expand_cmp_vec_sequence): Rename and modify >> expand_strncmp_vec_sequence. >> (emit_final_compare_vec): Rename and modify >> emit_final_str_compare_vec. >> (generate_6432_conversion): New function. >> (expand_block_compare): Add support for vsx. >> (expand_block_compare_gpr): New function. >> * config/rs6000/rs6000.opt (rs6000_block_compare_inline_limit): >> Increase >> default limit to 63 because of more compact vsx code. >> >> >> >> >> Index: gcc/config/rs6000/rs6000-string.c >> === >> --- gcc/config/rs6000/rs6000-string.c (revision 266034) >> +++ gcc/config/rs6000/rs6000-string.c (working copy) >> @@ -615,6 +615,283 @@ >> } >> } >> >> +static rtx >> +emit_vsx_zero_reg() >> +{ >> + unsigned int i; >> + rtx zr[16]; >> + for (i = 0; i < 16; i++) >> +zr[i] = GEN_INT (0); >> + rtvec zv = gen_rtvec_v (16, zr); >> + rtx zero_reg = gen_reg_rtx (V16QImode); >> + rs6000_expand_vector_init (zero_reg, gen_rtx_PARALLEL (V16QImode, zv)); >> + return zero_reg; >> +} >> + >> +/* Generate the sequence of compares for strcmp/strncmp using vec/vsx >> + instructions. >> + >> + BYTES_TO_COMPARE is the number of bytes to be compared. >> + ORIG_SRC1 is the unmodified rtx for the first string. >> + ORIG_SRC2 is the unmodified rtx for the second string. >> + S1ADDR is the register to use for the base address of the first string. >> + S2ADDR is the register to use for the base address of the second string. >> + OFF_REG is the register to use for the string offset for loads. >> + S1DATA is the register for loading the first string. >> + S2DATA is the register for loading the second string. >> + VEC_RESULT is the rtx for the vector result indicating the byte >> difference. >> + EQUALITY_COMPARE_REST is a flag to indicate we need to make a cleanup >> call >> + to strcmp/strncmp if we have equality at the end of the inline >> comparison. >> + P_CLEANUP_LABEL is a pointer to rtx for a label we generate if we need >> code >> + to clean up and generate the final comparison result. >> + FINAL_MOVE_LABEL is rtx for a label we can branch to when we can just >> + set the final result. >> + CHECKZERO indicates whether the sequence should check for zero bytes >> + for use doing strncmp, or not (for use doing memcmp). */ >> +static void >> +expand_cmp_vec_sequence (unsigned HOST_WIDE_INT bytes_to_compare, >> +rtx orig_src1, rtx orig_src2, >> +rtx s1addr, rtx s2addr, rtx off_reg, >> +rtx s1data, rtx s2data, rtx vec_result, >> +bool equality_compare_rest, rtx *p_cleanup_label, >> +rtx final_move_label, bool checkzero) >> +{ >> + machine_mode load_mode; >> + unsigned int load_mode_size; >> + unsigned HOST_WIDE_INT cmp_bytes = 0; >> + unsigned HOST_WIDE_INT offset = 0; >> + rtx zero_reg = NULL; >
Re: [PATCH] Support simd function declarations via a pre-include. (was: [PATCH][RFC]Overloading intrinsics)
On Thu, Nov 15, 2018 at 08:40:13PM +0100, Bernhard Reutner-Fischer wrote: > On 14 November 2018 12:35:27 CET, Jakub Jelinek wrote: > > >> --- a/gcc/config/gnu-user.h > >> +++ b/gcc/config/gnu-user.h > >> @@ -170,3 +170,6 @@ see the files COPYING3 and COPYING.RUNTIME > >respectively. If not, see > >>LD_STATIC_OPTION " --whole-archive -llsan --no-whole-archive " \ > >>LD_DYNAMIC_OPTION "}}%{!static-liblsan:-llsan}" > >> #endif > >> + > >> +#undef TARGET_F951_NOSTDINC_OPTIONS > >> +#define TARGET_F951_NOSTDINC_OPTIONS > >"%:fortran-header-file(-fpre-include= math-vector-fortran.h)" > > > >Too long line, use some \s to split it up. > > > > Can we use plain -include like in C? Wouldn't that be confusing whether it is included in preprocessor only or if it is included as a magic fortran include line at the beginning? Jakub
[PATCH] v2: C/C++: add fix-it hints for missing '&' and '*' (PR c++/87850)
On Tue, 2018-11-13 at 16:34 -0500, Jason Merrill wrote: > On Mon, Nov 12, 2018 at 4:32 PM Martin Sebor > wrote: > > On 11/11/2018 02:02 PM, David Malcolm wrote: > > > On Sun, 2018-11-11 at 11:01 -0700, Martin Sebor wrote: > > > > On 11/10/2018 12:01 AM, Eric Gallager wrote: > > > > > On 11/9/18, David Malcolm wrote: > > > > > > This patch adds a fix-it hint to various pointer-vs-non- > > > > > > pointer > > > > > > diagnostics, suggesting the addition of a leading '&' or > > > > > > '*'. > > > > > > > > > > > > For example, note the ampersand fix-it hint in the > > > > > > following: > > > > > > > > > > > > demo.c:5:22: error: invalid conversion from 'pthread_key_t' > > > > > > {aka > > > > > > 'unsigned > > > > > > int'} > > > > > >to 'pthread_key_t*' {aka 'unsigned int*'} [-fpermissive] > > > > > > 5 | pthread_key_create(key, NULL); > > > > > > | ^~~ > > > > > > | | > > > > > > | pthread_key_t {aka unsigned > > > > > > int} > > > > > > | & > > > > > > > > > > Having both the type and the fixit underneath the caret looks > > > > > kind > > > > > of confusing > > > > > > > > I agree it's rather subtle. Keeping the diagnostics separate > > > > from > > > > the suggested fix should avoid the confusion. > > > > > > FWIW, the fix-it hint is in a different color (assuming that gcc > > > is > > > invoked in an environment that prints that...) > > > > I figured it would be, but I'm still not sure it's good design > > to be relying on color alone to distinguish between the problem > > and the suggested fix. Especially when they are so close to one > > another and the fix is just a single character with no obvious > > relationship to the rest of the text on the screen. In other > > warnings there's at least the "did you forget the '@'?" part > > to give a clue, even though even there the connection between > > the "did you forget" and the & several lines down wouldn't > > necessarily be immediately apparent. > > Agreed, something along those lines would help to understand why the > compiler is throwing a random & into the diagnostic. > > Jason Here's an updated version which adds a note, putting the fix-it hint on that instead (I attempted adding the text to the initial error, but there was something of a combinatorial explosion of messages). The above example becomes: demo.c: In function 'int main()': demo.c:5:22: error: invalid conversion from 'pthread_key_t' {aka 'unsigned int'} to 'pthread_key_t*' {aka 'unsigned int*'} [-fpermissive] 5 | pthread_key_create(key, NULL); | ^~~ | | | pthread_key_t {aka unsigned int} demo.c:5:22: note: possible fix: take the address with '&' 5 | pthread_key_create(key, NULL); | ^~~ | & In file included from demo.c:1: /usr/include/pthread.h:1122:47: note: initializing argument 1 of 'int pthread_key_create(pthread_key_t*, void (*)(void*))' 1122 | extern int pthread_key_create (pthread_key_t *__key, |~~~^ Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. OK for trunk? gcc/c-family/ChangeLog: PR c++/87850 * c-common.c: Include "gcc-rich-location.h". (maybe_emit_indirection_note): New function. * c-common.h (maybe_emit_indirection_note): New decl. gcc/c/ChangeLog: PR c++/87850 * c-typeck.c (convert_for_assignment): Call maybe_emit_indirection_note for pointer vs non-pointer diagnostics. gcc/cp/ChangeLog: PR c++/87850 * call.c (convert_like_real): Call maybe_emit_indirection_note for "invalid conversion" diagnostic. gcc/testsuite/ChangeLog: PR c++/87850 * c-c++-common/indirection-fixits.c: New test. --- gcc/c-family/c-common.c | 33 +++ gcc/c-family/c-common.h | 2 + gcc/c/c-typeck.c| 10 +- gcc/cp/call.c | 2 + gcc/testsuite/c-c++-common/indirection-fixits.c | 270 5 files changed, 315 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/indirection-fixits.c diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index cd88f3a..d5c7c7f 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -48,6 +48,7 @@ along with GCC; see the file COPYING3. If not see #include "gimplify.h" #include "substring-locations.h" #include "spellcheck.h" +#include "gcc-rich-location.h" #include "selftest.h" cpp_reader *parse_in; /* Declared in c-pragma.h. */ @@ -8405,6 +8406,38 @@ maybe_suggest_missing_token_insertion (rich_location *richloc, } } +/* Potentially emit a note about likely missing '&' or '*', + depending on EXPR and EXPEC
Re: [PATCH] diagnose unsupported uses of hardware register variables (PR 88000)
On 11/14/2018 02:39 AM, Alexander Monakov wrote: On Tue, 13 Nov 2018, Martin Sebor wrote: In PR 88000 the reporter expects to be able to use an explicit register local variable in a context where it isn't supported i.e., for something other than an input or output of an asm statement: namely to pass it as argument to a user-defined function. GCC emits unexpected object code in this case which the reporter thought was a GCC bug. I appreciate warnings for misuse of extensions in general, but in this particular case I think GCC's behavior is misdesigned, so instead of enshrining a bad engineering choice in a warning and in the manual, I'd rather see GCC implement the extension sanely. Merely changing a normal auto variable to a register asm variable should not invalidate the program. As the manual says, it should amount to providing a hint to the register allocator, at most. Have a look at PR 87984, where code is miscompiled despite following the documentation to the letter. This is because we lower accesses to register variables when transitioning from GIMPLE to RTL incorrectly. Fixing that should make the warning unnecessary. I hope I can work on that before stage 4. I think LLVM is doing the right thing there, and so should we. That would indeed be preferable but it's not something I expect to have the cycles to work on. I put the patch together only because it seemed like an easy way to help keep users from shooting themselves in the foot (to borrow the phrase from the comment in PR 88000). If there's consensus that the general use case of passing hard registers to functions should be supported then I suggest to acknowledge PR 88000 as a bug. We can discuss whether it's possible and worthwhile to warn on the unsupported subset of the use cases. In any case, I think the least we can for now is to clarify in the manual what that supported subset is, since as I (and others) read it, passing hard registers to functions is not. Martin
[doc, committed] clarify that "packed" attribute can apply to C++ classes
I've checked in this patch for PR 25759, another minor documentation improvement. -Sandra 2018-11-15 Sandra Loosemore PR c++/25759 gcc/ * doc/extend.texi (Common Type Attributes): Make it explicit that attribute "packed" can apply to C++ classes. Index: gcc/doc/extend.texi === --- gcc/doc/extend.texi (revision 266195) +++ gcc/doc/extend.texi (working copy) @@ -7314,17 +7314,16 @@ or @code{__pointer__} for the mode used @item packed @cindex @code{packed} type attribute -This attribute, attached to @code{struct} or @code{union} type -definition, specifies that each member (other than zero-width bit-fields) -of the structure or union is placed to minimize the memory required. When -attached to an @code{enum} definition, it indicates that the smallest -integral type should be used. +This attribute, attached to @code{struct}, @code{union}, or C++ @code{class} +type definition, specifies that each of its members (other than zero-width +bit-fields) is placed to minimize the memory required. This is equivalent +to specifying the @code{packed} attribute on each of the members. @opindex fshort-enums -Specifying the @code{packed} attribute for @code{struct} and @code{union} -types is equivalent to specifying the @code{packed} attribute on each -of the structure or union members. Specifying the @option{-fshort-enums} -flag on the command line is equivalent to specifying the @code{packed} +When attached to an @code{enum} definition, the @code{packed} attribute +indicates that the smallest integral type should be used. +Specifying the @option{-fshort-enums} flag on the command line +is equivalent to specifying the @code{packed} attribute on all @code{enum} definitions. In the following example @code{struct my_packed_struct}'s members are @@ -7348,8 +7347,9 @@ struct __attribute__ ((__packed__)) my_p @end smallexample You may only specify the @code{packed} attribute attribute on the definition -of an @code{enum}, @code{struct} or @code{union}, not on a @code{typedef} -that does not also define the enumerated type, structure or union. +of an @code{enum}, @code{struct}, @code{union}, or @code{class}, +not on a @code{typedef} that does not also define the enumerated type, +structure, union, or class. @item scalar_storage_order ("@var{endianness}") @cindex @code{scalar_storage_order} type attribute
Re: [PATCH] RFC: C/C++: print help when a header can't be found
On 11/12/18, Martin Sebor wrote: > On 11/11/2018 04:33 PM, David Malcolm wrote: >> When gcc can't find a header file, it's a hard error that stops the >> build, >> typically requiring the user to mess around with compile flags, >> Makefiles, >> dependencies, and so forth. >> >> Often the exact search paths aren't obvious to the user. Consider the >> case where the include paths are injected via a tool such as pkg-config, >> such as e.g.: >> >> gcc $(pkg-config --cflags glib-2.0) demo.c >> >> This patch is an attempt at being more helpful for such cases. Given >> that >> the user can't proceed until the issue is resolved, I think it's >> reasonable >> to default to telling the user as much as possible about what happened. >> This patch list all of the search paths, and any close matches (e.g. for >> misspellings). >> >> Without the patch, the current behavior is: >> >> misspelled-header-1.c:1:10: fatal error: test-header.hpp: No such file or >> directory >> 1 | #include "test-header.hpp" >> | ^ >> compilation terminated. >> >> With the patch, the user gets this output: >> >> misspelled-header-1.c:1:10: fatal error: test-header.hpp: No such file or >> directory >> 1 | #include "test-header.hpp" >> | ^ >> misspelled-header-1.c:1:10: note: paths searched: >> misspelled-header-1.c:1:10: note: path: '' >> misspelled-header-1.c:1:10: note: not found: 'test-header.hpp' >> misspelled-header-1.c:1:10: note: close match: 'test-header.h' >> 1 | #include "test-header.hpp" >> | ^ >> | "test-header.h" >> misspelled-header-1.c:1:10: note: path: '/usr/include/glib-2.0' (via >> '-I') >> misspelled-header-1.c:1:10: note: not found: >> '/usr/include/glib-2.0/test-header.hpp' >> misspelled-header-1.c:1:10: note: path: '/usr/lib64/glib-2.0/include' >> (via '-I') >> misspelled-header-1.c:1:10: note: not found: >> '/usr/lib64/glib-2.0/include/test-header.hpp' >> misspelled-header-1.c:1:10: note: path: './include' (system directory) >> misspelled-header-1.c:1:10: note: not found: >> './include/test-header.hpp' >> misspelled-header-1.c:1:10: note: path: './include-fixed' (system >> directory) >> misspelled-header-1.c:1:10: note: not found: >> './include-fixed/test-header.hpp' >> misspelled-header-1.c:1:10: note: path: '/usr/local/include' (system >> directory) >> misspelled-header-1.c:1:10: note: not found: >> '/usr/local/include/test-header.hpp' >> misspelled-header-1.c:1:10: note: path: '/usr/include' (system >> directory) >> misspelled-header-1.c:1:10: note: not found: >> '/usr/include/test-header.hpp' >> compilation terminated. >> >> showing the paths that were tried, and why (e.g. the -I paths injected by >> the pkg-config invocation), and the .hpp vs .h issue (with a fix-it >> hint). >> >> It's verbose, but as I said above, the user can't proceed until they >> resolve it, so I think being verbose is appropriate here. >> >> Thoughts? > > I think printing the directories and especially the near matches > will be very helpful, especially for big projects with lots of -I > options. > > The output could be made substantially shorter, less repetitive, > and so easier to read -- basically cut in half -- by avoiding > most of the duplication and collapsing two notes into one, e.g. > like so: > >fatal error: test-header.hpp: No such file or directory >1 | #include "test-header.hpp" > | ^ >note: paths searched: >note: -I '.' >note: close match: 'test-header.h' >1 | #include "test-header.hpp" > | ^ > | "test-header.h" >note: -I '/usr/include/glib-2.0' >note: -I '/usr/lib64/glib-2.0/include' >note: -isystem './include' >note: -isystem './include-fixed' >note: -isystem '/usr/local/include' >note: -isystem '/usr/include' > > or by printing the directories in sections: > >note: -I paths searched: >note: '.' >note: close match: 'test-header.h' >1 | #include "test-header.hpp" > | ^ > | "test-header.h" >note: '/usr/include/glib-2.0' >note: '/usr/lib64/glib-2.0/include' >note: -isystem paths searched: >note: './include' >note: './include-fixed' >note: '/usr/local/include' >note: '/usr/include' > > Martin > > How would this interact with -Wmissing-include-dirs?
[PATCH, csky] Update linux-unwind.h for kernel
Hi, I have submitted a patch to update linux-unwind for kernel Index: libgcc/ChangeLog === --- libgcc/ChangeLog(revision 266199) +++ libgcc/ChangeLog(working copy) @@ -1,5 +1,11 @@ 2018-11-15 Xianmiao Qu + * config/csky/linux-unwind.h (sc_pt_regs): Update for kernel. + (sc_pt_regs_lr): Update for kernel. + (sc_pt_regs_tls): Update for kernel. + +2018-11-15 Xianmiao Qu + * config/csky/linux-unwind.h: Fix coding style. 2018-11-13 Xianmiao Qu Index: libgcc/config/csky/linux-unwind.h === --- libgcc/config/csky/linux-unwind.h (revision 266199) +++ libgcc/config/csky/linux-unwind.h (working copy) @@ -47,9 +47,9 @@ #define MOVI_R7_139_V2_PART0 0xea07 #define MOVI_R7_139_V2_PART1 139 -#define sc_pt_regs(x) (sc->sc_##x) -#define sc_pt_regs_lr (sc->sc_r15) -#define sc_pt_regs_tls(x) (sc->sc_exregs[15]) +#define sc_pt_regs(x) (sc->sc_pt_regs.x) +#define sc_pt_regs_lr (sc->sc_pt_regs.lr) +#define sc_pt_regs_tls(x) sc_pt_regs(x) #define MD_FALLBACK_FRAME_STATE_FOR csky_fallback_frame_state
PING [PATCH] add simple attribute introspection
Ping: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01473.html (Still looking for an approval.) On 11/09/2018 04:43 PM, Martin Sebor wrote: On 11/09/2018 12:59 PM, Jeff Law wrote: On 10/31/18 10:27 AM, Martin Sebor wrote: Ping: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01473.html With the C++ bits approved I'm still looking for a review/approval of the remaining changes: the C front end and the shared c-family handling of the new built-in. I thought I acked those with just a couple minor doc nits: I don't see a formal approval for the rest in my Inbox or in the archive. diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 8ffb0cd..dcf4747 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -2649,8 +2649,9 @@ explicit @code{externally_visible} attributes are still necessary. @cindex @code{flatten} function attribute Generally, inlining into a function is limited. For a function marked with this attribute, every call inside this function is inlined, if possible. -Whether the function itself is considered for inlining depends on its size and -the current inlining parameters. +Functions declared with attribute @code{noinline} and similar are not +inlined. Whether the function itself is considered for inlining depends +on its size and the current inlining parameters. Guessing this was from another doc patch that I think has already been approved Yes. It shouldn't be in the latest patch at the link above. @@ -11726,6 +11728,33 @@ check its compatibility with @var{size}. @end deftypefn +@deftypefn {Built-in Function} bool __builtin_has_attribute (@var{type-or-expression}, @var{attribute}) +The @code{__builtin_has_attribute} function evaluates to an integer constant +expression equal to @code{true} if the symbol or type referenced by +the @var{type-or-expression} argument has been declared with +the @var{attribute} referenced by the second argument. Neither argument +is valuated. The @var{type-or-expression} argument is subject to the same s/valuated/evaluated/ ? This should also be fixed in the latest patch at the link above. Did the implementation change significantly requiring another review iteration? I don't think it changed too significantly between the last two revisions but I don't have a record of anyone having approved the C FE and the middle-end bits. (Sorry if I missed it.) Other than this response from you all I see in the archive is this: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00606.html Please let me if the last revision is okay to commit. Thanks Martin
[PING #4] [PATCH] look harder for MEM_REF operand equality to avoid -Wstringop-truncation (PR 84561)
Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01934.html Do I need to change something in this patch to make it acceptable or should I assume we will leave this bug in GCC unfixed? On 10/31/2018 10:35 AM, Martin Sebor wrote: Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01934.html On 10/08/2018 03:40 PM, Martin Sebor wrote: Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01934.html On 10/01/2018 03:30 PM, Martin Sebor wrote: Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01934.html We have discussed a number of different approaches to moving the warning somewhere else but none is feasible in the limited amount of time remaining in stage 1 of GCC 9. I'd like to avoid the false positive in GCC 9 by using the originally submitted, simple approach and look into the suggested design changes for GCC 10. On 09/21/2018 08:36 AM, Martin Sebor wrote: On 09/20/2018 03:06 AM, Richard Biener wrote: On Wed, Sep 19, 2018 at 4:19 PM Martin Sebor wrote: On 09/18/2018 10:23 PM, Jeff Law wrote: On 9/18/18 1:46 PM, Martin Sebor wrote: On 09/18/2018 12:58 PM, Jeff Law wrote: On 9/18/18 11:12 AM, Martin Sebor wrote: My bad. Sigh. CCP doesn't track copies, just constants, so there's not going to be any data structure you can exploit. And I don't think there's a value number you can use to determine the two objects are the same. Hmm, let's back up a bit, what is does the relevant part of the IL look like before CCP? Is the real problem here that we have unpropagated copies lying around in the IL? Hmm, more likely the IL looksl ike: _8 = &pb_3(D)->a; _9 = _8; _1 = _9; strncpy (MEM_REF (&pb_3(D)->a), ...); MEM[(struct S *)_1].a[n_7] = 0; Yes, that is what the folder sees while the strncpy call is being transformed/folded by ccp. The MEM_REF is folded just after the strncpy call and that's when it's transformed into MEM[(struct S *)_8].a[n_7] = 0; (The assignments to _1 and _9 don't get removed until after the dom walk finishes). If we were to propagate the copies out we'd at best have: _8 = &pb_3(D)->a; strncpy (MEM_REF (&pb_3(D)->a), ...); MEM[(struct S *)_8].a[n_7] = 0; Is that in a form you can handle? Or would we also need to forward propagate the address computation into the use of _8? The above works as long as we look at the def_stmt of _8 in the MEM_REF (we currently don't). That's also what the last iteration of the loop does. In this case (with _8) it would be discovered in the first iteration, so the loop could be replaced by a simple if statement. But I'm not sure I understand the concern with the loop. Is it that we are looping at all, i.e., the cost? Or that ccp is doing something wrong or suboptimal? (Should have propagated the value of _8 earlier?) I suspect it's more a concern that things like copies are typically propagated away. So their existence in the IL (and consequently your need to handle them) raises the question "has something else failed to do its job earlier". During which of the CCP passes is this happening? Can we pull the warning out of the folder (even if that means having a distinct warning pass over the IL?) It happens during the third run of the pass. The only way to do what you suggest that I could think of is to defer the strncpy to memcpy transformation until after the warning pass. That was also my earlier suggestion: defer both it and the warning until the tree-ssa-strlen pass (where the warning is implemented to begin with -- the folder calls into it). If it's happening that late (CCP3) in general, then ISTM we ought to be able to get the warning out of the folder. We just have to pick the right spot. warn_restrict runs before fold_all_builtins, but after dom/vrp so we should have the IL in pretty good shape. That seems like about the right time. I wonder if we could generalize warn_restrict to be a more generic warning pass over the IL and place it right before fold_builtins. The restrict pass doesn't know about string lengths so it can't handle all the warnings about string built-ins (the strlen pass now calls into it to issue some). The strlen pass does so it could handle most if not all of them (the folder also calls into it to issue some warnings). It would work even better if it were also integrated with the object size pass. We're already working on merging strlen with sprintf. It seems to me that the strlen pass would benefit not only from that but also from integrating with object size and warn-restrict. With that, -Wstringop-overflow could be moved from builtins.c into it as well (and also benefit not only from accurate string lengths but also from the more accurate object size info). What do you think about that? I think integrating the various "passes" (objectsize is also as much a facility as a pass) generally makes sense given it might end up improving all of them and reduce code duplication. Okay. If Jeff agrees I'll see if I can make it happen for GCC 10.
[PING #4][PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)
Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html Please let me know if there is something I need to change here to make the fix acceptable or if I should stop trying. On 10/31/2018 10:33 AM, Martin Sebor wrote: Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html On 10/20/2018 06:01 PM, Martin Sebor wrote: On 10/16/2018 03:21 PM, Jeff Law wrote: On 10/4/18 9:51 AM, Martin Sebor wrote: On 10/04/2018 08:58 AM, Jeff Law wrote: On 8/27/18 9:42 AM, Richard Biener wrote: On Mon, Aug 27, 2018 at 5:32 PM Jeff Law wrote: On 08/27/2018 02:29 AM, Richard Biener wrote: On Sun, Aug 26, 2018 at 7:26 AM Jeff Law wrote: On 08/24/2018 09:58 AM, Martin Sebor wrote: The warning suppression for -Wstringop-truncation looks for the next statement after a truncating strncpy to see if it adds a terminating nul. This only works when the next statement can be reached using the Gimple statement iterator which isn't until after gimplification. As a result, strncpy calls that truncate their constant argument that are being folded to memcpy this early get diagnosed even if they are followed by the nul assignment: const char s[] = "12345"; char d[3]; void f (void) { strncpy (d, s, sizeof d - 1); // -Wstringop-truncation d[sizeof d - 1] = 0; } To avoid the warning I propose to defer folding strncpy to memcpy until the pointer to the basic block the strnpy call is in can be used to try to reach the next statement (this happens as early as ccp1). I'm aware of the preference to fold things early but in the case of strncpy (a relatively rarely used function that is often misused), getting the warning right while folding a bit later but still fairly early on seems like a reasonable compromise. I fear that otherwise, the false positives will drive users to adopt other unsafe solutions (like memcpy) where these kinds of bugs cannot be as readily detected. Tested on x86_64-linux. Martin PS There still are outstanding cases where the warning can be avoided. I xfailed them in the test for now but will still try to get them to work for GCC 9. gcc-87028.diff PR tree-optimization/87028 - false positive -Wstringop-truncation strncpy with global variable source string gcc/ChangeLog: PR tree-optimization/87028 * gimple-fold.c (gimple_fold_builtin_strncpy): Avoid folding when statement doesn't belong to a basic block. * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Handle MEM_REF on the left hand side of assignment. gcc/testsuite/ChangeLog: PR tree-optimization/87028 * c-c++-common/Wstringop-truncation.c: Remove xfails. * gcc.dg/Wstringop-truncation-5.c: New test. diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c index 07341eb..284c2fb 100644 --- a/gcc/gimple-fold.c +++ b/gcc/gimple-fold.c @@ -1702,6 +1702,11 @@ gimple_fold_builtin_strncpy (gimple_stmt_iterator *gsi, if (tree_int_cst_lt (ssize, len)) return false; + /* Defer warning (and folding) until the next statement in the basic + block is reachable. */ + if (!gimple_bb (stmt)) +return false; I think you want cfun->cfg as the test here. They should be equivalent in practice. Please do not add 'cfun' references. Note that the next stmt is also accessible when there is no CFG. I guess the issue is that we fold this during gimplification where the next stmt is not yet "there" (but still in GENERIC)? That was my assumption. I almost suggested peeking at gsi_next and avoiding in that case. So I'd rather add guards to maybe_fold_stmt in the gimplifier then. So I think the concern with adding the guards to maybe_fold_stmt is the possibility of further fallout. I guess they could be written to target this case specifically to minimize fallout, but that feels like we're doing the same thing (band-aid) just in a different place. We generally do not want to have unfolded stmts in the IL when we can avoid that which is why we fold most stmts during gimplification. We also do that because we now do less folding on GENERIC. But an unfolded call in the IL should always be safe and we've got plenty of opportunities to fold it later. Well - we do. The very first one is forwprop though which means we'll miss to re-write some memcpy parts into SSA: NEXT_PASS (pass_ccp, false /* nonzero_p */); /* After CCP we rewrite no longer addressed locals into SSA form if possible. */ NEXT_PASS (pass_forwprop); likewise early object-size will be confused by memcpy calls that just exist to avoid TBAA issues (another of our recommendations besides using unions). We do fold mem* early for a reason ;) "We can always do warnings earlier" would be a similar true sentence. I'm not disagreeing at all. There's a natural tension between the benefits of folding early to enable more optimizations downstream and leaving the IL in a state where we can give actionable warnings. Similar trade-offs between folding early and lo
Updated version of vartrace pass
First patch is x86 specific for Uros, the rest is the updated generic middle end implementation. Support automatic data tracing using the PTWRITE instruction on Intel CPUs. This is an updated version addressing (nearly) all the excellent review comments. It ended up being large scale changes in the pass, so the code is quite different now. It now instruments more cases and has many bug fixes. Still open issues (hopefully some can be addressed in later phases): - There are still (and even more now) redundant PTWRITEs. - The tracing only accepts int and long, but not short and char or bitfields. - The vartrace pass still requires a full SSA rebuild if it changes anything in a function. I tried to fix it, but was not successfull. - There are some holes in tracing globals/local reads for function arguments. - Copies and memsets are not traced. - The store tracing re-reads the target memory, so could suffer from data races. This is needed for now to get the debug information of the store right. Later this could be improved to use the input of the store instead, but would require some changes to var tracing. Passed boot strap and testing with PTWRITE enabled in the compiler and also some fuzz testing with csmith. Ok to apply now? -Andi
[PATCH v2 1/3] Allow memory operands for PTWRITE
From: Andi Kleen The earlier PTWRITE builtin definition was unnecessarily restrictive, only allowing register input to PTWRITE. The instruction actually supports memory operands too, so allow that too. gcc/: 2018-11-15 Andi Kleen * config/i386/i386.md: Allow memory operands to ptwrite. --- gcc/config/i386/i386.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 44db8ac954c..9c359c0ca04 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -19501,7 +19501,7 @@ (set_attr "prefix_extra" "2")]) (define_insn "ptwrite" - [(unspec_volatile [(match_operand:SWI48 0 "register_operand" "r")] + [(unspec_volatile [(match_operand:SWI48 0 "nonimmediate_operand" "rm")] UNSPECV_PTWRITE)] "TARGET_PTWRITE" "ptwrite\t%0" -- 2.19.1
[PATCH v2 3/3] Add tests for the vartrace pass
From: Andi Kleen So far they are mostly i386 target specific. Later they could be moved up to architecture specific if some other architecture adds vartracing. gcc/testsuite/: 2018-11-15 Andi Kleen * g++.dg/vartrace-3.C: New test. * g++.dg/vartrace-ret.C: New test. * g++.dg/vartrace-ret2.C: New test. * gcc.target/i386/vartrace-1.c: New test. * gcc.target/i386/vartrace-10.c: New test. * gcc.target/i386/vartrace-11.c: New test. * gcc.target/i386/vartrace-12.c: New test. * gcc.target/i386/vartrace-13.c: New test. * gcc.target/i386/vartrace-14.c: New test. * gcc.target/i386/vartrace-15.c: New test. * gcc.target/i386/vartrace-16.c: New test. * gcc.target/i386/vartrace-17.c: New test. * gcc.target/i386/vartrace-18.c: New test. * gcc.target/i386/vartrace-19.c: New test. * gcc.target/i386/vartrace-20.c: New test. * gcc.target/i386/vartrace-2.c: New test. * gcc.target/i386/vartrace-3.c: New test. * gcc.target/i386/vartrace-4.c: New test. * gcc.target/i386/vartrace-5.c: New test. * gcc.target/i386/vartrace-6.c: New test. * gcc.target/i386/vartrace-7.c: New test. * gcc.target/i386/vartrace-8.c: New test. * gcc.target/i386/vartrace-9.c: New test. --- gcc/testsuite/g++.dg/vartrace-3.C | 14 +++ gcc/testsuite/g++.dg/vartrace-ret.C | 17 + gcc/testsuite/g++.dg/vartrace-ret2.C| 24 gcc/testsuite/gcc.target/i386/vartrace-1.c | 41 + gcc/testsuite/gcc.target/i386/vartrace-10.c | 13 +++ gcc/testsuite/gcc.target/i386/vartrace-11.c | 16 gcc/testsuite/gcc.target/i386/vartrace-12.c | 16 gcc/testsuite/gcc.target/i386/vartrace-13.c | 18 + gcc/testsuite/gcc.target/i386/vartrace-14.c | 17 + gcc/testsuite/gcc.target/i386/vartrace-15.c | 12 ++ gcc/testsuite/gcc.target/i386/vartrace-16.c | 12 ++ gcc/testsuite/gcc.target/i386/vartrace-17.c | 23 gcc/testsuite/gcc.target/i386/vartrace-18.c | 20 ++ gcc/testsuite/gcc.target/i386/vartrace-19.c | 22 +++ gcc/testsuite/gcc.target/i386/vartrace-2.c | 9 + gcc/testsuite/gcc.target/i386/vartrace-20.c | 23 gcc/testsuite/gcc.target/i386/vartrace-3.c | 9 + gcc/testsuite/gcc.target/i386/vartrace-4.c | 13 +++ gcc/testsuite/gcc.target/i386/vartrace-5.c | 11 ++ gcc/testsuite/gcc.target/i386/vartrace-6.c | 13 +++ gcc/testsuite/gcc.target/i386/vartrace-7.c | 11 ++ gcc/testsuite/gcc.target/i386/vartrace-8.c | 11 ++ gcc/testsuite/gcc.target/i386/vartrace-9.c | 10 + gcc/testsuite/gcc.target/vartrace-19.c | 23 24 files changed, 398 insertions(+) create mode 100644 gcc/testsuite/g++.dg/vartrace-3.C create mode 100644 gcc/testsuite/g++.dg/vartrace-ret.C create mode 100644 gcc/testsuite/g++.dg/vartrace-ret2.C create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-1.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-10.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-11.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-12.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-13.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-14.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-15.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-16.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-17.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-18.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-19.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-2.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-20.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-3.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-4.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-5.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-6.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-7.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-8.c create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-9.c create mode 100644 gcc/testsuite/gcc.target/vartrace-19.c diff --git a/gcc/testsuite/g++.dg/vartrace-3.C b/gcc/testsuite/g++.dg/vartrace-3.C new file mode 100644 index 000..217db297baa --- /dev/null +++ b/gcc/testsuite/g++.dg/vartrace-3.C @@ -0,0 +1,14 @@ +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ +/* { dg-options "-O2 -mptwrite -fvartrace=args " } */ +/* { dg-final { scan-assembler "ptwrite" } } */ + +int a; +int b(int c) +{ + if (a) +c += 1; + else +c += b(a); + b(c); + return 0; +} diff --git a/gcc/testsuite/g++.dg/vartrace-ret.C b/gcc/testsuite/g++.dg/vartrace-ret.C new file mode 100644 index 000..5824e3f3738 --- /dev/null +++ b/gcc/testsuite/g++.dg/vartrace-ret.C @@ -0,0 +1,17 @@ +/* {
[PATCH v2 2/3] Add a pass to automatically add ptwrite instrumentation
From: Andi Kleen Add a new pass to automatically instrument changes to variables with the new PTWRITE instruction on x86. PTWRITE writes a 4 or 8 byte field into an Processor Trace log, which allows low over head logging of information. Essentially it's a hardware accelerated printf. This allows to reconstruct how values later from the log, which can be useful for debugging or other analysis of the program behavior. With the compiler support this can be done with without having to manually add instrumentation to the code. Using dwarf information this can be later mapped back to the variables. The decoder decodes the PTWRITE instructions using IP information in the log, and then looks up the argument in the debug information. Then this can be used to reconstruct the original variable name to display a value history for the variable. There are new options to enable instrumentation for different types, and also a new attribute to control analysis fine grained per function or variable level. The attributes can be set on both the variable and the type level, and also on structure fields. This allows to enable tracing only for specific code in large programs in a flexible matter. The pass is generic, but only the x86 backend enables the necessary hooks. When the backend enables the necessary hooks (with -mptwrite) there is an additional pass that looks through the code for attribute vartrace enabled functions or variables. The -fvartrace=locals option is experimental: it works, but it generates redundant ptwrites because the pass doesn't use the SSA information to minimize instrumentation. This could be optimized later. Currently the code can be tested with SDE, or on a Intel Gemini Lake system with a new enough Linux kernel (v4.10+) that supports PTWRITE for PT. Gemini Lake is used in low end laptops ("Intel Pentium Silver J5.. / Celeron N4... / Celeron J4...") Linux perf can be used to record the values perf record -e intel_pt/ptw=1,branch=0/ program perf script --itrace=crw -F +synth ... I have an experimential version of perf that can also use dwarf information to symbolize many[1] values back to their variable names. So far it is not in standard perf, but available at https://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git/log/?h=perf/var-resolve-4 It is currently not able to decode all variable locations to names, but a large subset. Longer term hopefully gdb will support this information too. The CPU can potentially generate very data high bandwidths when code doing a lot of computation is heavily instrumented. This can cause some data loss in both the CPU and also in perf logging the data when the disk cannot keep up. Running some larger workloads most workloads do not cause CPU level overflows, but I've seen it with -fvartrace with crafty, and with more workloads with -fvartrace-locals. Recommendation is to not fully instrument programs, but only areas of interest either at the file level or using the attributes. The other thing is that perf and the disk often cannot keep up with the data bandwidth for longer computations. In this case it's possible to use perf snapshot mode (add --snapshot to the command line above). The data will be only logged to a memory ring buffer then, and only dump the buffers on events of interest by sending SIGUSR2 to the perf binrary. In the future this will be hopefully better supported with core files and gdb. Passes bootstrap and test suite on x86_64-linux, also bootstrapped and tested gcc itself with full -fvartrace and -fvartrace=locals instrumentation. gcc/: 2018-11-15 Andi Kleen * Makefile.in: Add tree-vartrace.o. * common.opt: Add -fvartrace * opts.c (parse_vartrace_options): Add. (common_handle_option): Call parse_vartrace_options. * config/i386/i386.c (ix86_vartrace_func): Add. (TARGET_VARTRACE_FUNC): Add. * doc/extend.texi: Document vartrace/no_vartrace attributes. * doc/invoke.texi: Document -fvartrace. * doc/tm.texi (TARGET_VARTRACE_FUNC): Add. * passes.def: Add vartrace pass. * target.def (vartrace_func): Add. * tree-pass.h (make_pass_vartrace): Add. * tree-vartrace.c: New file to implement vartrace pass. gcc/c-family/: 2018-11-15 Andi Kleen * c-attribs.c (handle_vartrace_attribute, handle_no_vartrace_attribute): New functions. (attr_vartrace_exclusions): Add. config/: 2018-11-03 Andi Kleen * bootstrap-vartrace.mk: New. * bootstrap-vartrace-locals.mk: New. --- config/bootstrap-vartrace-locals.mk | 3 + config/bootstrap-vartrace.mk| 3 + gcc/Makefile.in | 1 + gcc/c-family/c-attribs.c| 77 + gcc/common.opt | 8 + gcc/config/i386/i386.c | 32 ++ gcc/doc/extend.texi | 32 ++ gcc/doc/invoke.texi | 27 ++ gcc/doc/tm.texi |
Re: [PATCH] minor FDO profile related fixes
On 11/12/2018 01:48 AM, Martin Liška wrote: make check-gcc on x86_64 shows no new failures. (A related PR washttps://gcc.gnu.org/bugzilla/show_bug.cgi?id=86957 where we added diagnostics for the NO PROFILE case.) Hi. Thanks for the patch. I'm not a maintainer, but the idea of the patch looks correct to me. One question about adding "(precise)", have you verified that the patch can survive regression tests? Thanks, Martin make check-gcc (configured with --enable-languages=c,c++,fortran,go,lto) shows no new failures. make -k check also looks ok.
Re: Bug 52869 - [DR 1207] "this" not being allowed in noexcept clauses
Thank you Marek,Appreciate your valuable feedback on the patch . Attached the latest ,please do let us know your thoughts. ~Umesh On Thu, Nov 15, 2018 at 9:26 PM Marek Polacek wrote: > > On Thu, Nov 15, 2018 at 02:26:24PM +0530, Umesh Kalappa wrote: > > Thank you Marek for the inputs . > > >>In the future, if using diff, please also use the -p option. > > We are using svn diif and other comments are addressed . > > Thanks, but it doesn't seem like the -p option was used. > > > please let us know your take on the revised attached patch . > > > Index: cp/ChangeLog > === > --- cp/ChangeLog(revision 266026) > +++ cp/ChangeLog(working copy) > @@ -1,3 +1,9 @@ > +2018-11-14 Kamlesh Kumar > + > + PR c++/52869 > + *parser.c () : restore the old current_class_{ptr,ref} by > + inject_this_parameter(). > + > > This is still the same; can you adjust it according to my last suggestion? > > Index: cp/parser.c > === > --- cp/parser.c (revision 266026) > +++ cp/parser.c (working copy) > @@ -24620,6 +24620,12 @@ > { > matching_parens parens; > parens.consume_open (parser); > + > + tree save_ccp = current_class_ptr; > + tree save_ccr = current_class_ref; > + > > Watch out for trailing whitespace in the blank lines. > > + if (current_class_type) > + inject_this_parameter (current_class_type, TYPE_UNQUALIFIED); > > I think you can remove the if here. > > if (require_constexpr) > { > @@ -24640,6 +24646,9 @@ > } > > parens.require_close (parser); > + > + save_ccp = current_class_ptr = save_ccp; > + save_ccr = current_class_ref = save_ccr; > > You don't need to set save_cc[pr] to itself here. > > Index: testsuite/ChangeLog > === > --- testsuite/ChangeLog (revision 266026) > +++ testsuite/ChangeLog (working copy) > @@ -1,3 +1,8 @@ > +2018-11-14 Kamlesh Kumar > + > + PR c++/52869 > + * g++.dg//DRs/dr52869.C: New. > + > > So DR != PR. Please name the test dr1207.C > > Index: testsuite/g++.dg/DRs/dr52869.C > === > --- testsuite/g++.dg/DRs/dr52869.C (nonexistent) > +++ testsuite/g++.dg/DRs/dr52869.C (working copy) > @@ -0,0 +1,22 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O0 -std=c++11" } */ > > Instead of this, do: > > // { dg-do compile { target c++11 } } > > Also, I wrote a test that fails if current_class_{ptr,ref} aren't properly > restored: > > // DR 1207 > // PR c++/52869 > // { dg-do compile { target c++11 } } > > void > fn () > { > struct S { > bool operator!() noexcept(false); > } s; > S t = s; > } > > So you can add that one too, e.g. testsuite/g++.dg/DRs/dr1207-2.C. > > Thanks, > Marek pr52869.patch Description: Binary data
Re: [PATCH v2 1/3] Allow memory operands for PTWRITE
On Fri, Nov 16, 2018 at 4:57 AM Andi Kleen wrote: > > From: Andi Kleen > > The earlier PTWRITE builtin definition was unnecessarily restrictive, > only allowing register input to PTWRITE. The instruction actually > supports memory operands too, so allow that too. > > gcc/: > > 2018-11-15 Andi Kleen > > * config/i386/i386.md: Allow memory operands to ptwrite. OK. Thanks, Uros. > --- > gcc/config/i386/i386.md | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md > index 44db8ac954c..9c359c0ca04 100644 > --- a/gcc/config/i386/i386.md > +++ b/gcc/config/i386/i386.md > @@ -19501,7 +19501,7 @@ > (set_attr "prefix_extra" "2")]) > > (define_insn "ptwrite" > - [(unspec_volatile [(match_operand:SWI48 0 "register_operand" "r")] > + [(unspec_volatile [(match_operand:SWI48 0 "nonimmediate_operand" "rm")] > UNSPECV_PTWRITE)] >"TARGET_PTWRITE" >"ptwrite\t%0" > -- > 2.19.1 >