Re: [match.pd] Fix for PR35691
On Thu, 3 Nov 2016, Marc Glisse wrote: > On Thu, 3 Nov 2016, Richard Biener wrote: > > > > > > The transform would also work for vectors (element_precision for > > > > > the test but also a value-matching zero which should ensure the > > > > > same number of elements). > > > > Um sorry, I didn't get how to check vectors to be of equal length by a > > > > matching zero. > > > > Could you please elaborate on that ? > > > > > > He may have meant something like: > > > > > > (op (cmp @0 integer_zerop@2) (cmp @1 @2)) > > > > I meant with one being @@2 to allow signed vs. Unsigned @0/@1 which was the > > point of the pattern. > > Oups, that's what I had written first, and then I somehow managed to confuse > myself enough to remove it so as to remove the call to types_match :-( > > > > So the last operand is checked with operand_equal_p instead of > > > integer_zerop. But the fact that we could compute bit_ior on the > > > comparison results should already imply that the number of elements is the > > > same. > > > > Though for equality compares we also allow scalar results IIRC. > > Oh, right, I keep forgetting that :-( And I have no idea how to generate one > for a testcase, at least until the GIMPLE FE lands... > > > > On platforms that have IOR on floats (at least x86 with SSE, maybe some > > > vector mode on s390?), it would be cool to do the same for floats (most > > > likely at the RTL level). > > > > On GIMPLE view-converts could come to the rescue here as well. Or we cab > > just allow bit-and/or on floats as much as we allow them on pointers. > > Would that generate sensible code on targets that do not have logic insns for > floats? Actually, even on x86_64 that generates inefficient code, so there > would be some work (for instance grep finds no gen_iordf3, only gen_iorv2df3). > > I am also a bit wary of doing those obfuscating optimizations too early... > a==0 is something that other optimizations might use. long > c=(long&)a|(long&)b; (double&)c==0; less so... > > (and I am assuming that signaling NaNs don't make the whole transformation > impossible, which might be wrong) Yeah. I also think it's not so much important - I just wanted to mention vectors... Btw, I still think we need a more sensible infrastructure for passes to gather, analyze and modify complex conditions. (I'm always pointing to tree-affine.c as an, albeit not very good, example for handling a similar problem) Richard.
Re: [PATCH GCC]Clean pedantic calls and useless lvalue code in fold_cond_expr_with_comparison
On Thu, Nov 3, 2016 at 3:11 PM, Bin Cheng wrote: > Hi, > According to analysis given by > https://gcc.gnu.org/ml/gcc/2016-10/msg00228.html, calls to > pedantic_non_lvalue_loc and code handling lvalue in > fold_cond_expr_with_comparison are useless now. Given this is complicated > legacy code, it may be better to change code step by step, rather than doing > this cleanup together with moving simplification from > fold_cond_expr_with_comparison to match.pd. > BTW, after last cleanup of pedantic_lvalues, function pedantic_non_lvalue_loc > now has nothing to do with lvalue. It could be further cleaned up, or at > least renamed into something else. This patch doesn't do that because that > depends on the answer to the question of the aforementioned message. > > Bootstrap and test on x86_64 and AArch64. Any comments? Ok. Note removal of [pedantic_]non_lvalue can at most result in accepting invalid code where we might not have any testsuite coverage. For the 2nd case with /* Avoid adding NOP_EXPRs in case this is an lvalue. */ and C++ lvalue ?: I'm not sure we have testsuite coverage given Jason failed to add a testcase when adding the code in r34416. Thanks, Richard. > Thanks, > bin > > 2016-10-27 Bin Cheng > > * fold-const.c (fold_cond_expr_with_comparison): Remove call > to pedantic_non_lvalue_loc. Remove useless code for lvalue > where cond_expr can't be a lvalue.
Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)
On Thu, Nov 3, 2016 at 4:46 PM, Segher Boessenkool wrote: > On Thu, Nov 03, 2016 at 09:29:07AM +0100, Richard Biener wrote: >> On Wed, Nov 2, 2016 at 4:27 PM, Segher Boessenkool >> wrote: >> > On Wed, Nov 02, 2016 at 02:51:41PM +0100, Richard Biener wrote: >> >> >> I don't believe it needs a cleanup on every iteration. One cleanup at >> >> >> the end should work fine. >> >> > >> >> > But as the comment there says: >> >> > >> >> > /* Merge the duplicated blocks into predecessors, when possible. >> >> > */ >> >> > cleanup_cfg (0); >> >> > >> >> > (this is not a new comment), and without merging blocks this whole >> >> > patch does zilch. >> >> > >> >> > There is no need to create new basic blocks at all (can replace the >> >> > final branch in a block with a copy of the whole block it jumps to, >> >> > instead); and then it is painfully obvious that switching to layout >> >> > mode here is pointless, too. >> >> > >> >> > Do you want me to do that? >> >> > >> >> > Btw, this isn't quadratic anyway; it is a constant number (the maximal >> >> > length allowed of the block to copy) linear. Unless there are >> >> > unboundedly >> >> > many forwarder blocks, which there shouldn't be, but I cannot prove >> >> > that. >> >> >> >> Well, you iterate calling functions (find candidates, merge, cleanup-cfg) >> >> that >> >> walk the whole function. So unless the number of iterations is bound >> >> with a constant I call this quadratic (in function size). >> > >> > It is bounded (with that caveat above): new candidates will be bigger than >> > the block merged into it, so there won't be more than >> > PARAM_MAX_GOTO_DUPLICATION_INSNS passes. >> > >> > But I can make it all work simpler, in non-layout mode, if you prefer. >> >> Iteration is fine but we should restrict where we look for new >> candidates. Likewise >> block merging opportunities should be easier to exploit than by >> running cfg-cleanup... > > Yes, but that was what the original code does already, so I didn't look > deeper. "It was just a simple bugfix". > >> I'm just thinking of those gigantic machine-generated state machines where we >> ATM do quite well (at least at -O1 ...) with respect to compile-time. > > Like I said, I tested it on a 2000 node VM, very artificial so almost > no code except the threading itself, and the compgotos pass takes less > than 1% time both before and after the patch. I ony tested at -O2 though. I see. > I'll get back to it, but first I have some things that need handling during > stage 1. Fair enough. Don't consider my comments blocking the patch. Richard. > > Segher
Re: [RFC] Check number of uses in simplify_cond_using_ranges().
On Thu, Nov 3, 2016 at 4:03 PM, Dominik Vogt wrote: > I've been trying to fix some bad tree-ssa related optimisation for > s390x and come up with the attached experimental patch. The patch > is not really good - it breaks some situations in which the > optimisation was useful. With this code: > > void bar(long); > void foo (char a) > { > long l; > char b; > > b = a & 63; > l = b; > if (l > 9) > bar(l); > } > > We get this representation before value range propagation: > > ... > a_4 = *p_3(D); > b_5 = a_4 & 63; > l_6 = (long int) b_5; > if (l_6 > 9) > ... > > Now, there's some code in tree-vrp.c:simplify_cond_using_ranges() > that folds b_5 into the if condition, because l_6 is just a sign > extension of b_5, and the value range of l_6 can also be > represented by the type of b (char). > > if (b_5 > 9) > > (On s390x we end up with "a & 63" stored in two separate > registers, extended to 32 bits in one and to 64 bits in the other, > adding up to two unnecessary instructions.) > > A naive idea to prevent folding in this situation was to suppress > it if it would introduce a second use of b_5 (i.e. b_5 was only > used in the cast before) while not eliminating all uses of l_6. > However, calling has_single_use() for both purposes proves to be > not good enough, and VRP does not do this kind of optimisation > yet. It does not catch cases like > > if (l_6 > 9) > ... > else if (l_6 > 7) > ... > > where all occurences of l_6 could be replaced, and simply looking > at the use counts is too coarse. > > -- > > Is VRP the right pass to do this optimisation or should a later > pass rather attempt to eliminate the new use of b_5 instead? Uli > has brought up the idea a mini "sign extend elimination" pass that > checks if the result of a sign extend could be replaced by the > original quantity in all places, and if so, eliminate the ssa > name. (I guess that won't help with the above code because l is > used also as a function argument.) How could a sensible approach > to deal with the situation look like? We run into this kind of situation regularly and for general foldings in match.pd we settled with single_use () even though it is not perfect. Note the usual complaint is not extra extension instructions but the increase of register pressure. This is because it is hard to do better when you are doing local optimization. As for the question on whether VRP is the right pass to do this the answer is two-fold -- VRP has the most precise range information. But the folding itself should be moved to generic code and use get_range_info (). Richard. > Ciao > > Dominik ^_^ ^_^ > > -- > > Dominik Vogt > IBM Germany >
Re: [patch] Clean up WORD_REGISTER_OPERATIONS & LOAD_EXTEND_OP tests
On Thu, Nov 3, 2016 at 9:56 PM, Eric Botcazou wrote: > Hi, > > WORD_REGISTER_OPERATIONS and LOAD_EXTEND_OP are partially used directly as > preprocessor macros and partially tested in the code. This patch brings a bit > of consistency into this by converting the remaining macro cases. > > Tested on SPARC/Solaris and x86-64/Linux, OK for the mainline? Ok. Richard. > > 2016-11-03 Eric Botcazou > > * defaults.h (LOAD_EXTEND_OP): Define if not already defined. > * combine.c (LOAD_EXTEND_OP): Delete. > (simplify_comparison): Fix comment on LOAD_EXTEND_OP. > * cse.c (LOAD_EXTEND_OP): Delete. > * fold-const.c (LOAD_EXTEND_OP): Likewise. > * fwprop.c (free_load_extend): Remove #ifdef LOAD_EXTEND_OP/#endif. > * postreload.c (LOAD_EXTEND_OP): Delete. > * reload.c (push_reload): Remove #ifdef LOAD_EXTEND_OP/#endif. > Convert conditional compilation based on WORD_REGISTER_OPERATIONS. > (find_reloads): Likewise. > * reload1.c (eliminate_regs_1): Likewise. > * rtlanal.c (nonzero_bits1): Remove #ifdef LOAD_EXTEND_OP/#endif. > (num_sign_bit_copies1): Likewise. > > -- > Eric Botcazou
Re: [match.pd] Fix for PR35691
On 4 November 2016 at 13:41, Richard Biener wrote: > On Thu, 3 Nov 2016, Marc Glisse wrote: > >> On Thu, 3 Nov 2016, Richard Biener wrote: >> >> > > > > The transform would also work for vectors (element_precision for >> > > > > the test but also a value-matching zero which should ensure the >> > > > > same number of elements). >> > > > Um sorry, I didn't get how to check vectors to be of equal length by a >> > > > matching zero. >> > > > Could you please elaborate on that ? >> > > >> > > He may have meant something like: >> > > >> > > (op (cmp @0 integer_zerop@2) (cmp @1 @2)) >> > >> > I meant with one being @@2 to allow signed vs. Unsigned @0/@1 which was the >> > point of the pattern. >> >> Oups, that's what I had written first, and then I somehow managed to confuse >> myself enough to remove it so as to remove the call to types_match :-( >> >> > > So the last operand is checked with operand_equal_p instead of >> > > integer_zerop. But the fact that we could compute bit_ior on the >> > > comparison results should already imply that the number of elements is >> > > the >> > > same. >> > >> > Though for equality compares we also allow scalar results IIRC. >> >> Oh, right, I keep forgetting that :-( And I have no idea how to generate one >> for a testcase, at least until the GIMPLE FE lands... >> >> > > On platforms that have IOR on floats (at least x86 with SSE, maybe some >> > > vector mode on s390?), it would be cool to do the same for floats (most >> > > likely at the RTL level). >> > >> > On GIMPLE view-converts could come to the rescue here as well. Or we cab >> > just allow bit-and/or on floats as much as we allow them on pointers. >> >> Would that generate sensible code on targets that do not have logic insns for >> floats? Actually, even on x86_64 that generates inefficient code, so there >> would be some work (for instance grep finds no gen_iordf3, only >> gen_iorv2df3). >> >> I am also a bit wary of doing those obfuscating optimizations too early... >> a==0 is something that other optimizations might use. long >> c=(long&)a|(long&)b; (double&)c==0; less so... >> >> (and I am assuming that signaling NaNs don't make the whole transformation >> impossible, which might be wrong) > > Yeah. I also think it's not so much important - I just wanted to mention > vectors... > > Btw, I still think we need a more sensible infrastructure for passes > to gather, analyze and modify complex conditions. (I'm always pointing > to tree-affine.c as an, albeit not very good, example for handling > a similar problem) Thanks for mentioning the value-matching capture @@, I wasn't aware of this match.pd feature. The current patch keeps it restricted to only bitwise operators on integers. Bootstrap+test running on x86_64-unknown-linux-gnu. OK to commit if passes ? Thanks, Prathamesh > > Richard. 2016-11-04 Prathamesh Kulkarni PR middle-end/35691 * match.pd: Add following two patterns: (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0. (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0. testsuite/ * gcc.dg/pr35691-1.c: New test-case. * gcc.dg/pr35691-2.c: Likewise. * gcc.dg/pr35691-3.c: Likewise. * gcc.dg/pr35691-4.c: Likewise. diff --git a/gcc/match.pd b/gcc/match.pd index 48f7351..4f74942 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -519,6 +519,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (TYPE_UNSIGNED (type)) (bit_and @0 (bit_not (lshift { build_all_ones_cst (type); } @1) +/* PR35691: Transform + (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0. + (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0. */ + +(for bitop (bit_and bit_ior) + cmp (eq ne) + (simplify + (bitop (cmp @0 integer_zerop) (cmp @1 integer_zerop)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) + && INTEGRAL_TYPE_P (TREE_TYPE (@1)) + && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@1))) +(cmp (bit_ior @0 (convert @1)) { build_zero_cst (TREE_TYPE (@0)); } + /* Fold (A & ~B) - (A & B) into (A ^ B) - B. */ (simplify (minus (bit_and:cs @0 (bit_not @1)) (bit_and:cs @0 @1)) diff --git a/gcc/testsuite/gcc.dg/pr35691-1.c b/gcc/testsuite/gcc.dg/pr35691-1.c new file mode 100644 index 000..5211f815 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr35691-1.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-forwprop-details" } */ + +int foo(int z0, unsigned z1) +{ + int t0 = (z0 == 0); + int t1 = (z1 == 0); + int t2 = (t0 && t1); + return t2; +} + +/* { dg-final { scan-tree-dump "gimple_simplified to _\[0-9\]* = \\(int\\) z1_\[0-9\]*\\(D\\);" "forwprop1" } } */ diff --git a/gcc/testsuite/gcc.dg/pr35691-2.c b/gcc/testsuite/gcc.dg/pr35691-2.c new file mode 100644 index 000..90cbf6d --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr35691-2.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-forwprop-details" } */ + +int foo(int z0, unsigned z1) +{ + int t0 = (z0 != 0); + int t1 = (z1 != 0); + i
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
On 11/02/2016 03:35 PM, Jakub Jelinek wrote: > On Wed, Nov 02, 2016 at 03:27:42PM +0100, Martin Liška wrote: >>> So is there anything I should do wrt -Wswitch-unreachable? >>> >>> Marek >>> >> >> Probably not. I'm having a patch puts GIMPLE_SWITCH statement to a proper >> place >> in GIMPLE_BIND. Let's see whether such patch can bootstrap and survive >> regression >> tests. > > Please do that only for -fsanitize-use-after-scope, it will likely affect at > least for -O0 the debugging experience. For -O0 -fsanitize=address > -fsanitize-use-after-scope > perhaps we could arrange for some extra stmt to have the locus of the > switch (where we still don't want the vars to appear in scope) and then > have no locus on the ASAN_MARK and actual GIMPLE_SWITCH or something > similar. > > Jakub > I'm sending patch where I put gimple switch statement to a place where all BIND_EXPR vars are unpoisoned. I'm sending diff to a previous version and new version of the patch. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Apart from that, asan bootstrap successfully finished on x86_64-linux-gnu. Martin >From 742ba5e3f5eddb069a13bc51347d21a4ec6a45c6 Mon Sep 17 00:00:00 2001 From: marxin Date: Tue, 3 May 2016 15:35:22 +0200 Subject: [PATCH 1/2] Introduce -fsanitize-address-use-after-scope gcc/c-family/ChangeLog: 2016-10-27 Martin Liska * c-warn.c (warn_for_unused_label): Save all labels used in goto or in &label. gcc/ChangeLog: 2016-10-27 Martin Liska * asan.c (enum asan_check_flags): Move the enum to header file. (asan_init_shadow_ptr_types): Make type creation more generic. (shadow_mem_size): New function. (asan_emit_stack_protection): Use newly added ASAN_SHADOW_GRANULARITY. Rewritten stack unpoisoning code. (build_shadow_mem_access): Add new argument return_address. (instrument_derefs): Instrument local variables if use after scope sanitization is enabled. (asan_store_shadow_bytes): New function. (asan_expand_mark_ifn): Likewise. (asan_sanitize_stack_p): Moved from asan_sanitize_stack_p. * asan.h (enum asan_mark_flags): Moved here from asan.c (asan_protect_stack_decl): Protect all declaration that need to live in memory. (asan_sanitize_use_after_scope): New function. (asan_no_sanitize_address_p): Likewise. * cfgexpand.c (partition_stack_vars): Consider asan_sanitize_use_after_scope in condition. (expand_stack_vars): Likewise. * common.opt (-fsanitize-address-use-after-scope): New option. * doc/invoke.texi (use-after-scope-direct-emission-threshold): Explain the parameter. * flag-types.h (enum sanitize_code): Define SANITIZE_USE_AFTER_SCOPE. * gimplify.c (build_asan_poison_call_expr): New function. (asan_poison_variable): Likewise. (gimplify_bind_expr): Generate poisoning/unpoisoning for local variables that have address taken. (gimplify_decl_expr): Likewise. (gimplify_target_expr): Likewise for C++ temporaries. (sort_by_decl_uid): New function. (gimplify_expr): Unpoison all variables for a label we can jump from outside of a scope. (gimplify_function_tree): Clear asan_poisoned_variables. * internal-fn.c (expand_ASAN_MARK): New function. * internal-fn.def (ASAN_MARK): Declare. * opts.c (finish_options): Handle -fstack-reuse if -fsanitize-address-use-after-scope is enabled. (common_handle_option): Enable address sanitization if -fsanitize-address-use-after-scope is enabled. (warn_switch_unreachable_r): Return iterator instead of gimple *. (maybe_warn_switch_unreachable): Release the iterator. (gimplify_switch_expr): In case of enabled sanitization, put switch gimple statement to a place where all variables in BIND_EXPR are sanitized. * params.def (PARAM_USE_AFTER_SCOPE_DIRECT_EMISSION_THRESHOLD): New parameter. * params.h: Likewise. * sancov.c (pass_sanopt::execute): Handle IFN_ASAN_MARK. sanitizer.def: Define __asan_poison_stack_memory and __asan_unpoison_stack_memory functions. * tree-eh.c (verify_norecord_switch_expr): Verify switch expr only when not doing sanitization. --- gcc/asan.c| 287 +- gcc/asan.h| 66 ++-- gcc/c-family/c-warn.c | 9 +- gcc/cfgexpand.c | 18 +--- gcc/common.opt| 3 + gcc/doc/invoke.texi | 15 ++- gcc/gimplify.c| 239 + gcc/internal-fn.c | 9 ++ gcc/internal-fn.def | 1 + gcc/opts.c| 27 - gcc/params.def| 6 ++ gcc/params.h | 2 + gcc/sanitizer.def | 4 + gcc/sanopt.c | 3 + gcc/tree-eh.c | 4 + 15 files changed, 592 insertions(+), 101 deletions(-) diff --git a/gcc/asan.c b/gcc/asan.c index c6d9240..95495d2 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -245,6 +245,13 @@ static unsigned HOST_WIDE_INT asan_shadow_offset_value; static bool asan_shadow_offset_computed; static vec sanitized_sections; +/* Set of variable declarations that are going to be guarded
Re: [PATCH] Make direct emission of time profiler counter
On 3 November 2016 at 16:11, Jan Hubicka wrote: >> >> 2016-10-31 Martin Liska >> >> * libgcov-profiler.c (__gcov_time_profiler): Remove. >> (__gcov_time_profiler_atomic): Likewise. >> >> gcc/ChangeLog: >> >> 2016-10-31 Martin Liska >> >> * profile.c (instrument_values): Fix coding style. >> (branch_prob): Use renamed function. >> * tree-profile.c (init_ic_make_global_vars): Likewise. >> (gimple_init_edge_profiler): Rename to >> gimple_init_gcov_profiler. >> tree_time_profiler_counter variable declaration. >> (gimple_gen_time_profiler): Rewrite to do a direct gimple code >> emission. >> * value-prof.h: Remove an argument. >> >> gcc/testsuite/ChangeLog: >> >> 2016-11-03 Martin Liska >> >> * gcc.dg/no_profile_instrument_function-attr-1.c: Update scanned >> output. >> * gcc.dg/tree-prof/time-profiler-3.c: New test. > > OK, > Thanks! > Honza Hi, It seems this patch causes an ICE when compiling gcc.dg/gomp/pr27573.c for instance on arm-linux-gnueabi --wtih-cpu=cortex-a9 The backtrace is: /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/gomp/pr27573.c: In function 'main._omp_fn.0': /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/gomp/pr27573.c:12:9: internal compiler error: in convert_memory_address_addr_sp ace_1, at explow.c:284 0x79b19e convert_memory_address_addr_space_1(machine_mode, rtx_def*, unsigned char, bool, bool) /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/explow.c:284 0x670903 get_builtin_sync_mem /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/builtins.c:4933 0x671fef expand_builtin_atomic_fetch_op /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/builtins.c:5452 0x674aa4 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int) /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/builtins.c:6859 0x7b729a expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool) /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:10773 0x7c1460 store_expr_with_bounds(tree_node*, rtx_def*, int, bool, bool, tree_node*) /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5551 0x7c7451 expand_assignment(tree_node*, tree_node*, bool) /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5317 0x69a0de expand_call_stmt /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:2666 0x69b264 expand_gimple_stmt_1 /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3581 0x69b264 expand_gimple_stmt /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3747 0x69cc8a expand_gimple_basic_block /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:5754 0x69fefe execute /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:6368 Please submit a full bug report, Christophe
Re: [PATCHv2 5/7, GCC, ARM, V8M] Handling ARMv8-M Security Extension's cmse_nonsecure_call attribute
Hi Andre, On 25/10/16 17:29, Andre Vieira (lists) wrote: On 24/08/16 12:01, Andre Vieira (lists) wrote: On 25/07/16 14:25, Andre Vieira (lists) wrote: This patch adds support for the ARMv8-M Security Extensions 'cmse_nonsecure_call' attribute. This attribute may only be used for function types and when used in combination with the '-mcmse' compilation flag. See Section 5.5 of ARM®v8-M Security Extensions (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html). We currently do not support cmse_nonsecure_call functions that pass arguments or return variables on the stack and we diagnose this. *** gcc/ChangeLog *** 2016-07-25 Andre Vieira Thomas Preud'homme * config/arm/arm.c (gimplify.h): New include. (arm_handle_cmse_nonsecure_call): New. (arm_attribute_table): Added cmse_nonsecure_call. *** gcc/testsuite/ChangeLog *** 2016-07-25 Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse-3.c: Add tests. * gcc.target/arm/cmse/cmse-4.c: Add tests. Added more documentation as requested. --- This patch adds support for the ARMv8-M Security Extensions 'cmse_nonsecure_call' attribute. This attribute may only be used for function types and when used in combination with the '-mcmse' compilation flag. See Section 5.5 of ARM®v8-M Security Extensions (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html). We currently do not support cmse_nonsecure_call functions that pass arguments or return variables on the stack and we diagnose this. *** gcc/ChangeLog *** 2016-07-xx Andre Vieira Thomas Preud'homme * config/arm/arm.c (gimplify.h): New include. (arm_handle_cmse_nonsecure_call): New. (arm_attribute_table): Added cmse_nonsecure_call. * doc/extend.texi (ARM ARMv8-M Security Extensions): New attribute. *** gcc/testsuite/ChangeLog *** 2016-07-xx Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse-3.c: Add tests. * gcc.target/arm/cmse/cmse-4.c: Add tests. Hi, Rebased previous patch on top of trunk as requested. No changes to ChangeLog. This looks ok to me. Thanks, Kyrill Cheers, Andre
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
Hi! On Fri, Nov 04, 2016 at 10:17:31AM +0100, Martin Liška wrote: > diff --git a/gcc/gimplify.c b/gcc/gimplify.c > index 813777d..86ce793 100644 > --- a/gcc/gimplify.c > +++ b/gcc/gimplify.c > @@ -1678,7 +1678,9 @@ warn_switch_unreachable_r (gimple_stmt_iterator *gsi_p, > bool *handled_ops_p, >worse location info. */ >if (gimple_try_eval (stmt) == NULL) > { > - wi->info = stmt; > + gimple_stmt_iterator *it = XNEW (gimple_stmt_iterator); > + memcpy (it, gsi_p, sizeof (gimple_stmt_iterator)); That would be cleaner as *it = *gsi_p; That set, I fail to see 1) the need to use a gsi pointer in wi->info compared to stmt itself, you can gsi_for_stmt cheaply at any time 2) why is anything done about this in warn_switch_unreachable_r - the problem isn't related to this warning IMHO. Even switch (x) { case 1: int x; x = 6; ptr = &x; break; case 2: ptr = &x; *ptr = 7; break; } has the same issue and there is no switch unreachable code there, but you still want for -fsanitize-use-after-scope pretend it is actually: x_tmp = x; { int x; switch (x_tmp) { case 1: x = 6; ptr = &x; break; case 2: ptr = &x; *ptr = 7; break; } } and put ASAN_MARK unpoisoning before GIMPLE_SWITCH. Jakub
Re: [patch,testsuite] Support dg-require-effective-target label_offsets.
On 04.11.2016 03:48, Mike Stump wrote: On Nov 3, 2016, at 8:25 AM, Georg-Johann Lay wrote: On 28.10.2016 17:58, Mike Stump wrote: On Oct 27, 2016, at 3:16 AM, Georg-Johann Lay wrote: Now imagine some arithmetic like &&LAB2 - &&LAB1. This might result in one or two stub addresses, and difference between such addresses is a complete different thing than the difference between the original labels: The result might differ in absolute value and in sign, i.e. you can't even determine whether LAB1 or LAB2 comes first in the generated code as the order of stubs might differ from the order of respective labels. So, I think this all doesn't matter any. Every address gs(LAB) fits in 16-bits by definition, and every gs(LAB1) - gs(LAB2) fits into 16 bits and Yes, you are right. Label differences could be implemented. AFAIK there is currently no activity by the Binutils folks to add respective support, so it is somewhat pointless to add that support to avr-gcc... [ pause, built an avr compiler ] So, I checked code-gen for gcc.c-torture/compile/labels-2.c and it looked fine: ldi r24,lo8(gs(.L2)) ldi r25,hi8(gs(.L2)) std Y+2,r25 std Y+1,r24 ldi r18,lo8(gs(.L3)) ldi r19,hi8(gs(.L3)) ldi r24,lo8(gs(.L4)) ldi r25,hi8(gs(.L4)) mov r20,r18 mov r21,r19 sub r20,r24 So, apparently quite a lot of the code-gen is already wired up. Since this generated code, I wasn't sure what error you're trying to hide? I tried all the test cases your mentioned, none failed to produce code or failed to assemble that code with -mmcu=atmega2560 nor any other cpu I could find, so, trivially, I must not quite understand what doesn't work yet. I did notice that gcc.c-torture/execute/pr70460.c produced: .word .L3-(.L2) which seems wrong, given all the other code-gen. I think this has to be gs(.L3)-(gs(.L2)), no? I tried to get binutils to do that for me directly with a .word and it seemed resistant; which is unfortunate. Well, the GCC specification on void* arithmetic requires that void is handled as if it has a size of one, and .L3-.L2 meets that requirement. As code addresses on avr are word addresses, not byte addresses, the above code will crash. A working expression would be (.L3-.L2)/2 but 1) this conflicts with the gcc specification 2) it does not work with stubs 3) the linker does not handle this correctly if relaxation is on. All the label-arithmetic test cases in GCC testsuite don't actually need stub generation because the binaries are not big enough. Using gs, the expression would be gs(.L3)-(gs(.L2)) as gs implements word addresses, not byte addresses. This also means the problem is not only present with 3-byte PC devises but also for smaller ones. If you consider the above to be a code-gen bug, then you can do something like: Index: config/avr/avr.c === --- config/avr/avr.c(revision 241837) +++ config/avr/avr.c(working copy) @@ -9422,6 +9422,21 @@ x = plus_constant (Pmode, x, AVR_TINY_PM_OFFSET); } + /* We generate stubs using gs(), but binutils doesn't support that + here. */ + if (AVR_3_BYTE_PC + && GET_CODE (x) == MINUS + && GET_CODE (XEXP (x, 0)) == LABEL_REF + && GET_CODE (XEXP (x, 1)) == LABEL_REF) +return false; + if (AVR_3_BYTE_PC + && GET_CODE (x) == MINUS + && GET_CODE (XEXP (x, 0)) == SUBREG + && GET_CODE (XEXP (XEXP (x, 0), 0)) == LABEL_REF + && GET_CODE (XEXP (x, 1)) == SUBREG + && GET_CODE (XEXP (XEXP (x, 1), 0)) == LABEL_REF) +return false; + return default_assemble_integer (x, size, aligned_p); } to get the compiler to error out to prevent bad code-gen that binutils can't handle. If you want, you can also declare by fiat that subtraction might be performed on the stub or the actual function, and so any code that does this isn't supported (runtime unspecified behavior), and then you just mark the execute testcase as bad (not supportable on avr, since avr blew the binutils support for gs). The compile only test cases work fine now, so nothing needs to be done for them. ICE is not a nice approach, IMO. I'd prefer some more graceful error using code locations of the labels (which are not available). If you go for the above and make it a hard error, then a patch in the style of the original patch would be fine, but please just mark the ones that fail and please key of avr*, as this is an avr-only misfeature. If avr did it right, they would complete gs() support so that gs(LAB)-gs(LAB) works. gcc has a fetish for + and - working with labels and constants. The completely different solution would be to never use gs(), and generate code that is 3 byte aware, but, I suspect you're unlikely to want to do that. That's the worst solution of all because it changes the ABI, leads to code bloat, and many more issues. And
Re: [patch,testsuite] Support dg-require-effective-target label_offsets.
On 04.11.2016 06:18, Senthil Kumar Selvaraj wrote: Georg-Johann Lay writes: State of matters is that Binutils support is missing, and if I understand you correctly, dg-require is not appropriate to factor out availability of such features? I'll take a stab at adding the missing binutils support for avr label diffs. Thanks for taking care of this. I had a look into it but got stuck with a test case derived from gcc.c-torture/execute/pr70460.c int c; __attribute__((noinline, noclone)) void call (void) { __asm ("nop"); } __attribute__((noinline, noclone)) void foo (int x) { static int b[] = { &&lab1 - &&lab0, &&lab2 - &&lab0 }; void *a = &&lab0 + b[x]; goto *a; lab1: c += 2; call(); lab2: c++; lab0: ; } int main () { foo (0); if (c != 3) __builtin_abort (); foo (1); if (c != 4) __builtin_abort (); return 0; } The problem is when relaxation is turned on and the CALL is shortened to RCALL but respective literals are not fixed up. Johann Regards Senthil
Re: [PATCH] Make direct emission of time profiler counter
On 11/04/2016 10:18 AM, Christophe Lyon wrote: > On 3 November 2016 at 16:11, Jan Hubicka wrote: >>> >>> 2016-10-31 Martin Liska >>> >>> * libgcov-profiler.c (__gcov_time_profiler): Remove. >>> (__gcov_time_profiler_atomic): Likewise. >>> >>> gcc/ChangeLog: >>> >>> 2016-10-31 Martin Liska >>> >>> * profile.c (instrument_values): Fix coding style. >>> (branch_prob): Use renamed function. >>> * tree-profile.c (init_ic_make_global_vars): Likewise. >>> (gimple_init_edge_profiler): Rename to >>> gimple_init_gcov_profiler. >>> tree_time_profiler_counter variable declaration. >>> (gimple_gen_time_profiler): Rewrite to do a direct gimple code >>> emission. >>> * value-prof.h: Remove an argument. >>> >>> gcc/testsuite/ChangeLog: >>> >>> 2016-11-03 Martin Liska >>> >>> * gcc.dg/no_profile_instrument_function-attr-1.c: Update scanned >>> output. >>> * gcc.dg/tree-prof/time-profiler-3.c: New test. >> >> OK, >> Thanks! >> Honza > > Hi, > > It seems this patch causes an ICE when compiling > gcc.dg/gomp/pr27573.c > for instance on arm-linux-gnueabi --wtih-cpu=cortex-a9 Hello. Sorry for the breakage, I'm attaching untested patch which fixes that. I'm going to trigger regression tests. Martin > > The backtrace is: > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/gomp/pr27573.c: > In function 'main._omp_fn.0': > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/gomp/pr27573.c:12:9: > internal compiler error: in convert_memory_address_addr_sp > ace_1, at explow.c:284 > 0x79b19e convert_memory_address_addr_space_1(machine_mode, rtx_def*, > unsigned char, bool, bool) > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/explow.c:284 > 0x670903 get_builtin_sync_mem > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/builtins.c:4933 > 0x671fef expand_builtin_atomic_fetch_op > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/builtins.c:5452 > 0x674aa4 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int) > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/builtins.c:6859 > 0x7b729a expand_expr_real_1(tree_node*, rtx_def*, machine_mode, > expand_modifier, rtx_def**, bool) > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:10773 > 0x7c1460 store_expr_with_bounds(tree_node*, rtx_def*, int, bool, bool, > tree_node*) > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5551 > 0x7c7451 expand_assignment(tree_node*, tree_node*, bool) > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5317 > 0x69a0de expand_call_stmt > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:2666 > 0x69b264 expand_gimple_stmt_1 > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3581 > 0x69b264 expand_gimple_stmt > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3747 > 0x69cc8a expand_gimple_basic_block > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:5754 > 0x69fefe execute > /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:6368 > Please submit a full bug report, > > Christophe > >From 531392d44eb195bd39cb49a169047f5bd898242f Mon Sep 17 00:00:00 2001 From: marxin Date: Fri, 4 Nov 2016 11:12:06 +0100 Subject: [PATCH] time_profiler: Set proper type to time_profiler_counter_ptr. gcc/ChangeLog: 2016-11-04 Martin Liska * tree-profile.c (gimple_gen_time_profiler): Set proper type to time_profiler_counter_ptr. --- gcc/tree-profile.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c index 09a702f..d18b954 100644 --- a/gcc/tree-profile.c +++ b/gcc/tree-profile.c @@ -489,8 +489,9 @@ gimple_gen_time_profiler (unsigned tag, unsigned base) /* Emit: counters[0] = ++__gcov_time_profiler_counter. */ if (flag_profile_update == PROFILE_UPDATE_ATOMIC) { - tree ptr = make_temp_ssa_name (type, NULL, "time_profiler_counter_ptr"); - tree addr = build1 (ADDR_EXPR, build_pointer_type (type), + tree ptr = make_temp_ssa_name (build_pointer_type (type), NULL, + "time_profiler_counter_ptr"); + tree addr = build1 (ADDR_EXPR, TREE_TYPE (ptr), tree_time_profiler_counter); gassign *assign = gimple_build_assign (ptr, NOP_EXPR, addr); gsi_insert_before (&gsi, assign, GSI_NEW_STMT); -- 2.10.1
Re: [PATCH] Convert SPARC to LRA
On 08/12/15 17:55, David Miller wrote: From: Sebastian Huber Date: Tue, 8 Dec 2015 11:17:53 +0100 since the LRA patch is still reverted on the trunk, I guess the switch to LRA will not happen in GCC 6? Indeed, it is unlikely I will have time to work on this for at least a month. Are there any plans to switch the SPARC GCC to LRA for GCC 7 or later? -- Sebastian Huber, embedded brains GmbH Address : Dornierstr. 4, D-82178 Puchheim, Germany Phone : +49 89 189 47 41-16 Fax : +49 89 189 47 41-09 E-Mail : sebastian.hu...@embedded-brains.de PGP : Public key available on request. Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.
Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)
On 11/04/2016 10:32 AM, Jakub Jelinek wrote: > Hi! > > On Fri, Nov 04, 2016 at 10:17:31AM +0100, Martin Liška wrote: >> diff --git a/gcc/gimplify.c b/gcc/gimplify.c >> index 813777d..86ce793 100644 >> --- a/gcc/gimplify.c >> +++ b/gcc/gimplify.c >> @@ -1678,7 +1678,9 @@ warn_switch_unreachable_r (gimple_stmt_iterator >> *gsi_p, bool *handled_ops_p, >> worse location info. */ >>if (gimple_try_eval (stmt) == NULL) >> { >> - wi->info = stmt; >> + gimple_stmt_iterator *it = XNEW (gimple_stmt_iterator); >> + memcpy (it, gsi_p, sizeof (gimple_stmt_iterator)); > > That would be cleaner as *it = *gsi_p; I know that it's kind of ugly, but as the gimple stmt is not yet added to a BB, using gsi_for_stmt ICEs: /tmp/use-after-scope-switch.c:12:5: internal compiler error: Segmentation fault switch (argc) ^~ 0xe16a14 crash_signal ../../gcc/toplev.c:338 0xadf890 bb_seq_addr ../../gcc/gimple.h:1658 0xae01dd gsi_start_bb ../../gcc/gimple-iterator.h:129 0xae111f gsi_for_stmt(gimple*) ../../gcc/gimple-iterator.c:617 > That set, I fail to see > 1) the need to use a gsi pointer in wi->info compared to stmt itself, >you can gsi_for_stmt cheaply at any time > 2) why is anything done about this in warn_switch_unreachable_r >- the problem isn't related to this warning IMHO. Even > switch (x) > { > case 1: > int x; > x = 6; > ptr = &x; > break; > case 2: > ptr = &x; > *ptr = 7; > break; > } >has the same issue and there is no switch unreachable code there, but you >still want for -fsanitize-use-after-scope pretend it is actually: You're right, that's not handled. I'm wondering whether it's always profitable because when you do not reach the case 1, you're not doing a poisoning operation? Martin > x_tmp = x; > { > int x; > switch (x_tmp) > { > case 1: > x = 6; > ptr = &x; > break; > case 2: > ptr = &x; > *ptr = 7; > break; > } > } > and put ASAN_MARK unpoisoning before GIMPLE_SWITCH. > > Jakub >
Re: [PATCH] Generate reproducible output independently of the build-path
Mike Stump: > On Nov 3, 2016, at 1:01 PM, Ximin Luo wrote: >> Log snippets attached. > > Ick. You missed the utility of contrib/compare_tests. :-) > > If you say: > > contrib/compare_tests oldbuilddir newbuilddir > > You can then sit back and see everything as you'd expect and want. The > output is priority sorted and usually around 0 lines line or so. > Oh, thanks! Here is the output: ~/reproducible/gcc-7-20161030$ contrib/compare_tests ../gcc-build-0/ ../gcc-build-1/ # Comparing directories ## Dir1=../gcc-build-0/: 8 sum files ## Dir2=../gcc-build-1/: 8 sum files # Comparing 8 common sum files ## /bin/sh contrib/compare_tests /tmp/gxx-sum1.9339 /tmp/gxx-sum2.9339 New tests that PASS: gcc.dg/cpp/source_date_epoch-4.c (test for excess errors) gcc.dg/cpp/source_date_epoch-4.c execution test gcc.dg/cpp/source_date_epoch-5.c (test for excess errors) gcc.dg/cpp/source_date_epoch-5.c execution test gcc.dg/cpp/source_prefix_map-1.c (test for excess errors) gcc.dg/cpp/source_prefix_map-1.c execution test gcc.dg/cpp/source_prefix_map-2.c (test for excess errors) gcc.dg/cpp/source_prefix_map-2.c execution test gcc.dg/debug/dwarf2/source_prefix_map-1.c (test for excess errors) gcc.dg/debug/dwarf2/source_prefix_map-1.c scan-assembler-dem DW_AT_comp_dir: "DWARF2TEST/gcc gcc.dg/debug/dwarf2/source_prefix_map-2.c (test for excess errors) gcc.dg/debug/dwarf2/source_prefix_map-2.c scan-assembler DW_AT_comp_dir: "/ # No differences found in 8 common sum files -- GPG: ed25519/56034877E1F87C35 GPG: rsa4096/1318EFAC5FBBDBCE https://github.com/infinity0/pubkeys.git
Re: [RFC] Check number of uses in simplify_cond_using_ranges().
On Fri, Nov 04, 2016 at 09:47:26AM +0100, Richard Biener wrote: > On Thu, Nov 3, 2016 at 4:03 PM, Dominik Vogt wrote: > > Is VRP the right pass to do this optimisation or should a later > > pass rather attempt to eliminate the new use of b_5 instead? Uli > > has brought up the idea a mini "sign extend elimination" pass that > > checks if the result of a sign extend could be replaced by the > > original quantity in all places, and if so, eliminate the ssa > > name. (I guess that won't help with the above code because l is > > used also as a function argument.) How could a sensible approach > > to deal with the situation look like? > > We run into this kind of situation regularly and for general foldings > in match.pd we settled with single_use () even though it is not perfect. > Note the usual complaint is not extra extension instructions but > the increase of register pressure. > > This is because it is hard to do better when you are doing local > optimization. > > As for the question on whether VRP is the right pass to do this the > answer is two-fold -- VRP has the most precise range information. > But the folding itself should be moved to generic code and use > get_range_info (). All right, is this a sensible approach then? 1. Using has_single_use as in the experimental patch is good enough (provided further testing does not show serious regressions). 2. Rip the whole top level if-block from simplify_cond_using_ranges(). 3. Translate that code to match.pd syntax. Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany
RE: [PATCH] [ARC] New option handling, refurbish multilib support.
Andrew, > you create the TargetVariable arc_mpy_option, however, I think it > would be neater to fold this variable into the mmpy-option as it was > before, but, changing the type to Enum. This would allow the big > option checking switch to be removed from arc-common.c, which I think > is a win overall. > > I wasn't 100% certain that the above would actually be possible, so I > put together a prototype, I've included the patch below, that applies > on top of your patch. Hopefully you'll agree that it's a nice clean > up. Thanks for the suggestion. > I wonder where the vale 0x1000 comes from, what's the significance > of it. I could ask the same question about the magic -1 constant, but > it's rather more obvious that -1 is just a no-value-selected magic > number. I guess my questions for 0x1000 are why this specific > value? What does it mean? I need a value to mark if the variable in question is changed by a command line option or not. This is required when we set the cpu's specific configuration. I'll include this comment in the description of the define for clarity. > and also the driver-arc.c file. You seem to be missing support for > nps400 in here. Specifically, if I pass -mcpu=nps400 to GCC I'd > expect the generated assembler file to include ".cpu NPS400", and the > assembler to be driven with "-mcpu=nps400". This patch was crafted way before the new ARC binutils to be upstreamed. Hence, it needed to work with the "old" binutils which had no support for nps400 :) > Admittedly we're missing a GCC test for this (there's a patch below > for just such a new test). > > I think the solution could be fairly easy, if we tracked the specific > processor type in arc_cpu_t structure we could specialise in > arc_override_options and in driver-arc.c, though I don't know if you'd > agree that this is the right approach... I'm not entirely sure > myself... but it might be the easiest approach to move us forward. > Anyway, I include a patch for that below too, feel free to use or not > as you see fit. I am preparing a patch to binutils which should be able to recognize all GCC's cpu variations as valid .cpu pseudo-op argument. Hence, we can just pass to the assembler the appropriate .cpu variation seamlessly. If it is ok with you, I would like to come with a new patch on this topic after I extend .cpu support in binutils. Alternatively, I can include your suggestion now, but then I will refurbish it later on. Please let me know what do u fancy most. Thanks, Claudiu
[Patch, fortran] PR64933 - ASSOCIATE on a character variable does not allow substring expressions
Dear All, The associate construct does not readily permit the identification of the associate name as an array during parsing. However, this can be done whilst matching rvalues within the associate block. This patch extends this identification in gfc_match_varspec, whilst excluding character types from the present test. This latter change prevents scalar substrings from being pegged as array references, which was the origin of this bug. Bootstraps and regtests on FC21/x86_64. OK for trunk? Paul 2016-04-19 Paul Thomas PR fortran/64933 * primary.c (gfc_match_varspec): If selector expression is unambiguously an array, make sure that the associate name is an array and has an array spec. Modify the original condition for doing this to exclude character types. 2016-04-19 Paul Thomas PR fortran/64933 * gfortran.dg/associate_23.f90: New test. -- The difference between genius and stupidity is; genius has its limits. Albert Einstein Index: gcc/fortran/primary.c === *** gcc/fortran/primary.c (revision 241831) --- gcc/fortran/primary.c (working copy) *** gfc_match_varspec (gfc_expr *primary, in *** 1931,1945 } /* For associate names, we may not yet know whether they are arrays or not. ! Thus if we have one and parentheses follow, we have to assume that it ! actually is one for now. The final decision will be made at ! resolution time, of course. */ ! if (sym->assoc && gfc_peek_ascii_char () == '(' ! && !(sym->assoc->dangling && sym->assoc->st && sym->assoc->st->n.sym ! && sym->assoc->st->n.sym->attr.dimension == 0) ! && sym->ts.type != BT_CLASS) sym->attr.dimension = 1; if ((equiv_flag && gfc_peek_ascii_char () == '(') || gfc_peek_ascii_char () == '[' || sym->attr.codimension --- 1931,1966 } /* For associate names, we may not yet know whether they are arrays or not. ! If the selector expression is unambiguously an array; eg. a full array ! or an array section, then the associate name must be an array and we can ! fix it now. Otherwise, if parentheses follow and it is not a character ! type, we have to assume that it actually is one for now. The final ! decision will be made at resolution, of course. */ ! if (sym->assoc ! && gfc_peek_ascii_char () == '(' ! && sym->ts.type != BT_CLASS ! && !sym->attr.dimension) ! { ! if ((!sym->assoc->dangling ! && sym->assoc->target ! && sym->assoc->target->ref ! && sym->assoc->target->ref->type == REF_ARRAY ! && (sym->assoc->target->ref->u.ar.type == AR_FULL ! || sym->assoc->target->ref->u.ar.type == AR_SECTION)) ! || ! (!(sym->assoc->dangling || sym->ts.type == BT_CHARACTER) ! && sym->assoc->st && sym->assoc->st->n.sym ! && sym->assoc->st->n.sym->attr.dimension == 0)) ! { sym->attr.dimension = 1; + if (sym->as == NULL && sym->assoc + && sym->assoc->st + && sym->assoc->st->n.sym + && sym->assoc->st->n.sym->as) + sym->as = gfc_copy_array_spec (sym->assoc->st->n.sym->as); + } + } if ((equiv_flag && gfc_peek_ascii_char () == '(') || gfc_peek_ascii_char () == '[' || sym->attr.codimension Index: gcc/testsuite/gfortran.dg/associate_23.f90 === *** gcc/testsuite/gfortran.dg/associate_23.f90 (revision 0) --- gcc/testsuite/gfortran.dg/associate_23.f90 (working copy) *** *** 0 --- 1,36 + ! { dg-do run } + ! + ! Tests the fix for PR64933 + ! + ! Contributed by Olivier Marsden + ! + program test_this + implicit none + character(len = 15) :: char_var, char_var_dim (3) + character(len = 80) :: buffer + + ! Original failing case reported in PR + ASSOCIATE(should_work=>char_var) + should_work = "test succesful" + write (buffer, *) should_work(5:14) + END ASSOCIATE + + if (trim (buffer) .ne. " succesful") call abort + + ! Found to be failing during debugging + ASSOCIATE(should_work=>char_var_dim) + should_work = ["test SUCCESFUL", "test_SUCCESFUL", "test.SUCCESFUL"] + write (buffer, *) should_work(:)(5:14) + END ASSOCIATE + + if (trim (buffer) .ne. " SUCCESFUL_SUCCESFUL.SUCCESFUL") call abort + + ! Found to be failing during debugging + ASSOCIATE(should_work=>char_var_dim(1:2)) + should_work = ["test SUCCESFUL", "test_SUCCESFUL", "test.SUCCESFUL"] + write (buffer, *) should_work(:)(5:14) + END ASSOCIATE + + if (trim (buffer) .ne. " SUCCESFUL_SUCCESFUL") call abort + + end program
Re: [PATCH] Make direct emission of time profiler counter
On 11/04/2016 11:14 AM, Martin Liška wrote: > Hello. > > Sorry for the breakage, I'm attaching untested patch which fixes that. > I'm going to trigger regression tests. > > Martin Regression tests have finished on ppc64le-redhat-linux. Ready to be installed? Martin
Re: [PATCH] Convert SPARC to LRA
> Are there any plans to switch the SPARC GCC to LRA for GCC 7 or later? There are plans to switch SPARC to LRA at some point, but no ETA. -- Eric Botcazou
RE: [PATCH] [ARC] Various small miscellaneous fixes.
> All the rest looks good. > Committed with the suggested changes. Thank you for your review, Claudiu
[PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)
Hi! One of the common reasons why nonoverlapping_memrefs_p is called with MEM_EXPRs that are FUNCTION_DECLs and don't have DECL_RTL set is that DSE considers all calls to be memory reads from what is being called. The following patch avoids that. Of course the question is if we want to support something like: char var[64]; ... var[0] = '0xe8'; var[1] = ...; (void (*fn) (void) = (void (*) (void)) &var[0]; fn (); without any kind of barrier, or not. If we do, then the patch below might break it by DSEing the stores. So, do we want the patch as is, or just limit it to direct function calls (call (mem (symbol_ref))), or only those where the symbol is actually a FUNCTION_DECL? Or do we want to consider arbitrary pointers possibly writing into function function text? In any case, the second hunk fixes an important DSE bug that just got revealed by the former change. 2016-11-04 Jakub Jelinek PR target/77834 * dse.c (check_mem_read_use): For CALL with MEM as first operand, ignore that MEM, but recurse on the MEM's operand if not SYMBOL_REF. (dse_step5): Call scan_reads even if just insn_info->frame_read. Improve and fix dump file messages. --- gcc/dse.c.jj2016-11-03 15:52:12.521965058 +0100 +++ gcc/dse.c 2016-11-04 09:42:27.640098125 +0100 @@ -2159,6 +2159,19 @@ check_mem_read_use (rtx *loc, void *data rtx *loc = *iter; if (MEM_P (*loc)) check_mem_read_rtx (loc, (bb_info_t) data); + else if (GET_CODE (*loc) == CALL && MEM_P (XEXP (*loc, 0))) + { + /* Ignore the MEM in first operand of CALL, that isn't +any kind of read of the function text memory. */ + iter.skip_subrtxes (); + /* But for indirect function calls consider the MEM +containing the function pointer if any. Avoid recursing +in the most common case. */ + if (GET_CODE (XEXP (XEXP (*loc, 0), 0)) != SYMBOL_REF) + check_mem_read_use (&XEXP (XEXP (*loc, 0), 0), data); + if (!CONST_INT_P (XEXP (*loc, 1))) + check_mem_read_use (&XEXP (*loc, 1), data); + } } } @@ -3298,12 +3311,19 @@ dse_step5 (void) bitmap_clear (v); } else if (insn_info->read_rec - || insn_info->non_frame_wild_read) + || insn_info->non_frame_wild_read + || insn_info->frame_read) { - if (dump_file && !insn_info->non_frame_wild_read) - fprintf (dump_file, "regular read\n"); - else if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "non-frame wild read\n"); + if (dump_file && (dump_flags & TDF_DETAILS)) + { + if (!insn_info->non_frame_wild_read + && !insn_info->frame_read) + fprintf (dump_file, "regular read\n"); + if (insn_info->non_frame_wild_read) + fprintf (dump_file, "non-frame wild read\n"); + if (insn_info->frame_read) + fprintf (dump_file, "frame read\n"); + } scan_reads (insn_info, v, NULL); } } Jakub
[PATCH] Fix sched-deps not to consider calls as reads from function's body (PR target/77834)
Hi! Similarly to the previous patch, this one is the second spot where nonoverlapping_memrefs_p is called with FUNCTION_DECLs without DECL_RTL. With both patches in the only case this happens in bootstrap/regtest is the pr77834.c testcase. Same questions as for the DSE patch apply. 2016-11-04 Jakub Jelinek PR target/77834 * sched-deps.c (sched_analyze_2): Don't recurse in CALL first argument if it is a MEM, just on its address. --- gcc/sched-deps.c.jj 2016-10-31 13:28:12.0 +0100 +++ gcc/sched-deps.c2016-11-04 09:59:28.192249487 +0100 @@ -2808,6 +2808,19 @@ sched_analyze_2 (struct deps_desc *deps, return; +case CALL: + /* Don't consider MEM in CALL's first argument + as a memory read, that is the first byte of + the function to be called. Only look at its + address. */ + if (MEM_P (XEXP (x, 0))) + { + sched_analyze_2 (deps, XEXP (XEXP (x, 0), 0), insn); + sched_analyze_2 (deps, XEXP (x, 1), insn); + return; + } + break; + default: break; } Jakub
Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)
On 11/04/2016 01:26 PM, Jakub Jelinek wrote: The following patch avoids that. Of course the question is if we want to support something like: char var[64]; ... var[0] = '0xe8'; var[1] = ...; (void (*fn) (void) = (void (*) (void)) &var[0]; fn (); without any kind of barrier, or not. I was going to say no, but trampolines come to mind as a situation where we might come close to the above. Bernd
Simplify X /[ex] 8 == 0
Hello, this kind of simplification is already handled by fold_comparison, but the code is common with TRUNC_DIV_EXPR, which yields suboptimal code. In particular, for an unsigned number, X/8==0 means x<=7, while X/[ex]8==0 can be turned into X==0. When we have an explicit division by 0, there is a better optimisation possible (insert __builtin_unreachable() or __builtin_trap() after that statement, as in find_explicit_erroneous_behavior), so I don't touch it. For the overflow inequality case, it would have been a bit clearer to generate (cmp { build_zero_cst (TREE_TYPE (@0)); } @2) and let that be constant folded instead of having that ugly and error-prone test in constant_boolean_node, but I saved one tree ;-) Bootstrap+regtest on powerpc64le-unknown-linux-gnu, all the regressions are in the libitm part of the testsuite, they should be fixed by https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html , I'll rerun the testsuite when that patch is in. 2016-11-07 Marc Glisse gcc/ * fold-const.c (fold_comparison): Ignore EXACT_DIV_EXPR. * match.pd (A /[ex] B CMP C): New simplifications. gcc/testsuite/ * gcc.dg/tree-ssa/cmpexactdiv.c: New file. -- Marc GlisseIndex: gcc/fold-const.c === --- gcc/fold-const.c (revision 241840) +++ gcc/fold-const.c (working copy) @@ -8740,22 +8740,21 @@ fold_comparison (location_t loc, enum tr SET_EXPR_LOCATION (tem, loc); return tem; } return fold_build2_loc (loc, code, type, cval1, cval2); } } } /* We can fold X/C1 op C2 where C1 and C2 are integer constants into a single range test. */ - if ((TREE_CODE (arg0) == TRUNC_DIV_EXPR - || TREE_CODE (arg0) == EXACT_DIV_EXPR) + if (TREE_CODE (arg0) == TRUNC_DIV_EXPR && TREE_CODE (arg1) == INTEGER_CST && TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST && !integer_zerop (TREE_OPERAND (arg0, 1)) && !TREE_OVERFLOW (TREE_OPERAND (arg0, 1)) && !TREE_OVERFLOW (arg1)) { tem = fold_div_compare (loc, code, type, arg0, arg1); if (tem != NULL_TREE) return tem; } Index: gcc/match.pd === --- gcc/match.pd (revision 241840) +++ gcc/match.pd (working copy) @@ -2314,20 +2314,50 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (ne @0 { build_real (TREE_TYPE (@0), c2); } /* sqrt(x) < c is the same as x < c*c, if we ignore NaNs. */ (if (! HONOR_NANS (@0)) (cmp @0 { build_real (TREE_TYPE (@0), c2); }) /* sqrt(x) < c is the same as x >= 0 && x < c*c. */ (if (GENERIC) (truth_andif (ge @0 { build_real (TREE_TYPE (@0), dconst0); }) (cmp @0 { build_real (TREE_TYPE (@0), c2); } +/* Fold A /[ex] B CMP C to A CMP B * C. */ +(for cmp (eq ne) + (simplify + (cmp (exact_div @0 @1) INTEGER_CST@2) + (if (!integer_zerop (@1)) + (if (wi::eq_p (@2, 0)) +(cmp @0 @2) +(if (TREE_CODE (@1) == INTEGER_CST) + (with + { + bool ovf; + wide_int prod = wi::mul (@2, @1, TYPE_SIGN (TREE_TYPE (@1)), &ovf); + } + (if (ovf) + { constant_boolean_node (cmp == NE_EXPR, type); } + (cmp @0 { wide_int_to_tree (TREE_TYPE (@0), prod); } +(for cmp (lt le gt ge) + (simplify + (cmp (exact_div @0 INTEGER_CST@1) INTEGER_CST@2) + (if (wi::gt_p (@1, 0, TYPE_SIGN (TREE_TYPE (@1 + (with +{ + bool ovf; + wide_int prod = wi::mul (@2, @1, TYPE_SIGN (TREE_TYPE (@1)), &ovf); +} +(if (ovf) + { constant_boolean_node (wi::lt_p (@2, 0, TYPE_SIGN (TREE_TYPE (@2))) + != (cmp == LT_EXPR || cmp == LE_EXPR), type); } + (cmp @0 { wide_int_to_tree (TREE_TYPE (@0), prod); })) + /* Unordered tests if either argument is a NaN. */ (simplify (bit_ior (unordered @0 @0) (unordered @1 @1)) (if (types_match (@0, @1)) (unordered @0 @1))) (simplify (bit_and (ordered @0 @0) (ordered @1 @1)) (if (types_match (@0, @1)) (ordered @0 @1))) (simplify Index: gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv.c === --- gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv.c (working copy) @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-optimized" } */ + +int f(int *p, int *q){ + __SIZE_TYPE__ n = q - p; + return n == 0; +} + +int g(int *p, int *q){ + __PTRDIFF_TYPE__ n = q - p; + return n <= 2; +} + +int h(long *p, long *q){ + __SIZE_TYPE__ n = q - p; + return n == (__SIZE_TYPE__)(-1)/2; +} + +/* { dg-final { scan-tree-dump-not "== 0" "optimized" } } */ +/* { dg-final { scan-tree-dump "<= 8" "optimized" { target { int32 } } } } */ +/* { dg-final { scan-tree-dump "return 0" "optimized" } } */
Re: [PATCH] Fix wrong declarations of builtin functions
On 11/03/2016 10:52 PM, Bernd Edlinger wrote: I thought in preparation of the new C++ warning about wrong declarations of builtin functions it would be good to fix some of the more obvious bugs in the testsuite first. Bootstrapped and reg-tested on x86_64-pc-linux-gnu. Ok. Bernd
Re: [PATCH GCC]Clean pedantic calls and useless lvalue code in fold_cond_expr_with_comparison
On Fri, Nov 4, 2016 at 4:40 AM, Richard Biener wrote: > On Thu, Nov 3, 2016 at 3:11 PM, Bin Cheng wrote: >> Hi, >> According to analysis given by >> https://gcc.gnu.org/ml/gcc/2016-10/msg00228.html, calls to >> pedantic_non_lvalue_loc and code handling lvalue in >> fold_cond_expr_with_comparison are useless now. Given this is complicated >> legacy code, it may be better to change code step by step, rather than doing >> this cleanup together with moving simplification from >> fold_cond_expr_with_comparison to match.pd. >> BTW, after last cleanup of pedantic_lvalues, function >> pedantic_non_lvalue_loc now has nothing to do with lvalue. It could be >> further cleaned up, or at least renamed into something else. This patch >> doesn't do that because that depends on the answer to the question of the >> aforementioned message. >> >> Bootstrap and test on x86_64 and AArch64. Any comments? > > Ok. > > Note removal of [pedantic_]non_lvalue can at most result in accepting > invalid code where we might not have any testsuite coverage. For the 2nd > case with > /* Avoid adding NOP_EXPRs in case this is an lvalue. */ and C++ lvalue ?: > I'm not sure we have testsuite coverage given Jason failed to add a testcase > when adding the code in r34416. Now that we're delaying folding in C++, it shouldn't matter whether fold is lvalue-safe. Jason
C++ PATCH for c++/78198 (inherited template ctor with default arg)
Default arguments of an inherited ctor remain in the context of the inherited function, not the artificial inheriting constructor. Tested x86_64-pc-linux-gnu, applying to trunk. commit d6ceff13167956b660317ef37a40ba7110d325c4 Author: Jason Merrill Date: Thu Nov 3 16:14:51 2016 -0400 PR c++/78198 - inherited template ctor with default arg * call.c (convert_default_arg): Look through inheriting ctors. diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 27aa7fd..d2e99bc 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -7193,6 +7193,9 @@ convert_default_arg (tree type, tree arg, tree fn, int parmnum, /* See through clones. */ fn = DECL_ORIGIN (fn); + /* And inheriting ctors. */ + if (flag_new_inheriting_ctors) +fn = strip_inheriting_ctors (fn); /* Detect recursion. */ FOR_EACH_VEC_SAFE_ELT (default_arg_context, i, t) diff --git a/gcc/testsuite/g++.dg/cpp0x/inh-ctor22.C b/gcc/testsuite/g++.dg/cpp0x/inh-ctor22.C new file mode 100644 index 000..1b0e242 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/inh-ctor22.C @@ -0,0 +1,16 @@ +// { dg-do compile { target c++11 } } + +class A { }; +template using UniquePtr = int; +template struct BufferList { + BufferList(unsigned, unsigned, unsigned, AllocPolicy = AllocPolicy()); +}; +class D : BufferList { + using BufferList::BufferList; +}; +template UniquePtr MakeUnique(Args... aArgs) +{ + D d(aArgs...); + return 0; +} +UniquePtr setCloneBuffer_impl_buf = MakeUnique(0, 0, 0);
Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)
On Fri, 4 Nov 2016, Jakub Jelinek wrote: > Hi! > > One of the common reasons why nonoverlapping_memrefs_p is called with > MEM_EXPRs that are FUNCTION_DECLs and don't have DECL_RTL set is that > DSE considers all calls to be memory reads from what is being called. > > The following patch avoids that. Of course the question is if we want > to support something like: > char var[64]; > ... > var[0] = '0xe8'; > var[1] = ...; > (void (*fn) (void) = (void (*) (void)) &var[0]; > fn (); > without any kind of barrier, or not. > If we do, then the patch below might break it by DSEing the stores. > So, do we want the patch as is, or just limit it to direct function calls > (call (mem (symbol_ref))), or only those where the symbol is actually > a FUNCTION_DECL? Or do we want to consider arbitrary pointers possibly > writing into function function text? I'm not sure the current code handled this correctly? At least I see nothing on the GIMPLE level that would disallow the DSE: int foo () { char a[256]; a[0] = 'a'; a[1] = 'b'; int __attribute__((const)) (*fn) (void) = (int (*) (void)) &a[0]; return fn (); } even with pure or regular fn this gets DSEd, the call is appearantly not an escape point for &a. That is, for DSE it would only matter for locals and/or const function calls (if stores after the call make the earlier appear dead). We did have bugs like PR70128 fixed though (but that was "global memory"). It might be still broken if the patched function is const and we unpatch right after calling... So to answer, yes, I think we want to treat this conservatively (but we may have existing bugs here). > In any case, the second hunk fixes an important DSE bug that just got > revealed by the former change. Do you have a testcase for the (wrong-code?) bug? > 2016-11-04 Jakub Jelinek > > PR target/77834 > * dse.c (check_mem_read_use): For CALL with MEM as first operand, > ignore that MEM, but recurse on the MEM's operand if not SYMBOL_REF. > (dse_step5): Call scan_reads even if just insn_info->frame_read. > Improve and fix dump file messages. > > --- gcc/dse.c.jj 2016-11-03 15:52:12.521965058 +0100 > +++ gcc/dse.c 2016-11-04 09:42:27.640098125 +0100 > @@ -2159,6 +2159,19 @@ check_mem_read_use (rtx *loc, void *data >rtx *loc = *iter; >if (MEM_P (*loc)) > check_mem_read_rtx (loc, (bb_info_t) data); > + else if (GET_CODE (*loc) == CALL && MEM_P (XEXP (*loc, 0))) > + { > + /* Ignore the MEM in first operand of CALL, that isn't > + any kind of read of the function text memory. */ > + iter.skip_subrtxes (); > + /* But for indirect function calls consider the MEM > + containing the function pointer if any. Avoid recursing > + in the most common case. */ > + if (GET_CODE (XEXP (XEXP (*loc, 0), 0)) != SYMBOL_REF) > + check_mem_read_use (&XEXP (XEXP (*loc, 0), 0), data); > + if (!CONST_INT_P (XEXP (*loc, 1))) > + check_mem_read_use (&XEXP (*loc, 1), data); > + } > } > } > > @@ -3298,12 +3311,19 @@ dse_step5 (void) > bitmap_clear (v); > } > else if (insn_info->read_rec > - || insn_info->non_frame_wild_read) > +|| insn_info->non_frame_wild_read > +|| insn_info->frame_read) > { > - if (dump_file && !insn_info->non_frame_wild_read) > - fprintf (dump_file, "regular read\n"); > - else if (dump_file && (dump_flags & TDF_DETAILS)) > - fprintf (dump_file, "non-frame wild read\n"); > + if (dump_file && (dump_flags & TDF_DETAILS)) > + { > + if (!insn_info->non_frame_wild_read > + && !insn_info->frame_read) > + fprintf (dump_file, "regular read\n"); > + if (insn_info->non_frame_wild_read) > + fprintf (dump_file, "non-frame wild read\n"); > + if (insn_info->frame_read) > + fprintf (dump_file, "frame read\n"); > + } > scan_reads (insn_info, v, NULL); > } > } > > Jakub > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)
On Fri, Nov 04, 2016 at 01:33:54PM +0100, Bernd Schmidt wrote: > On 11/04/2016 01:26 PM, Jakub Jelinek wrote: > > > >The following patch avoids that. Of course the question is if we want > >to support something like: > >char var[64]; > >... > >var[0] = '0xe8'; > >var[1] = ...; > >(void (*fn) (void) = (void (*) (void)) &var[0]; > >fn (); > >without any kind of barrier, or not. > > I was going to say no, but trampolines come to mind as a situation where we > might come close to the above. So then (untested so far) the following instead? Indirect function calls will then be as non-efficient as in the past for DSE and scheduling, but for direct calls, whether the target has a direct call or only indirect call instruction, should be handled more efficiently. 2016-11-04 Jakub Jelinek PR target/77834 * dse.c (check_mem_read_use): For CALL with MEM as first operand, ignore that MEM, but recurse on the MEM's operand if not SYMBOL_REF. (dse_step5): Call scan_reads even if just insn_info->frame_read. Improve and fix dump file messages. --- gcc/dse.c.jj2016-11-03 15:52:12.521965058 +0100 +++ gcc/dse.c 2016-11-04 09:42:27.640098125 +0100 @@ -2159,6 +2159,19 @@ check_mem_read_use (rtx *loc, void *data rtx *loc = *iter; if (MEM_P (*loc)) check_mem_read_rtx (loc, (bb_info_t) data); + else if (GET_CODE (*loc) == CALL + && MEM_P (XEXP (*loc, 0)) + && MEM_EXPR (XEXP (*loc, 0)) + && TREE_CODE (MEM_EXPR (XEXP (*loc, 0))) == FUNCTION_DECL) + { + /* Ignore the MEM in first operand of CALL if it is +a direct call. */ + iter.skip_subrtxes (); + if (GET_CODE (XEXP (XEXP (*loc, 0), 0)) != SYMBOL_REF) + check_mem_read_use (&XEXP (XEXP (*loc, 0), 0), data); + if (!CONST_INT_P (XEXP (*loc, 1))) + check_mem_read_use (&XEXP (*loc, 1), data); + } } } @@ -3298,12 +3311,19 @@ dse_step5 (void) bitmap_clear (v); } else if (insn_info->read_rec - || insn_info->non_frame_wild_read) + || insn_info->non_frame_wild_read + || insn_info->frame_read) { - if (dump_file && !insn_info->non_frame_wild_read) - fprintf (dump_file, "regular read\n"); - else if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "non-frame wild read\n"); + if (dump_file && (dump_flags & TDF_DETAILS)) + { + if (!insn_info->non_frame_wild_read + && !insn_info->frame_read) + fprintf (dump_file, "regular read\n"); + if (insn_info->non_frame_wild_read) + fprintf (dump_file, "non-frame wild read\n"); + if (insn_info->frame_read) + fprintf (dump_file, "frame read\n"); + } scan_reads (insn_info, v, NULL); } } Jakub
Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)
On Fri, Nov 04, 2016 at 01:47:36PM +0100, Richard Biener wrote: > I'm not sure the current code handled this correctly? At least I > see nothing on the GIMPLE level that would disallow the DSE: > > int foo () > { > char a[256]; > a[0] = 'a'; > a[1] = 'b'; > int __attribute__((const)) (*fn) (void) = (int (*) (void)) &a[0]; > return fn (); > } > > even with pure or regular fn this gets DSEd, the call is appearantly > not an escape point for &a. > > That is, for DSE it would only matter for locals and/or const > function calls (if stores after the call make the earlier appear dead). > > We did have bugs like PR70128 fixed though (but that was "global memory"). > It might be still broken if the patched function is const and we > unpatch right after calling... > > So to answer, yes, I think we want to treat this conservatively > (but we may have existing bugs here). So like the variant patch I've just posted? > > In any case, the second hunk fixes an important DSE bug that just got > > revealed by the former change. > > Do you have a testcase for the (wrong-code?) bug? FAIL: gcc.dg/torture/pr67690.c -Os execution test that appeared on i686-linux with the first hunk without the second hunk. Jakub
Re: [RFC] Check number of uses in simplify_cond_using_ranges().
On Fri, Nov 4, 2016 at 12:08 PM, Dominik Vogt wrote: > On Fri, Nov 04, 2016 at 09:47:26AM +0100, Richard Biener wrote: >> On Thu, Nov 3, 2016 at 4:03 PM, Dominik Vogt wrote: >> > Is VRP the right pass to do this optimisation or should a later >> > pass rather attempt to eliminate the new use of b_5 instead? Uli >> > has brought up the idea a mini "sign extend elimination" pass that >> > checks if the result of a sign extend could be replaced by the >> > original quantity in all places, and if so, eliminate the ssa >> > name. (I guess that won't help with the above code because l is >> > used also as a function argument.) How could a sensible approach >> > to deal with the situation look like? >> >> We run into this kind of situation regularly and for general foldings >> in match.pd we settled with single_use () even though it is not perfect. >> Note the usual complaint is not extra extension instructions but >> the increase of register pressure. >> >> This is because it is hard to do better when you are doing local >> optimization. >> >> As for the question on whether VRP is the right pass to do this the >> answer is two-fold -- VRP has the most precise range information. >> But the folding itself should be moved to generic code and use >> get_range_info (). > > All right, is this a sensible approach then? Yes. > 1. Using has_single_use as in the experimental patch is good > enough (provided further testing does not show serious > regressions). I'd approve that, yes. > 2. Rip the whole top level if-block from simplify_cond_using_ranges(). > 3. Translate that code to match.pd syntax. Might be some work but yes, that's also desired (you'd lose the ability to emit the warnings though). Richard. > > Ciao > > Dominik ^_^ ^_^ > > -- > > Dominik Vogt > IBM Germany >
Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)
On Fri, 4 Nov 2016, Jakub Jelinek wrote: > On Fri, Nov 04, 2016 at 01:47:36PM +0100, Richard Biener wrote: > > I'm not sure the current code handled this correctly? At least I > > see nothing on the GIMPLE level that would disallow the DSE: > > > > int foo () > > { > > char a[256]; > > a[0] = 'a'; > > a[1] = 'b'; > > int __attribute__((const)) (*fn) (void) = (int (*) (void)) &a[0]; > > return fn (); > > } > > > > even with pure or regular fn this gets DSEd, the call is appearantly > > not an escape point for &a. > > > > That is, for DSE it would only matter for locals and/or const > > function calls (if stores after the call make the earlier appear dead). > > > > We did have bugs like PR70128 fixed though (but that was "global memory"). > > It might be still broken if the patched function is const and we > > unpatch right after calling... > > > > So to answer, yes, I think we want to treat this conservatively > > (but we may have existing bugs here). > > So like the variant patch I've just posted? That doesn't handle int __attribute__((const,noinline)) foo () { return 1; } int bar() { *((int *)foo) + 4 = 2; int ret = foo (); *((int *)foo) + 4 = 1; return ret; } right? patching foo to return 2, calling foo and then unpatching it? > > > In any case, the second hunk fixes an important DSE bug that just got > > > revealed by the former change. > > > > Do you have a testcase for the (wrong-code?) bug? > > FAIL: gcc.dg/torture/pr67690.c -Os execution test > > that appeared on i686-linux with the first hunk without the second hunk. Ah ok. Richard. > Jakub > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)
On Fri, Nov 04, 2016 at 01:59:05PM +0100, Richard Biener wrote: > > So like the variant patch I've just posted? > > That doesn't handle > > int __attribute__((const,noinline)) > foo () > { > return 1; > } > > int bar() > { > *((int *)foo) + 4 = 2; > int ret = foo (); > *((int *)foo) + 4 = 1; > return ret; > } > > right? patching foo to return 2, calling foo and then unpatching it? If you mean *((int *)foo + 4) = rather than *((int *)foo) + 4, then indeed it doesn't. But GIMPLE optimizers remove that far before. And I'm not sure if we want to pessimize code just for such hypothetical cases; users can always add explicit optimization barriers; or maybe just have a flag whether function text is writable and disable that flag by default on non-bare metal? Unless the function is in a writable section, if it isn't the kernel or embedded target, most likely the function body won't be writable anyway. Note I'm not sure that even without the patch we'd handle it at the DSE/sched time - the thing is that the call's argument is a mem:QI, so effectively represents just the first byte of the function. So if you stored to the first byte of the function, it might handle it "right", but if you store 4 bytes at offset 4, nonoverlapping_memrefs_p might still figure out that it is comparing a offset 4 4 byte memory with offset 0 1 byte memory and tell they don't overlap. Jakub
Re: Simplify X /[ex] 8 == 0
On Fri, Nov 4, 2016 at 1:34 PM, Marc Glisse wrote: > Hello, > > this kind of simplification is already handled by fold_comparison, but the > code is common with TRUNC_DIV_EXPR, which yields suboptimal code. In > particular, for an unsigned number, X/8==0 means x<=7, while X/[ex]8==0 can > be turned into X==0. > > When we have an explicit division by 0, there is a better optimisation > possible (insert __builtin_unreachable() or __builtin_trap() after that > statement, as in find_explicit_erroneous_behavior), so I don't touch it. > > For the overflow inequality case, it would have been a bit clearer to > generate > (cmp { build_zero_cst (TREE_TYPE (@0)); } @2) > and let that be constant folded instead of having that ugly and error-prone > test in constant_boolean_node, but I saved one tree ;-) > > Bootstrap+regtest on powerpc64le-unknown-linux-gnu, all the regressions are > in the libitm part of the testsuite, they should be fixed by > https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html , I'll rerun the > testsuite when that patch is in. Ok. You fail to handle A /[ex] -2 < 2? Is that on purpose? Or just lazy so you dont' have to handle inverting the comparison? Thanks, Richard. > 2016-11-07 Marc Glisse > > gcc/ > * fold-const.c (fold_comparison): Ignore EXACT_DIV_EXPR. > * match.pd (A /[ex] B CMP C): New simplifications. > > gcc/testsuite/ > * gcc.dg/tree-ssa/cmpexactdiv.c: New file. > > -- > Marc Glisse
Re: Simplify X /[ex] 8 == 0
On Fri, Nov 4, 2016 at 2:15 PM, Richard Biener wrote: > On Fri, Nov 4, 2016 at 1:34 PM, Marc Glisse wrote: >> Hello, >> >> this kind of simplification is already handled by fold_comparison, but the >> code is common with TRUNC_DIV_EXPR, which yields suboptimal code. In >> particular, for an unsigned number, X/8==0 means x<=7, while X/[ex]8==0 can >> be turned into X==0. >> >> When we have an explicit division by 0, there is a better optimisation >> possible (insert __builtin_unreachable() or __builtin_trap() after that >> statement, as in find_explicit_erroneous_behavior), so I don't touch it. >> >> For the overflow inequality case, it would have been a bit clearer to >> generate >> (cmp { build_zero_cst (TREE_TYPE (@0)); } @2) >> and let that be constant folded instead of having that ugly and error-prone >> test in constant_boolean_node, but I saved one tree ;-) >> >> Bootstrap+regtest on powerpc64le-unknown-linux-gnu, all the regressions are >> in the libitm part of the testsuite, they should be fixed by >> https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html , I'll rerun the >> testsuite when that patch is in. > > Ok. > > You fail to handle A /[ex] -2 < 2? Is that on purpose? Or just lazy so you > dont' have to handle inverting the comparison? Oh, I suppose nothing generates exact divisions by negative numbers. Richard. > Thanks, > Richard. > >> 2016-11-07 Marc Glisse >> >> gcc/ >> * fold-const.c (fold_comparison): Ignore EXACT_DIV_EXPR. >> * match.pd (A /[ex] B CMP C): New simplifications. >> >> gcc/testsuite/ >> * gcc.dg/tree-ssa/cmpexactdiv.c: New file. >> >> -- >> Marc Glisse
Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)
On Fri, 4 Nov 2016, Jakub Jelinek wrote: > On Fri, Nov 04, 2016 at 01:59:05PM +0100, Richard Biener wrote: > > > So like the variant patch I've just posted? > > > > That doesn't handle > > > > int __attribute__((const,noinline)) > > foo () > > { > > return 1; > > } > > > > int bar() > > { > > *((int *)foo) + 4 = 2; > > int ret = foo (); > > *((int *)foo) + 4 = 1; > > return ret; > > } > > > > right? patching foo to return 2, calling foo and then unpatching it? > > If you mean *((int *)foo + 4) = rather than *((int *)foo) + 4, > then indeed it doesn't. But GIMPLE optimizers remove that far before. > And I'm not sure if we want to pessimize code just for such hypothetical > cases; users can always add explicit optimization barriers; > or maybe just have a flag whether function text is writable > and disable that flag by default on non-bare metal? > Unless the function is in a writable section, if it isn't the kernel > or embedded target, most likely the function body won't be writable anyway. Not sure if it will pessimize much, but yes, if we want to support it there's the complication that the const call has no VOPs on GIMPLE and I'm not sure we want to force VUSEs on all const calls... > Note I'm not sure that even without the patch we'd handle it at the > DSE/sched time - the thing is that the call's argument is a mem:QI, > so effectively represents just the first byte of the function. So > if you stored to the first byte of the function, it might handle it > "right", but if you store 4 bytes at offset 4, nonoverlapping_memrefs_p > might still figure out that it is comparing a offset 4 4 byte memory > with offset 0 1 byte memory and tell they don't overlap. Yeah, but I hope MEM_SIZE_KNOWN_P is false for the MEM, that should do the trick (at least that's what I remember - I'm not that fluent with RTL). Richard.
[PATCH][GCC] Fix native Windows x86 bootstrap failure with self test
Hi all, The GCC self-test added in r237144 breaks the native Windows x86 builds (e.g. mingw-w64). This fixes (PR78196) by explicitly adding /dev/null as the output file to the GCC self test. The test essentially does `-xc -S -c /dev/null -fself-test` `/dev/null` is then converted into the Windows null device `nul` by the MSYS shell, which is correct. But then the driver adds a filename to the name, trying to write the output to `nul.s`. `nul` is a reserved filename on Windows. As such it's invalid to create this file and the call always fails using CreateFile. Checked with x86_64-w64-mingw32 and build gets passed self tests but dies at unrelated libstdc++ and libvtv errors. Ok for trunk? Thanks, Tamar gcc/ 2016-11-03 Tamar Christina PR driver/78196 * Makefile.in (SELFTEST_FLAGS): Added -o /dev/null. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 622d038f6fb50c35ff48bc84a69967071aa17e90..7ecd1e4e726e4cb31b1c9fe22dcb777e71468fb2 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1879,7 +1879,10 @@ rest.cross: specs # Specify a dummy input file to placate the driver. # Specify -nostdinc to work around missing WIND_BASE environment variable # required for *-wrs-vxworks-* targets. -SELFTEST_FLAGS = -nostdinc -x c /dev/null -S -fself-test +# Specify -o /dev/null so the output of -S is discarded. More importantly +# It does not try to create a file with the name "null.s" on POSIX and +# "nul.s" on Windows. Because on Windows "nul" is a reserved file name. +SELFTEST_FLAGS = -nostdinc -x c /dev/null -S -fself-test -o /dev/null # Run the selftests during the build once we have a driver and a cc1, # so that self-test failures are caught as early as possible.
[PATCH] Improve vect cost model for PR37150
The following implements some easy improvements for the SLP cost model for PR37150 which shows excess cost for dead loads and permutes accounted when vectorizing a basic-block. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Still doesn't vectorize the testcase in that PR without -fno-vect-cost-model though. Real improvements are only possible with re-doing the vectorizer data structures. Richard. 2016-11-04 Richard Biener PR tree-optimization/37150 * tree-vectorizer.h (vect_transform_slp_perm_load): Add n_perms parameter. * tree-vect-slp.c (vect_supported_load_permutation_p): Adjust. (vect_analyze_slp_cost_1): Account for the real number of permutations emitted and for dead loads. (vect_transform_slp_perm_load): Add n_perms parameter counting the number of emitted permutations. * tree-vect-stmts.c (vectorizable_load): Adjust. Index: gcc/tree-vectorizer.h === --- gcc/tree-vectorizer.h (revision 241791) +++ gcc/tree-vectorizer.h (working copy) @@ -1166,7 +1168,7 @@ extern int vect_get_known_peeling_cost ( extern void vect_free_slp_instance (slp_instance); extern bool vect_transform_slp_perm_load (slp_tree, vec , gimple_stmt_iterator *, int, - slp_instance, bool); + slp_instance, bool, unsigned *); extern bool vect_slp_analyze_operations (vec slp_instances, void *); extern bool vect_schedule_slp (vec_info *); Index: gcc/tree-vect-slp.c === --- gcc/tree-vect-slp.c (revision 241791) +++ gcc/tree-vect-slp.c (working copy) @@ -1461,8 +1461,9 @@ vect_supported_load_permutation_p (slp_i { /* Verify the permutation can be generated. */ vec tem; + unsigned n_perms; if (!vect_transform_slp_perm_load (node, tem, NULL, -1, slp_instn, true)) +1, slp_instn, true, &n_perms)) { dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -1475,11 +1476,13 @@ vect_supported_load_permutation_p (slp_i } /* For loop vectorization verify we can generate the permutation. */ + unsigned n_perms; FOR_EACH_VEC_ELT (SLP_INSTANCE_LOADS (slp_instn), i, node) if (node->load_permutation.exists () && !vect_transform_slp_perm_load (node, vNULL, NULL, - SLP_INSTANCE_UNROLLING_FACTOR (slp_instn), slp_instn, true)) + SLP_INSTANCE_UNROLLING_FACTOR (slp_instn), slp_instn, true, + &n_perms)) return false; return true; @@ -1548,14 +1551,38 @@ vect_analyze_slp_cost_1 (slp_instance in stmt = GROUP_FIRST_ELEMENT (stmt_info); stmt_info = vinfo_for_stmt (stmt); /* Record the cost for the permutation. */ - record_stmt_cost (body_cost_vec, ncopies_for_cost, vec_perm, + unsigned n_perms; + vect_transform_slp_perm_load (node, vNULL, NULL, + ncopies_for_cost, instance, true, + &n_perms); + record_stmt_cost (body_cost_vec, n_perms, vec_perm, stmt_info, 0, vect_body); - /* And adjust the number of loads performed. */ unsigned nunits = TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info)); - ncopies_for_cost - = (GROUP_SIZE (stmt_info) - GROUP_GAP (stmt_info) - + nunits - 1) / nunits; + /* And adjust the number of loads performed. This handles +redundancies as well as loads that are later dead. */ + auto_sbitmap perm (GROUP_SIZE (stmt_info)); + bitmap_clear (perm); + for (i = 0; i < SLP_TREE_LOAD_PERMUTATION (node).length (); ++i) + bitmap_set_bit (perm, SLP_TREE_LOAD_PERMUTATION (node)[i]); + ncopies_for_cost = 0; + bool load_seen = false; + for (i = 0; i < GROUP_SIZE (stmt_info); ++i) + { + if (i % nunits == 0) + { + if (load_seen) + ncopies_for_cost++; + load_seen = false; + } + if (bitmap_bit_p (perm, i)) + load_seen = true; + } + if (load_seen) + ncopies_for_cost++; + gcc_assert (ncopies_for_cost + <= (GROUP_SIZE (stmt_info) - GROUP_GAP (stmt_info) + + nunits - 1) / nunits)
Re: [PATCH] Convert character arrays to string csts
On 11/03/2016 01:46 PM, Richard Biener wrote: > On Thu, Nov 3, 2016 at 1:12 PM, Martin Liška wrote: >> Hello. >> >> This is small follow-up of the patches I sent to string folding. >> The patch transforms character array defined in an initializer to string >> constant: >> >> +const char global[] = {'a', 'b', 'c', 'd', '\0'}; >> >> Apart from that, it also enables string folding of local symbols like: >> + const char local[] = "abcd"; >> >> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. > > Hmm, does this handle > > const char global[] = {'a', [4] = 'd', 'b', [3] = '\0'}; > > correctly? It hasn't, fixed in the v2. As ctor indices are sorted in increasing order, I only check that there are no holes. > > I think you need to check that 'key' is increasing and there are no gaps > (ISTR constructor elements are sorted but gaps can still appear). > > + || !useless_type_conversion_p (TREE_TYPE (value), char_type_node)) > > please instead check the element type of the array constructor. I'd also > use a stricter check like TYPE_MAIN_VARIANT (type) == char_type_node > to avoid changing non-strings like unsigned / signed char. Ok, v2 does: TYPE_MAIN_VARIANT (type) == char_type_node > > Finally I'm a bit worried about doing this for all 'char' > constructors. Maybe we > should restrict this to those with ISPRINT () entries? Also done for all characters except the trailing \0 character. I'll send patch in the next email. Martin > > Richard. > > >> Martin
Re: Simplify X /[ex] 8 == 0
On Fri, 4 Nov 2016, Richard Biener wrote: On Fri, Nov 4, 2016 at 2:15 PM, Richard Biener wrote: On Fri, Nov 4, 2016 at 1:34 PM, Marc Glisse wrote: Hello, this kind of simplification is already handled by fold_comparison, but the code is common with TRUNC_DIV_EXPR, which yields suboptimal code. In particular, for an unsigned number, X/8==0 means x<=7, while X/[ex]8==0 can be turned into X==0. When we have an explicit division by 0, there is a better optimisation possible (insert __builtin_unreachable() or __builtin_trap() after that statement, as in find_explicit_erroneous_behavior), so I don't touch it. For the overflow inequality case, it would have been a bit clearer to generate (cmp { build_zero_cst (TREE_TYPE (@0)); } @2) and let that be constant folded instead of having that ugly and error-prone test in constant_boolean_node, but I saved one tree ;-) Bootstrap+regtest on powerpc64le-unknown-linux-gnu, all the regressions are in the libitm part of the testsuite, they should be fixed by https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html , I'll rerun the testsuite when that patch is in. Ok. You fail to handle A /[ex] -2 < 2? Is that on purpose? Or just lazy so you dont' have to handle inverting the comparison? Oh, I suppose nothing generates exact divisions by negative numbers. Yes, that :-) I think I'd rather wait until I can test it before simplifying those too much. -- Marc Glisse
Re: [PATCH] combine lhs zero_extract fix (PR78186)
On 3 November 2016 at 17:01, Segher Boessenkool wrote: > I managed to forget to mask the thing inserted. Tested on powerpc64-linux > {-m32,-m64}, and Bin tested on arm. Applying to trunk. > > > Segher > > > 2016-11-03 Segher Boessenkool > > PR rtl-optimization/78186 > * combine.c (change_zero_ext): Mask the RHS of a zero_extract as > well, when converting to IOR. > Hi, Since this commit I have noticed execution failures on "old" arm targets: gcc.dg/torture/pr48124-4.c -O1 execution test gcc.dg/torture/pr48124-4.c -O2 execution test gcc.dg/torture/pr48124-4.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test gcc.dg/torture/pr48124-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test gcc.dg/torture/pr48124-4.c -O3 -g execution test gcc.dg/torture/pr48124-4.c -Os execution test For instance on target arm-none-linux-gnueabi --with-cpu=cortex-a9 --with-mode=arm and running the tests with -march=armv5t or on arm-none-eabi with default mode/cpu and no special runtest flags if that's easier. Christophe > --- > gcc/combine.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/gcc/combine.c b/gcc/combine.c > index 7c21fe4..7ed0a62 100644 > --- a/gcc/combine.c > +++ b/gcc/combine.c > @@ -11224,6 +11224,9 @@ change_zero_ext (rtx pat) >rtx x = gen_rtx_AND (mode, reg, immed_wide_int_const (mask, mode)); >rtx y = simplify_gen_binary (ASHIFT, mode, SET_SRC (pat), >GEN_INT (offset)); > + wide_int mask2 = wi::shifted_mask (offset, width, false, reg_width); > + y = simplify_gen_binary (AND, mode, y, > + immed_wide_int_const (mask2, mode)); >rtx z = simplify_gen_binary (IOR, mode, x, y); >SUBST (SET_DEST (pat), reg); >SUBST (SET_SRC (pat), z); > -- > 1.9.3 >
Re: [PATCH] Convert character arrays to string csts
On 11/03/2016 02:00 PM, Jan Hubicka wrote: >> On 11/03/2016 01:12 PM, Martin Liška wrote: >>> + tree init = DECL_INITIAL (decl); >>> + if (init >>> + && TREE_READONLY (decl) >>> + && can_convert_ctor_to_string_cst (init)) >>> +DECL_INITIAL (decl) = build_string_cst_from_ctor (init); >> >> I'd merge these two new functions since they're only ever called >> together. We'd then have something like >> >> if (init && TREE_READONLY (decl)) >> init = convert_ctor_to_string_cst (init); >> if (init) >> DECL_INITIAL (decl) = init; Done. >> >> I'll defer to Jan on whether finalize_decl seems like a good place >> to do this. > > I think finalize_decl may be bit too early because frontends may expects the > ctors to be in a way they produced them. We only want to convert those arrays > seen by middle-end so I would move the logic to varpool_node::analyze > > Otherwise the patch seems fine to me. > > Honza >> >>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c >>> b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c >>> index 283bd1c..b2d1fd5 100644 >>> --- a/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c >>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c >>> @@ -4,12 +4,15 @@ >>> char *buffer1; >>> char *buffer2; >>> >>> +const char global[] = {'a', 'b', 'c', 'd', '\0'}; >>> + >>> #define SIZE 1000 >>> >>> int >>> main (void) >>> { >>> const char* const foo1 = "hello world"; >>> + const char local[] = "abcd"; >>> >>> buffer1 = __builtin_malloc (SIZE); >>> __builtin_strcpy (buffer1, foo1); >>> @@ -45,6 +48,10 @@ main (void) >>> __builtin_abort (); >>> if (__builtin_memchr (foo1, null, 12) != foo1 + 11) >>> __builtin_abort (); >>> + if (__builtin_memchr (global, null, 5) == 0) >>> +__builtin_abort (); >>> + if (__builtin_memchr (local, null, 5) == 0) >>> +__builtin_abort (); >> >> How is that a meaningful test? This seems to work even with an >> unpatched gcc. I'd have expected something that shows a benefit for >> doing this conversion, and maybe also a test that shows it isn't >> done in cases where it's not allowed. It's meaningful as it scans that there's no __builtin_memchr in optimized dump. I'm adding new tests that does the opposite test. >> >>> tree >>> -build_string_literal (int len, const char *str) >>> +build_string_literal (int len, const char *str, bool build_addr_expr) >> >> New arguments should be documented in the function comment. Yep, improved. >> >>> +/* Return TRUE when CTOR can be converted to a string constant. */ >> >> "if", not "when". Done. >> >>> + unsigned HOST_WIDE_INT elements = CONSTRUCTOR_NELTS (ctor); >>> + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), idx, key, value) >>> +{ >>> + if (key == NULL_TREE >>> + || TREE_CODE (key) != INTEGER_CST >>> + || !tree_fits_uhwi_p (value) >>> + || !useless_type_conversion_p (TREE_TYPE (value), char_type_node)) >>> + return false; >> >> Shouldn't all elements have the same type, or do you really have to >> call useless_type_conversion in a loop? >> >>> + /* Allow zero character just at the end of a string. */ >>> + if (integer_zerop (value)) >>> + return idx == elements - 1; >> >> Don't you also have to explicitly check it's there? >> >> >> Bernd Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Martin >From 4a2bd43ad10cfb0467dde15ca0be0deba8194ea7 Mon Sep 17 00:00:00 2001 From: marxin Date: Mon, 17 Oct 2016 15:24:46 +0200 Subject: [PATCH] Convert character arrays to string csts gcc/testsuite/ChangeLog: 2016-11-04 Martin Liska * gcc.dg/tree-ssa/builtins-folding-gimple.c (main): Add new tests. * gcc.dg/tree-ssa/builtins-folding-gimple-ub.c (main): Likewise. gcc/ChangeLog: 2016-11-04 Martin Liska * gimplify.c (gimplify_decl_expr): Emit INIT_EXPR just if it cannot be converted to a string constant. (gimplify_init_constructor): Create string constant for local variables (if possible). * tree.c (convert_ctor_to_string_cst): New function. (build_string_literal): Add new argument. (can_convert_ctor_to_string_cst): New function. * tree.h: Declare new functions. * varpool.c (ctor_for_folding): Return ctor for local variables. (varpool_node::analyze): Convert character array ctors to a string constant (if possible). --- gcc/gimplify.c | 16 - .../gcc.dg/tree-ssa/builtins-folding-gimple-ub.c | 20 +- .../gcc.dg/tree-ssa/builtins-folding-gimple.c | 7 ++ gcc/tree.c | 83 -- gcc/tree.h | 5 +- gcc/varpool.c | 14 +++- 6 files changed, 134 insertions(+), 11 deletions(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 1531582..32f0909 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -1495,7 +1495,8 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p) { if (!
Re: [patch, libgfortran] Bug 78123 - Short reads with T edit descriptor not padding correctly
On 30 October 2016 at 21:40, Thomas Koenig wrote: > Hi Jerry, > >> See patch below which is very simple. We were handling the case where we >> tabbed left of current position in the record by clearing the seen_eor >> flag. We were not handling the case where we tab toward the right and so >> are still at eor. >> >> New test case attached. Regression tested on x86-64-linux. >> >> OK to commit? >> >> This is a regression relative to g77. Its simple enough I think it >> should go to 5 and 6 as well. > > > I think so too. OK for all affected branches. > > Thanks for the patch! > Hi, The new test fails at execution on arm and aarch64: gfortran.dg/fmt_t_9.f -O0 execution test gfortran.dg/fmt_t_9.f -O1 execution test gfortran.dg/fmt_t_9.f -O2 execution test gfortran.dg/fmt_t_9.f -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test gfortran.dg/fmt_t_9.f -O3 -g execution test gfortran.dg/fmt_t_9.f -Os execution test seen on gcc-5 and gcc-6 branches. Maybe I missed it on trunk due to larger granularity in validations. Christophe > Thomas >
[PING][PATCH][1/2] GIMPLE Frontend, C FE parts (and GIMPLE parser)
On Fri, 28 Oct 2016, Richard Biener wrote: > > These are the C (and ObjC) Frontend changes required by the GIMPLE > Frontend which is now itself contained in c/gimple-parser.[ch]. > > Most changes are due to a new c-parser.h header where we export > stuff from the C parser that the GIMPLE frontend requires. Other > changes include new __GIMPLE and __PHI keywords, handling > __GIMPLE as new declspec and dispatching to the GIMPLE parser > for __GIMPLE marked function definitions. > > We'd like to include the GIMPLE parser for GCC 7, as the parser > is pretty self-contained (and now works to a good extent) it can > be improved during stage3 or when testcases show that it needs > improvement. > > Bootstrapped and tested on x86_64-unknown-linux-gnu (together with [2/2]) > for C, ObjC. > > Ok for trunk? Ping. Thanks, Richard. > Thanks, > Richard. > > 2016-10-28 Prasad Ghangal > Richard Biener > > c/ > * Make-lang.in (C_AND_OBJC_OBJS): Add gimple-parser.o. > * config-lang.in (gtfiles): Add c/c-parser.h. > * c-tree.h (enum c_declspec_word): Add cdw_gimple. > (struct c_declspecs): Add gimple_pass member and gimple_p flag. > * c-parser.c (enum c_id_kind, struct c_token, > c_parser_next_token_is, c_parser_next_token_is_not, > c_parser_next_token_is_keyword, > enum c_lookahead_kind, enum c_dtr_syn, enum c_parser_prec): > Split out to ... > * c-parser.h: ... new header. > * c-parser.c: Include c-parser.h and gimple-parser.h. > (c_parser_peek_token, c_parser_peek_2nd_token, > c_token_starts_typename, c_parser_next_token_starts_declspecs, > c_parser_next_tokens_start_declaration, c_parser_consume_token, > c_parser_error, c_parser_require, c_parser_skip_until_found, > c_parser_declspecs, c_parser_declarator, c_parser_peek_nth_token, > c_parser_cast_expression): Export. > (c_parser_tokens_buf): New function. > (c_parser_error): Likewise. > (c_parser_set_error): Likewise. > (c_parser_declspecs): Handle RID_GIMPLE. > (c_parser_declaration_or_fndef): Parse __GIMPLE marked body > via c_parser_parse_gimple_body. > * c-parser.h (c_parser_peek_token, c_parser_peek_2nd_token, > c_token_starts_typename, c_parser_next_token_starts_declspecs, > c_parser_next_tokens_start_declaration, c_parser_consume_token, > c_parser_error, c_parser_require, c_parser_skip_until_found, > c_parser_declspecs, c_parser_declarator, c_parser_peek_nth_token, > c_parser_cast_expression): Declare. > (struct c_parser): Declare forward. > (c_parser_tokens_buf): Declare. > (c_parser_error): Likewise. > (c_parser_set_error): Likewise. > * gimple-parser.c: New file. > * gimple-parser.h: Likewise. > > obj-c/ > * config-lang.in (gtfiles): Add c/c-parser.h. > > c-family/ > * c-common.h (c_common_resword): Add RID_GIMPLE, RID_PHI types. > * c-common.h (enum rid): Add RID_GIMPLE, RID_PHI. > * c.opt (fgimple): New option. > > * doc/invoke.texi (fgimple): Document. > > diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c > index 307862b..2997c83 100644 > --- a/gcc/c-family/c-common.c > +++ b/gcc/c-family/c-common.c > @@ -435,6 +435,8 @@ const struct c_common_resword c_common_reswords[] = >{ "__underlying_type", RID_UNDERLYING_TYPE, D_CXXONLY }, >{ "__volatile",RID_VOLATILE, 0 }, >{ "__volatile__", RID_VOLATILE, 0 }, > + { "__GIMPLE", RID_GIMPLE, D_CONLY }, > + { "__PHI", RID_PHI,D_CONLY }, >{ "alignas", RID_ALIGNAS,D_CXXONLY | D_CXX11 | D_CXXWARN > }, >{ "alignof", RID_ALIGNOF,D_CXXONLY | D_CXX11 | D_CXXWARN > }, >{ "asm", RID_ASM,D_ASM }, > diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h > index 547bab2..1fbe060 100644 > --- a/gcc/c-family/c-common.h > +++ b/gcc/c-family/c-common.h > @@ -118,6 +118,12 @@ enum rid > >RID_FRACT, RID_ACCUM, RID_AUTO_TYPE, RID_BUILTIN_CALL_WITH_STATIC_CHAIN, > > + /* "__GIMPLE", for the GIMPLE-parsing extension to the C frontend. */ > + RID_GIMPLE, > + > + /* "__PHI", for parsing PHI function in GIMPLE FE. */ > + RID_PHI, > + >/* C11 */ >RID_ALIGNAS, RID_GENERIC, > > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt > index 458d453..24d3b8e 100644 > --- a/gcc/c-family/c.opt > +++ b/gcc/c-family/c.opt > @@ -200,6 +200,10 @@ F > Driver C ObjC C++ ObjC++ Joined Separate MissingArgError(missing path after > %qs) > -F Add to the end of the main framework include path. > > +fgimple > +C Var(flag_gimple) Init(0) > +Enable parsing GIMPLE. > + > H > C ObjC C++ ObjC++ > Print the name of header files as they are used. > diff --git a/gcc/c/Make-lang.in b/gcc/c/Make-lang.in > index 72c9ae7..cd7108b 100644 > --- a/gcc/c/Make-lang.in > +++ b/gcc/c/Make-lang.in > @@ -51,7 +51,
Re: [PATCH][1/2] GIMPLE Frontend, C FE parts (and GIMPLE parser)
Hi! Just 2 nits: On Fri, Oct 28, 2016 at 01:46:57PM +0200, Richard Biener wrote: > +/* Return a pointer to the Nth token in PARERs tokens_buf. */ PARSERs ? > @@ -454,7 +423,7 @@ c_lex_one_token (c_parser *parser, c_token *token) > /* Return a pointer to the next token from PARSER, reading it in if > necessary. */ > > -static inline c_token * > +c_token * > c_parser_peek_token (c_parser *parser) > { >if (parser->tokens_avail == 0) I wonder if turning all of these into non-inlines is a good idea. Can't you move them to the common header instead? The rest I defer to Joseph or Marek. Jakub
Re: [libgcc] Protect __TMC_END__ - __TMC_LIST__ == 0
Ping https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html On Thu, 27 Oct 2016, Marc Glisse wrote: Hello, some optimization patch I was working on simplified __TMC_END__ - __TMC_LIST__ == 0 to false, which is not wanted (I assume that's why it wasn't written __TMC_END__ == __TMC_LIST__ in the first place). Bootstrap+regtest on powerpc64le-unknown-linux-gnu. 2016-10-27 Marc Glisse PR libgcc/77813 * crtstuff.c (deregister_tm_clones, register_tm_clones): Hide __TMC_END__ behind a passthrough asm. -- Marc Glisse
Re: [patch, libgfortran] Bug 78123 - Short reads with T edit descriptor not padding correctly
On 4 November 2016 at 14:48, Christophe Lyon wrote: > On 30 October 2016 at 21:40, Thomas Koenig wrote: >> Hi Jerry, >> >>> See patch below which is very simple. We were handling the case where we >>> tabbed left of current position in the record by clearing the seen_eor >>> flag. We were not handling the case where we tab toward the right and so >>> are still at eor. >>> >>> New test case attached. Regression tested on x86-64-linux. >>> >>> OK to commit? >>> >>> This is a regression relative to g77. Its simple enough I think it >>> should go to 5 and 6 as well. >> >> >> I think so too. OK for all affected branches. >> >> Thanks for the patch! >> > > Hi, > > The new test fails at execution on arm and aarch64: > > gfortran.dg/fmt_t_9.f -O0 execution test > gfortran.dg/fmt_t_9.f -O1 execution test > gfortran.dg/fmt_t_9.f -O2 execution test > gfortran.dg/fmt_t_9.f -O3 -fomit-frame-pointer -funroll-loops > -fpeel-loops -ftracer -finline-functions execution test > gfortran.dg/fmt_t_9.f -O3 -g execution test > gfortran.dg/fmt_t_9.f -Os execution test > > seen on gcc-5 and gcc-6 branches. Maybe I missed it on trunk due to > larger granularity in validations. > I see it failing on trunk too. > Christophe > > > >> Thomas >>
Re: [PATCH, RFC] Fix PR71915, PR71490: Handle casts on strides consistently
On 24 October 2016 at 12:55, Richard Biener wrote: > On Fri, Oct 21, 2016 at 11:46 PM, Bill Schmidt > wrote: >> Hi, >> >> I've been meaning for some time now to do a better job handling strength >> reduction candidates where the stride of the candidate and its basis >> involves a cast (usually widening) from another type. The two PRs >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71490 and >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71915 show that we miss >> strength reduction opportunities in such cases. Essentially the way I >> was handling this before required a cast to go back to the original >> type, which was illegal when this was a narrowing operation or a cast >> from a wrapping type to a non-wrapping type. >> >> This patch improves matters by tracking the effective type of the stride >> so that we perform arithmetic in that type directly. This allows us to >> strength-reduce some cases missed before. We still have to be careful >> not to ever perform a narrowing or wrap-to-nonwrap conversion, but these >> cases don't come up nearly so often in practice with this patch. >> >> The patch is pretty straightforward but rather large, so I'd like to >> leave it up for review for a week or so before committing it -- extra >> eyes welcome! >> >> There is an existing test case that checks for the narrowing problem, >> and now fails with this patch as it should. That is, it's now legal to >> insert the initializer that we formerly knew to be bad for this test. >> Since the test no longer serves any purpose, I'm deleting it. >> >> gcc.dg/tree-ssa/slsr-8.c generates quite different intermediate code now >> than when I first added it to the test suite. As a result of various >> optimization changes, it's no longer maintained as a single block; >> instead, the optimizable computations are sunk into two legs of a >> conditional. This exposed another similar case, leading to the bug >> reports. This test now passes, but I had to adjust it for the new code >> generation we get. I also added some commentary to indicate what we >> expect to happen in that test, since it isn't too obvious. >> >> I've bootstrapped this and tested it on powerpc64le-unknown-linux-gnu >> with no regressions. To avoid the Wrath of Octoploid ;), I've also >> tested it against ffmpeg using an x86_64-pc-linux-gnu cross with -O3 >> -march=amdfam10, also with no failures. I've also verified that we hit >> this code about 35 times in the test suite, so it looks like we have >> some decent additional test coverage. >> >> Thanks in advance to anyone willing to take a look. > > LGTM. > Hi, Since this was committed, I'm seeing a failure in gcc.dg/tree-ssa/slsr-8.c scan-tree-dump-times optimized " \\* " 9 on aarch64 targets. Christophe > Richard. > >> Bill >> >> >> [gcc] >> >> 2016-10-21 Bill Schmidt >> >> PR tree-optimization/71915 >> * gimple-ssa-strength-reduction.c (struct slsr_cand_d): Add >> stride_type field. >> (find_basis_for_base_expr): Require stride types to match when >> seeking a basis. >> (alloc_cand_and_find_basis): Record the stride type. >> (slsr_process_phi): Pass stride type to alloc_cand_and_find_basis. >> (backtrace_base_for_ref): Pass types to legal_cast_p_1 rather than >> the expressions having those types. >> (slsr_process_ref): Pass stride type to alloc_cand_and_find_basis. >> (create_mul_ssa_cand): Likewise. >> (create_mul_imm_cand): Likewise. >> (create_add_ssa_cand): Likewise. >> (create_add_imm_cand): Likewise. >> (legal_cast_p_1): Change interface to accept types rather than the >> expressions having those types. >> (legal_cast_p): Pass types to legal_cast_p_1. >> (slsr_process_cast): Pass stride type to >> alloc_cand_and_find_basis. >> (slsr_process_copy): Likewise. >> (dump_candidate): Display stride type when a cast exists. >> (create_add_on_incoming_edge): Introduce a cast when necessary for >> the stride type. >> (analyze_increments): Change the code checking for invalid casts >> to rely on the stride type, and update the documentation and >> example. Change the code checking for pointer multiplies to rely >> on the stride type. >> (insert_initializers): Introduce a cast when necessary for the >> stride type. Use the stride type for the type of the initializer. >> >> [gcc/testsuite] >> >> 2016-10-21 Bill Schmidt >> >> PR tree-optimization/71915 >> * gcc.dg/tree-ssa/pr54245.c: Delete. >> * gcc.dg/tree-ssa/slsr-8.c: Adjust for new optimization and >> document why. >> >> >> Index: gcc/gimple-ssa-strength-reduction.c >> === >> --- gcc/gimple-ssa-strength-reduction.c (revision 241379) >> +++ gcc/gimple-ssa-strength-reduction.c (working copy) >> @@ -246,6 +246,13 @@
Re: [PATCH] Five patches for std::experimental::filesystem
On 2 November 2016 at 10:09, Christophe Lyon wrote: > On 27 October 2016 at 15:34, Jonathan Wakely wrote: >> On 26/10/16 09:24 +0200, Christophe Lyon wrote: >>> >>> Hi Jonathan, >>> >>> On 25 October 2016 at 17:32, Jonathan Wakely wrote: Two more fixes for the filesystem TS, and improved tests. Handle negative times in filesystem::last_write_time * src/filesystem/ops.cc (last_write_time(const path&, file_time_type, error_code&)): Handle negative times correctly. * testsuite/experimental/filesystem/operations/last_write_time.cc: Test writing file times. Fix error handling in copy_file and equivalent * src/filesystem/ops.cc (do_copy_file): Report an error if source or destination is not a regular file (LWG 2712). (equivalent): Fix error handling and result when only one file exists. * testsuite/experimental/filesystem/operations/copy.cc: Remove files created by tests. Test copying directories. * testsuite/experimental/filesystem/operations/copy_file.cc: Remove files created by tests. * testsuite/experimental/filesystem/operations/equivalent.cc: New. * testsuite/experimental/filesystem/operations/is_empty.cc: New. * testsuite/experimental/filesystem/operations/read_symlink.cc: Remove file created by test. * testsuite/experimental/filesystem/operations/remove_all.cc: New. * testsuite/util/testsuite_fs.h (~scoped_file): Only try to remove file if path is non-empty, to support removal by other means. Tested x86_64-linux, committed to trunk. >>> I can see failures in >>> experimental/filesystem/operations/last_write_time.cc after your >>> committed this patch: >>> >>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/experimental/filesystem/operations/last_write_time.cc:127: >>> void test02(): Assertion 'last_write_time(f.path) == time' failed. >>> on arm*linux* and aarch64*linux* targets. >> >> >> That test will fail for targets where _GLIBCXX_USE_UTIMENSAT is not >> defined, as they use utime() instead which only supports second >> granularity. >> >> This should solve it, by only checking that the file times are within >> one second of the expected value. >> > > Hi Jonathan, > Indeed your patch fixes the problem I reported. > Sorry for the delay, I was on holidays. > but experimental/filesystem/iterators/recursive_directory_iterator.cc now fails at execution on "old" arm targets (when forcing -march=armv5t). Christophe > Thanks, > > Christophe > > >> >> Tested x86_64-linux, committed to trunk.
Re: [PATCH, RFC] Fix PR71915, PR71490: Handle casts on strides consistently
On Nov 4, 2016, at 9:10 AM, Christophe Lyon wrote: > > > Hi, > > Since this was committed, I'm seeing a failure in > gcc.dg/tree-ssa/slsr-8.c scan-tree-dump-times optimized " \\* " 9 > on aarch64 targets. Hi Christophe, Best open a bug report or I will likely lose this in the crush. If you don't mind, please include the generated assembly file so I can see why the number of *'s is off, and provide the output from -fdump-tree-slsr-details. I imagine this is a minor issue. Please CC me as wschm...@gcc.gnu.org. Thanks! Bill
Re: [libgcc] Protect __TMC_END__ - __TMC_LIST__ == 0
On Fri, Nov 4, 2016 at 7:08 AM, Marc Glisse wrote: > Ping https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html I think this is obvious. Thanks, Andrew > > > On Thu, 27 Oct 2016, Marc Glisse wrote: > >> Hello, >> >> some optimization patch I was working on simplified __TMC_END__ - >> __TMC_LIST__ == 0 to false, which is not wanted (I assume that's why it >> wasn't written __TMC_END__ == __TMC_LIST__ in the first place). >> >> Bootstrap+regtest on powerpc64le-unknown-linux-gnu. >> >> 2016-10-27 Marc Glisse >> >> PR libgcc/77813 >> * crtstuff.c (deregister_tm_clones, register_tm_clones): Hide >> __TMC_END__ behind a passthrough asm. > > > -- > Marc Glisse
Re: [PATCH, RFC] Fix PR71915, PR71490: Handle casts on strides consistently
On 4 November 2016 at 15:36, Bill Schmidt wrote: > On Nov 4, 2016, at 9:10 AM, Christophe Lyon > wrote: >> >> >> Hi, >> >> Since this was committed, I'm seeing a failure in >> gcc.dg/tree-ssa/slsr-8.c scan-tree-dump-times optimized " \\* " 9 >> on aarch64 targets. > > Hi Christophe, > > Best open a bug report or I will likely lose this in the crush. If you don't > mind, > please include the generated assembly file so I can see why the number of *'s > is off, and provide the output from -fdump-tree-slsr-details. I imagine this > is > a minor issue. Please CC me as wschm...@gcc.gnu.org. > OK, I've created pr78210 for this. Christophe > Thanks! > Bill
[Ping] Re: [PATCH] Start adding target-specific selftests
(ping) https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01785.html On Fri, 2016-10-21 at 13:45 -0400, David Malcolm wrote: > On Fri, 2016-10-21 at 12:04 +0200, Bernd Schmidt wrote: > > On 10/21/2016 02:36 AM, David Malcolm wrote: > > > + /* Test dumping of hard regs. This is inherently target > > > -specific due > > > + to the name. */ > > > +#ifdef I386_OPTS_H > > > + ASSERT_RTL_DUMP_EQ ("(reg:SI ax)", gen_raw_REG (SImode, 0)); > > > +#endif > > > > Generally putting in target dependencies like this isn't something > > we > > like to do. The patch is OK without this part, and we can revisit > > this, > > but maybe there wants to be a target hook for running target > > -specific > > selftests. > > Thanks. I removed the above target-specific part, and committed it > as r241405 (having reverified bootstrap®rtest on x86_64-pc-linux > -gnu). > > The following patch implements a target hook for running target > -specific > selftests. > > It implements the above test for dumping of hard regs, putting it > within i386.c. > > It's rather trivial, but I have followups that add further > target-specific tests, so hopefully this foundation is OK. > > Successfully bootstrapped®rtested on x86_64-pc-linux-gnu. > > OK for trunk? > > > > + ASSERT_RTL_DUMP_EQ ("(cjump_insn (set (pc)\n" > > > + "(label_ref 0))\n" > > > + " (nil))\n", > > > + jump_insn); > > > } > > > > I do wonder about the (nil)s and whether we can eliminate them. > > I hope to. > > gcc/ChangeLog: > * config/i386/i386.c: Include "selftest.h" and "selftest > -rtl.h". > (selftest::ix86_test_dumping_hard_regs): New function. > (selftest::ix86_run_selftests): New function. > (TARGET_RUN_TARGET_SELFTESTS): When CHECKING_P, wire this up to > selftest::ix86_run_selftests. > * doc/tm.texi.in (TARGET_RUN_TARGET_SELFTESTS): New. > * doc/tm.texi: Regenerate > * rtl-tests.c: Include "selftest-rtl.h". > (selftest::assert_rtl_dump_eq): Make non-static. > (ASSERT_RTL_DUMP_EQ): Move to selftest-rtl.h. > (selftest::test_dumping_regs): Update comment. > * selftest-rtl.h: New file. > * selftest-run-tests.c: Include "target.h". > (selftest::run_tests): If non-NULL, call > targetm.run_target_selftests. > * target.def (run_target_selftests): New hook. > --- > gcc/config/i386/i386.c | 34 ++ > gcc/doc/tm.texi | 4 > gcc/doc/tm.texi.in | 2 ++ > gcc/rtl-tests.c | 10 +++--- > gcc/selftest-rtl.h | 45 > + > gcc/selftest-run-tests.c | 5 + > gcc/target.def | 6 ++ > 7 files changed, 99 insertions(+), 7 deletions(-) > create mode 100644 gcc/selftest-rtl.h > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index 3e6f8fd..8f6ceb4 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -77,6 +77,8 @@ along with GCC; see the file COPYING3. If not see > #include "case-cfn-macros.h" > #include "regrename.h" > #include "dojump.h" > +#include "selftest.h" > +#include "selftest-rtl.h" > > /* This file should be included last. */ > #include "target-def.h" > @@ -50365,6 +50367,33 @@ ix86_addr_space_zero_address_valid > (addr_space_t as) > #undef TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID > #define TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID > ix86_addr_space_zero_address_valid > > +/* Target-specific selftests. */ > + > +#if CHECKING_P > + > +namespace selftest { > + > +/* Verify that hard regs are dumped as expected (in compact mode). > */ > + > +static void > +ix86_test_dumping_hard_regs () > +{ > + ASSERT_RTL_DUMP_EQ ("(reg:SI ax)", gen_raw_REG (SImode, 0)); > + ASSERT_RTL_DUMP_EQ ("(reg:SI dx)", gen_raw_REG (SImode, 1)); > +} > + > +/* Run all target-specific selftests. */ > + > +static void > +ix86_run_selftests (void) > +{ > + ix86_test_dumping_hard_regs (); > +} > + > +} // namespace selftest > + > +#endif /* CHECKING_P */ > + > /* Initialize the GCC target structure. */ > #undef TARGET_RETURN_IN_MEMORY > #define TARGET_RETURN_IN_MEMORY ix86_return_in_memory > @@ -50840,6 +50869,11 @@ ix86_addr_space_zero_address_valid > (addr_space_t as) > #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS > #define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1 > > +#if CHECKING_P > +#undef TARGET_RUN_TARGET_SELFTESTS > +#define TARGET_RUN_TARGET_SELFTESTS selftest::ix86_run_selftests > +#endif /* #if CHECKING_P */ > + > struct gcc_target targetm = TARGET_INITIALIZER; > > #include "gt-i386.h" > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > index 29dc73b..7efcf57 100644 > --- a/gcc/doc/tm.texi > +++ b/gcc/doc/tm.texi > @@ -11821,3 +11821,7 @@ All and all it does not take long to convert > ports that the > maintainer is familiar with. > > @end defmac > + > +@deftypefn {Target Hook} void TARGET_RUN_TARGET_SELFTESTS (void) > +If selftests are enabled, run
Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
Hi Pekka, On Sat, Oct 29, 2016 at 02:58:29PM +0300, Pekka Jääskeläinen wrote: > Hi Martin, > > Thanks for the comments and suggestions. Replies inline: > > On Thu, Oct 20, 2016 at 6:10 PM, Martin Jambor wrote: > > - Still quite few things need to be documented better, e.g.: > > + brig_to_generic::get_mangled_name_tmpl and to a lesser extent > > brig_to_generic::get_mangled_name. It should be clear what is the > > intended difference in usage of the two (specially since the > > former is a template and so the parameter does not give that much > > of a hint to the reader) > > Added more comments. thanks. > > > + the visitor classes need some description so that the first time > > readers see them, they understand what they are for and what they > > visit (i.e. what "visiting" even means). > > This is an adaptation of the classic gang of four Visitor design pattern. > I added a reference to it in a comment. Thanks. As you can see, I am not a C++ person and do not know names of common patterns. For me, "Gang of Four" is a reference to history of China and not something one would want to emulate. > > > - I know it was me who told you to use gcc_assert and gcc_unreachable > > instead of internal_error. The thing is, often the error is clearly > > not an internal error but an error in input. I think that we should > > plan to handle these cases differently and report the issues better > > to the user, give a meaningful error message together width the > > section and the offset there when it was encountered. I am not > > asking you to audit all asserts now and convert those in this > > category but it would be nice to have a mechanism to easily do so > > (and convert a few obvious places), so that we can convert these as > > we bump into them. > > I'm not sure about this. BRIG FE is a rather special case as we > assume HSAILasm has been used to parse and error check the original > HSAIL text to the binary BRIG format which it consumes. > > Of course HSAILasm can have bugs, but how much we should produce human > readable error messages to help debugging HSAILasm is another thing. > In case the BRIG FE > fails to consume the input, it means either the BRIG is corrupted for > a reason or another, > but typically is not a human error (those should be caught be HSAILasm). I personally primarily want this for debugging purposes, and we should try to eventually report errors in a more comprehensible way than HSAILasm. But more generally, and more importantly, if the input, whether human readable or not, is incorrect, the compiler should not produce an "internal error." If that happens, users will file bugs against gcc when in fact it is the generating tool that is broken. One you maintain the FE, you personally will not want this :-) > > "File format not recognized" error is one that might be useful though. > I added a check for the BRIG magic number and the supported version (1.0). > > Perhaps we should add error printouts later on case by case basis when we > see which error cases can be useful and worth reporting in a human readable > graceful manner? It can be as easy as converting the internal_error > to fatal_error or similar in that case. Yeah, that is what I meant, together with printing some information about offsets in different sections where the error is. > > > - A very minor suggestion: In GCC it is customary to write TODO as one > > word. We generally do not use "TO OPTIMIZE", that is just a TODO > > (as opposed to a FIXME, which hints that something is at least a bit > > wrong here). I think you can keep your way if you want but for > > example I do have emacs highlighting set up for the traditional > > formats. > > Converted all to TODO. > > > - You do not seem to handle BRIG_OPCODE_ICALL, or have I missed it? > > That is fine, I don't think anybody else does that now anyway, I > > just got curious reading the ipa analysis. > > Right. > > > - brig-lang.c: > > + x_flag_whole_program = 0; - talk about this with Honza > > I guess this is note to yourself? As we agreed I didn't try whole program > optimizations yet. I might later when optimizing for a target. It will > be really useful, I agree. Now that it used the proper builtin way, it > might work more easily. Right, it was a note for myself. > > > + brig_langhook_type_for_size: has several issues. Either always > > return a type like go or return error_mark_node instead of NULL. > > Also, do not count on long being 64-bit. I would just copy what > > go does. Or lto. > > Copied the go's version, it should work with BRIG too. Great. > > > + convert: did you avoid using convert_to_vector deliberately? The > > size check seems genuinely useful. > > BRIG/HSAIL is a bit special case due to its "untyped" variables (registers), > I use bit level casts in a lot of places to avoid accidental type conversions > or sign extensions. I w
Re: [PATCH] Start adding target-specific selftests
On 10/21/2016 07:45 PM, David Malcolm wrote: gcc/ChangeLog: * config/i386/i386.c: Include "selftest.h" and "selftest-rtl.h". (selftest::ix86_test_dumping_hard_regs): New function. (selftest::ix86_run_selftests): New function. (TARGET_RUN_TARGET_SELFTESTS): When CHECKING_P, wire this up to selftest::ix86_run_selftests. * doc/tm.texi.in (TARGET_RUN_TARGET_SELFTESTS): New. * doc/tm.texi: Regenerate * rtl-tests.c: Include "selftest-rtl.h". (selftest::assert_rtl_dump_eq): Make non-static. (ASSERT_RTL_DUMP_EQ): Move to selftest-rtl.h. (selftest::test_dumping_regs): Update comment. * selftest-rtl.h: New file. * selftest-run-tests.c: Include "target.h". (selftest::run_tests): If non-NULL, call targetm.run_target_selftests. * target.def (run_target_selftests): New hook. Ok. Bernd
Re: [PATCH, rs6000] Fold vector addition built-ins in GIMPLE
Just a note that the "-std=gnu11" option in the test cases in this patch is a leftover from a previous version of the patch, which I'll plan to remove. Sorry for the oversight! Bill > On Nov 1, 2016, at 9:05 PM, Bill Schmidt wrote: > > Hi, > > As Jakub suggested in response to my *ahem* ornate patch for overloaded > function built-ins, a much better approach is to use the existing > machinery for overloading and then immediately fold the specific > functions during gimplification. There is a target hook available for > this purpose that we have not previously used. This patch demonstrates > this functionality by implementing the target hook and folding vector > addition built-ins within it. Future patches will fold other such > operations, improving the optimization available for many vector > intrinsics. > > Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no > regressions. Is this ok for trunk? > > Thanks, > Bill > > > [gcc] > > 2016-11-01 Bill Schmidt > > * config/rs6000/rs6000.c (gimple-ssa.h): New #include. > (TARGET_GIMPLE_FOLD_BUILTIN): Define as > rs6000_gimple_fold_builtin. > (rs6000_gimple_fold_builtin): New function. Add handling for > early expansion of vector addition builtins. > > > [gcc/testsuite] > > 2016-11-01 Bill Schmidt > > * gcc.target/powerpc/fold-vec-add-1.c: New. > * gcc.target/powerpc/fold-vec-add-2.c: New. > * gcc.target/powerpc/fold-vec-add-3.c: New. > * gcc.target/powerpc/fold-vec-add-4.c: New. > * gcc.target/powerpc/fold-vec-add-5.c: New. > * gcc.target/powerpc/fold-vec-add-6.c: New. > * gcc.target/powerpc/fold-vec-add-7.c: New. > > > Index: gcc/config/rs6000/rs6000.c > === > --- gcc/config/rs6000/rs6000.c(revision 241624) > +++ gcc/config/rs6000/rs6000.c(working copy) > @@ -56,6 +56,7 @@ > #include "sched-int.h" > #include "gimplify.h" > #include "gimple-iterator.h" > +#include "gimple-ssa.h" > #include "gimple-walk.h" > #include "intl.h" > #include "params.h" > @@ -1632,6 +1633,8 @@ static const struct attribute_spec rs6000_attribut > > #undef TARGET_FOLD_BUILTIN > #define TARGET_FOLD_BUILTIN rs6000_fold_builtin > +#undef TARGET_GIMPLE_FOLD_BUILTIN > +#define TARGET_GIMPLE_FOLD_BUILTIN rs6000_gimple_fold_builtin > > #undef TARGET_EXPAND_BUILTIN > #define TARGET_EXPAND_BUILTIN rs6000_expand_builtin > @@ -16337,6 +16340,46 @@ rs6000_fold_builtin (tree fndecl, int n_args ATTRI > #endif > } > > +/* Fold a machine-dependent built-in in GIMPLE. (For folding into > + a constant, use rs6000_fold_builtin.) */ > + > +bool > +rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) > +{ > + gimple *stmt = gsi_stmt (*gsi); > + tree fndecl = gimple_call_fndecl (stmt); > + gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == > BUILT_IN_MD); > + enum rs6000_builtins fn_code > += (enum rs6000_builtins) DECL_FUNCTION_CODE (fndecl); > + tree arg0, arg1, lhs; > + > + switch (fn_code) > +{ > +/* Flavors of vec_add. We deliberately don't expand > + P8V_BUILTIN_VADDUQM as it gets lowered from V1TImode to > + TImode, resulting in much poorer code generation. */ > +case ALTIVEC_BUILTIN_VADDUBM: > +case ALTIVEC_BUILTIN_VADDUHM: > +case ALTIVEC_BUILTIN_VADDUWM: > +case P8V_BUILTIN_VADDUDM: > +case ALTIVEC_BUILTIN_VADDFP: > +case VSX_BUILTIN_XVADDDP: > + { > + arg0 = gimple_call_arg (stmt, 0); > + arg1 = gimple_call_arg (stmt, 1); > + lhs = gimple_call_lhs (stmt); > + gimple *g = gimple_build_assign (lhs, PLUS_EXPR, arg0, arg1); > + gimple_set_location (g, gimple_location (stmt)); > + gsi_replace (gsi, g, true); > + return true; > + } > +default: > + break; > +} > + > + return false; > +} > + > /* Expand an expression EXP that calls a built-in function, >with result going to TARGET if that's convenient >(and in mode MODE if that's convenient). > Index: gcc/testsuite/gcc.target/powerpc/fold-vec-add-1.c > === > --- gcc/testsuite/gcc.target/powerpc/fold-vec-add-1.c (revision 0) > +++ gcc/testsuite/gcc.target/powerpc/fold-vec-add-1.c (working copy) > @@ -0,0 +1,46 @@ > +/* Verify that overloaded built-ins for vec_add with char > + inputs produce the right results. */ > + > +/* { dg-do compile } */ > +/* { dg-require-effective-target powerpc_altivec_ok } */ > +/* { dg-additional-options "-std=gnu11" } */ > + > +#include > + > +vector signed char > +test1 (vector bool char x, vector signed char y) > +{ > + return vec_add (x, y); > +} > + > +vector signed char > +test2 (vector signed char x, vector bool char y) > +{ > + return vec_add (x, y); > +} > + > +vector signed char > +test3 (vector signed char x, vector signed char y) > +{ > + return vec_add (x, y); > +} > + > +vector unsigned char > +test4 (vector bool char x,
RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch-likely instructions.
> From: Matthew Fortune > Sent: 03 November 2016 13:07 > To: Toma Tabacu; gcc-patches@gcc.gnu.org > Cc: catherine_mo...@mentor.com > Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need > branch-likely instructions. > > Toma Tabacu writes: > > The gcc.target/mips/wrap-delay.c test was failing on mips-img-* > > toolchains because it was using -mbranch-likely with an R6 target, and > > branch- likely instructions were removed in R6. > > > > This patch makes the testsuite downgrade to R5 if the -mbranch-likely > > option is present and we're targeting R6. > > > > Tested with mips-img-elf and mips-img-linux-gnu. > > Hi Toma, > > Welcome to GCC development, thanks for your first patch. As you can see > from Catherine's reply the change looks good. I'll just cover some > housekeeping issues... > > > gcc/testsuite/ > > * gcc.target/mips/mips.exp: Add check for -mbranch-likely in > > condition for R5 downgrade. > > Changelogs are an art form which will take some getting used to. This is > almost there but needs to reference the function affected. > > * gcc.target/mips/mips.exp (mips-dg-options): Downgrade to R5 > for -mbranch-likely and infer -mno-branch-likely for R6. > > I have extended the comment as well as there is an additional change > needed for this patch ideally. > > > diff --git a/gcc/testsuite/gcc.target/mips/mips.exp > > b/gcc/testsuite/gcc.target/mips/mips.exp > > index 7c24140..382d69c 100644 > > --- a/gcc/testsuite/gcc.target/mips/mips.exp > > +++ b/gcc/testsuite/gcc.target/mips/mips.exp > > @@ -1176,7 +1176,8 @@ proc mips-dg-options { args } { > >|| [mips_have_test_option_p options "-mpaired- > > single"] > >|| [mips_have_test_option_p options "- > > mnan=legacy"] > >|| [mips_have_test_option_p options "- > > mabs=legacy"] > > - || [mips_have_test_option_p options "!HAS_LSA"]) > > } { > > + || [mips_have_test_option_p options "!HAS_LSA"] > > + || [mips_have_test_option_p options "-mbranch- > > likely"]) } { > > if { $gp_size == 32 } { > > mips_make_test_option options "-mips32r5" > > } else { > > Please can you make sure to retain the original patch formatting when > posting. I suspect you have copied this out of a putty session or similar and > have therefore lost the tabs. > > The extra change is that in the post-arch option processing we will need to > infer -mno-branch-likely for the $isa_rev > 5 case much like we infer - > mnan=2008 and -mabs=2008. This is so that when running the testsuite using > -mips32r5 or earlier, with -mbranch-likely as part of the user-supplied test > flags, then a test which is upgraded to > mips32r6 does not attempt to use -mbranch-likely. > > Hope that wasn't too cryptic! > > Thanks, > Matthew The updated patch below includes the improved ChangeLog comment, correct formatting, and the post-arch enforcing of -mno-branch-likely for R6. Regards, Toma gcc/testsuite/ChangeLog: * gcc.target/mips/mips.exp (mips-dg-options): Downgrade to R5 for -mbranch-likely and infer -mno-branch-likely for R6. diff --git a/gcc/testsuite/gcc.target/mips/mips.exp b/gcc/testsuite/gcc.target/mips/mips.exp index 7c24140..6b7c46f 100644 --- a/gcc/testsuite/gcc.target/mips/mips.exp +++ b/gcc/testsuite/gcc.target/mips/mips.exp @@ -1176,7 +1176,8 @@ proc mips-dg-options { args } { || [mips_have_test_option_p options "-mpaired-single"] || [mips_have_test_option_p options "-mnan=legacy"] || [mips_have_test_option_p options "-mabs=legacy"] - || [mips_have_test_option_p options "!HAS_LSA"]) } { + || [mips_have_test_option_p options "!HAS_LSA"]) + || [mips_have_test_option_p options "-mbranch-likely"]) } { if { $gp_size == 32 } { mips_make_test_option options "-mips32r5" } else { @@ -1345,6 +1346,7 @@ proc mips-dg-options { args } { mips_make_test_option options "-mno-paired-single" mips_make_test_option options "-mnan=2008" mips_make_test_option options "-mabs=2008" + mips_make_test_option options "-mno-branch-likely" } if { [regexp {^-march=(octeon|loongson)} $arch] } { mips_make_test_option options "-mno-micromips"
[PATCH, committed] Fix PR78210
Hi, On some targets (AArch64 in particular) test gcc.dg/tree-ssa/slsr-8.c fails because the scan of the SLSR dump is too strict. On these targets, a multiply may be a widening multiply, and that wasn't accounted for. Now it is. Committed as obvious. Thanks, Bill 2016-11-04 Bill Schmidt PR tree-optimization/78210 * gcc.dg/tree-ssa/slsr-8.c: Fix slsr scan to include the possibility of widening multiplies. Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-8.c === --- gcc/testsuite/gcc.dg/tree-ssa/slsr-8.c (revision 241844) +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-8.c (working copy) @@ -24,5 +24,6 @@ f (int s, int *c, int *d) initializer with a cast, so we'll keep it as is. */ /* There are 4 ' * ' instances in the decls (since "int * iftmp.0;" is - added), 2 parms, 3 in the code. */ -/* { dg-final { scan-tree-dump-times " \\* " 9 "optimized" } } */ + added), 2 parms, 3 in the code. The second one in the code may + be a widening multiply (for example, on AArch64). */ +/* { dg-final { scan-tree-dump-times " w?\\* " 9 "optimized" } } */
[PATCH v2] Add mcpu flag for Qualcomm falkor core
This adds an mcpu option for the upcoming Qualcomm Falkor core. This is identical to the qdf24xx part that was added earlier and hence retains the same tuning structure and continues to have the a57 pipeline for now. The part number has also been changed and this patch fixes this for both qdf24xx and falkor options. Tested with aarch64 and armhf. Siddhesh * gcc/config/aarch64/aarch64-cores.def (qdf24xx): Update part number. (falkor): New core. * gcc/config/aarch64/aarch64-tune.md: Regenerated. * gcc/config/arm/arm-cores.def (falkor): New core. * gcc/config/arm/arm-tables.opt: Regenerated. * gcc/config/arm/arm-tune.md: Regenerated. * gcc/config/arm/bpabi.h (BE8_LINK_SPEC): Add falkor support. * gcc/config/arm/t-aprofile (MULTILIB_MATCHES): Add falkor support. * gcc/doc/invoke.texi (AArch64 Options/-mtune): Add falkor. (ARM Options/-mtune): Add falkor. --- gcc/config/aarch64/aarch64-cores.def | 3 ++- gcc/config/aarch64/aarch64-tune.md | 2 +- gcc/config/arm/arm-cores.def | 1 + gcc/config/arm/arm-tables.opt| 3 +++ gcc/config/arm/arm-tune.md | 2 +- gcc/config/arm/bpabi.h | 2 ++ gcc/config/arm/t-aprofile| 1 + gcc/doc/invoke.texi | 9 + 8 files changed, 16 insertions(+), 7 deletions(-) diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def index f9b7552..4b00f3f 100644 --- a/gcc/config/aarch64/aarch64-cores.def +++ b/gcc/config/aarch64/aarch64-cores.def @@ -54,7 +54,8 @@ AARCH64_CORE("cortex-a73", cortexa73, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AA AARCH64_CORE("exynos-m1", exynosm1, exynosm1, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, exynosm1, 0x53, 0x001) /* Qualcomm ('Q') cores. */ -AARCH64_CORE("qdf24xx", qdf24xx, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, qdf24xx, 0x51, 0x800) +AARCH64_CORE("falkor", falkor,cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, qdf24xx, 0x51, 0xC00) +AARCH64_CORE("qdf24xx", qdf24xx, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, qdf24xx, 0x51, 0xC00) /* Cavium ('C') cores. */ AARCH64_CORE("thunderx",thunderx, thunderx, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, 0x43, 0x0a1) diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md index 022b131..29afcdf 100644 --- a/gcc/config/aarch64/aarch64-tune.md +++ b/gcc/config/aarch64/aarch64-tune.md @@ -1,5 +1,5 @@ ;; -*- buffer-read-only: t -*- ;; Generated automatically by gentune.sh from aarch64-cores.def (define_attr "tune" - "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,exynosm1,qdf24xx,thunderx,xgene1,vulcan,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53" + "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,exynosm1,falkor,qdf24xx,thunderx,xgene1,vulcan,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53" (const (symbol_ref "((enum attr_tune) aarch64_tune)"))) diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def index 2072e1e..1bfec9c 100644 --- a/gcc/config/arm/arm-cores.def +++ b/gcc/config/arm/arm-cores.def @@ -173,6 +173,7 @@ ARM_CORE("cortex-a57", cortexa57, cortexa57, 8A, ARM_FSET_MAKE_CPU1 (FL_LDSCHED ARM_CORE("cortex-a72", cortexa72, cortexa57, 8A, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a57) ARM_CORE("cortex-a73", cortexa73, cortexa57, 8A, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a73) ARM_CORE("exynos-m1", exynosm1, exynosm1,8A, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), exynosm1) +ARM_CORE("falkor", falkor,cortexa57, 8A, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), qdf24xx) ARM_CORE("qdf24xx",qdf24xx, cortexa57, 8A, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), qdf24xx) ARM_CORE("xgene1", xgene1,xgene1, 8A,ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH8A),xgene1) diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt index ee9e3bb..7b15c8c 100644 --- a/gcc/config/arm/arm-tables.opt +++ b/gcc/config/arm/arm-tables.opt @@ -328,6 +328,9 @@ EnumValue Enum(processor_type) String(exynos-m1) Value(exynosm1) EnumValue +Enum(processor_type) String(falkor) Value(falkor) + +EnumValue Enum(processor_type) String(qdf24xx) Value(qdf24xx) EnumValue diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md index 594ce9d..867da26 100644 --- a/gcc/config/arm/arm-tune.md +++ b/gcc/config/arm/arm-tune.md @@ -34,7 +34,7 @@ cortexm3,marvell_pj4,cortexa15cortexa7, cortexa17cortexa7,cortexa32,cortexa35, cortexa53,cortexa57,cortexa72, - cortexa7
ubsan PATCH to fix compile-time hog with operator overloading (PR sanitizer/78208)
This is a similar case to PR sanitizer/70342. Here, we were generating expression in a quadratic fashion because of the initializer--we create SAVE_EXPR <>, then UBSAN_NULL >, and then COMPOUND_EXPR of these two and so on. On this testcase we were instrumention CALL_EXPR that is in fact operator<<. I think those always return a reference, so it cannot be NULL, so there's no point in instrumenting those? Bootstrapped/regtested on x86_64-linux, ok for trunk? 2016-11-04 Marek Polacek PR sanitizer/78208 * cp-gimplify.c (cp_genericize_r): Don't instrument CALL_EXPR_OPERATOR_SYNTAX. * g++.dg/ubsan/null-8.C: New. diff --git gcc/cp/cp-gimplify.c gcc/cp/cp-gimplify.c index 9b9b511..f39e9d5 100644 --- gcc/cp/cp-gimplify.c +++ gcc/cp/cp-gimplify.c @@ -1495,7 +1495,8 @@ cp_genericize_r (tree *stmt_p, int *walk_subtrees, void *data) = TREE_CODE (fn) == ADDR_EXPR && TREE_CODE (TREE_OPERAND (fn, 0)) == FUNCTION_DECL && DECL_CONSTRUCTOR_P (TREE_OPERAND (fn, 0)); - if (flag_sanitize & (SANITIZE_NULL | SANITIZE_ALIGNMENT)) + if (flag_sanitize & (SANITIZE_NULL | SANITIZE_ALIGNMENT) + && !CALL_EXPR_OPERATOR_SYNTAX (stmt)) ubsan_maybe_instrument_member_call (stmt, is_ctor); if ((flag_sanitize & SANITIZE_VPTR) && !is_ctor) cp_ubsan_maybe_instrument_member_call (stmt); diff --git gcc/testsuite/g++.dg/ubsan/null-8.C gcc/testsuite/g++.dg/ubsan/null-8.C index e69de29..0600b93 100644 --- gcc/testsuite/g++.dg/ubsan/null-8.C +++ gcc/testsuite/g++.dg/ubsan/null-8.C @@ -0,0 +1,22 @@ +// PR sanitizer/78208 +// { dg-do compile } +// { dg-options "-fsanitize=null" } + +class S +{ + virtual void foo () = 0; +}; + +struct T { + T &operator << (const char *s); +}; + +T t; + +void +S::foo () +{ + t << "a" << "b" << "c" << "d" << "e" << "f" << "g" << "h" << "i" +<< "j" << "k" << "l" << "m" << "n" << "o" << "p" << "q" << "r" +<< "s" << "t" << "u" << "v" << "w" << "z"; +} Marek
Re: ubsan PATCH to fix compile-time hog with operator overloading (PR sanitizer/78208)
On Fri, Nov 04, 2016 at 05:05:51PM +0100, Marek Polacek wrote: > This is a similar case to PR sanitizer/70342. Here, we were generating > expression > in a quadratic fashion because of the initializer--we create SAVE_EXPR <>, > then > UBSAN_NULL >, and then COMPOUND_EXPR of these two and so on. > > On this testcase we were instrumention CALL_EXPR that is in fact operator<<. > I > think those always return a reference, so it cannot be NULL, so there's no > point in instrumenting those? How do you know what is the return type of a user defined overloaded operator? Even when it returns a reference, I thought the point of -fsanitize=null was for all references to verify their addresses are non-null. I must say I don't understand the issue, if it is the same SAVE_EXPR in both lhs of COMPOUND_EXPR and UBSAN_NULL argument, shouldn't cp_genericize_r use of hash table to avoid walking the same tree multiple times avoid the exponential compile time/memory? > Bootstrapped/regtested on x86_64-linux, ok for trunk? > > 2016-11-04 Marek Polacek > > PR sanitizer/78208 > * cp-gimplify.c (cp_genericize_r): Don't instrument > CALL_EXPR_OPERATOR_SYNTAX. > > * g++.dg/ubsan/null-8.C: New. > > diff --git gcc/cp/cp-gimplify.c gcc/cp/cp-gimplify.c > index 9b9b511..f39e9d5 100644 > --- gcc/cp/cp-gimplify.c > +++ gcc/cp/cp-gimplify.c > @@ -1495,7 +1495,8 @@ cp_genericize_r (tree *stmt_p, int *walk_subtrees, void > *data) > = TREE_CODE (fn) == ADDR_EXPR > && TREE_CODE (TREE_OPERAND (fn, 0)) == FUNCTION_DECL > && DECL_CONSTRUCTOR_P (TREE_OPERAND (fn, 0)); > - if (flag_sanitize & (SANITIZE_NULL | SANITIZE_ALIGNMENT)) > + if (flag_sanitize & (SANITIZE_NULL | SANITIZE_ALIGNMENT) > + && !CALL_EXPR_OPERATOR_SYNTAX (stmt)) > ubsan_maybe_instrument_member_call (stmt, is_ctor); > if ((flag_sanitize & SANITIZE_VPTR) && !is_ctor) > cp_ubsan_maybe_instrument_member_call (stmt); Jakub
RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch-likely instructions.
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Toma Tabacu > Sent: 04 November 2016 15:25 > To: Matthew Fortune; gcc-patches@gcc.gnu.org > Cc: catherine_mo...@mentor.com > Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch- > likely instructions. > > > From: Matthew Fortune > > Sent: 03 November 2016 13:07 > > To: Toma Tabacu; gcc-patches@gcc.gnu.org > > Cc: catherine_mo...@mentor.com > > Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need > > branch-likely instructions. > > > > Toma Tabacu writes: > > > The gcc.target/mips/wrap-delay.c test was failing on mips-img-* > > > toolchains because it was using -mbranch-likely with an R6 target, and > > > branch- likely instructions were removed in R6. > > > > > > This patch makes the testsuite downgrade to R5 if the -mbranch-likely > > > option is present and we're targeting R6. > > > > > > Tested with mips-img-elf and mips-img-linux-gnu. > > > > Hi Toma, > > > > Welcome to GCC development, thanks for your first patch. As you can see > > from Catherine's reply the change looks good. I'll just cover some > > housekeeping issues... > > > > > gcc/testsuite/ > > > * gcc.target/mips/mips.exp: Add check for -mbranch-likely in > > > condition for R5 downgrade. > > > > Changelogs are an art form which will take some getting used to. This is > > almost there but needs to reference the function affected. > > > > * gcc.target/mips/mips.exp (mips-dg-options): Downgrade to R5 > > for -mbranch-likely and infer -mno-branch-likely for R6. > > > > I have extended the comment as well as there is an additional change > > needed for this patch ideally. > > > > > diff --git a/gcc/testsuite/gcc.target/mips/mips.exp > > > b/gcc/testsuite/gcc.target/mips/mips.exp > > > index 7c24140..382d69c 100644 > > > --- a/gcc/testsuite/gcc.target/mips/mips.exp > > > +++ b/gcc/testsuite/gcc.target/mips/mips.exp > > > @@ -1176,7 +1176,8 @@ proc mips-dg-options { args } { > > >|| [mips_have_test_option_p options "-mpaired- > > > single"] > > >|| [mips_have_test_option_p options "- > > > mnan=legacy"] > > >|| [mips_have_test_option_p options "- > > > mabs=legacy"] > > > - || [mips_have_test_option_p options "!HAS_LSA"]) > > > } { > > > + || [mips_have_test_option_p options "!HAS_LSA"] > > > + || [mips_have_test_option_p options "-mbranch- > > > likely"]) } { > > > if { $gp_size == 32 } { > > > mips_make_test_option options "-mips32r5" > > > } else { > > > > Please can you make sure to retain the original patch formatting when > > posting. I suspect you have copied this out of a putty session or similar > > and > > have therefore lost the tabs. > > > > The extra change is that in the post-arch option processing we will need to > > infer -mno-branch-likely for the $isa_rev > 5 case much like we infer - > > mnan=2008 and -mabs=2008. This is so that when running the testsuite using > > -mips32r5 or earlier, with -mbranch-likely as part of the user-supplied test > > flags, then a test which is upgraded to > > mips32r6 does not attempt to use -mbranch-likely. > > > > Hope that wasn't too cryptic! > > > > Thanks, > > Matthew > > The updated patch below includes the improved ChangeLog comment, correct > formatting, and the post-arch enforcing of -mno-branch-likely for R6. > > Regards, > Toma > > gcc/testsuite/ChangeLog: > > * gcc.target/mips/mips.exp (mips-dg-options): Downgrade to R5 > for -mbranch-likely and infer -mno-branch-likely for R6. > > diff --git a/gcc/testsuite/gcc.target/mips/mips.exp > b/gcc/testsuite/gcc.target/mips/mips.exp > index 7c24140..6b7c46f 100644 > --- a/gcc/testsuite/gcc.target/mips/mips.exp > +++ b/gcc/testsuite/gcc.target/mips/mips.exp > @@ -1176,7 +1176,8 @@ proc mips-dg-options { args } { > || [mips_have_test_option_p options "-mpaired-single"] > || [mips_have_test_option_p options "-mnan=legacy"] > || [mips_have_test_option_p options "-mabs=legacy"] > -|| [mips_have_test_option_p options "!HAS_LSA"]) } { > +|| [mips_have_test_option_p options "!HAS_LSA"]) > +|| [mips_have_test_option_p options "-mbranch-likely"]) > } { > if { $gp_size == 32 } { > mips_make_test_option options "-mips32r5" > } else { > @@ -1345,6 +1346,7 @@ proc mips-dg-options { args } { > mips_make_test_option options "-mno-paired-single" > mips_make_test_option options "-mnan=2008" > mips_make_test_option options "-mabs=2008" > + mips_make_test_option options "-mno-branch-likely" > } > if { [regexp {^-march=(octeon|loongson)} $arch] } { > mips_make_test_option options "-mno-micromips" An e
[PATCH] Add target hook to compute register pressure classes
While working to get -fsched-pressure profitable on PowerPC, one of the things I noticed causing problems was the selection of pressure classes. Understanding that getting the computation of pressure classes correct for all targets in the machine independent code is going to be extremely hard (or impossible), I propose the following target hook which lets the target decide the set of pressure classes to use. I will follow this up with an implementation of the hook for the rs6000 backend, but am currently benchmarking variations of it to see which is best. Bootstrap/regtest on powerpc64le with no new regressions. Ok for trunk? Note that the ira.c changes are not as complicated as what it looks, I only added a few lines to invoke the target hook, the rest is just indentation changes to move the existing code into the else leg. -Pat 2016-11-04 Pat Haugen * target.def (compute_pressure_classes): New target hook. * doc/tm.texi.in: Document it. * doc/tm.texi: Regenerate. * ira.c (setup_pressure_classes): Call target hook if defined. Index: gcc/target.def === --- gcc/target.def (revision 241821) +++ gcc/target.def (working copy) @@ -5039,6 +5039,16 @@ DEFHOOK machine_mode, (enum insn_code icode), default_cstore_mode) +/* This target hook allows the backend to compute the register pressure + classes to use. */ +DEFHOOK +(compute_pressure_classes, + "A target hook which lets a backend compute the set of pressure classes to\ + be used by those optimization passes which take register pressure into\ + account, as opposed to letting IRA compute them. It returns the number of\ + register classes stored in the array @var{pressure_classes}.", + int, (enum reg_class *pressure_classes), NULL) + /* True if a structure, union or array with MODE containing FIELD should be accessed using BLKmode. */ DEFHOOK Index: gcc/doc/tm.texi.in === --- gcc/doc/tm.texi.in (revision 241821) +++ gcc/doc/tm.texi.in (working copy) @@ -2509,6 +2509,8 @@ value that the middle-end intended. @hook TARGET_CSTORE_MODE +@hook TARGET_COMPUTE_PRESSURE_CLASSES + @node Stack and Calling @section Stack Layout and Calling Conventions @cindex calling conventions Index: gcc/doc/tm.texi === --- gcc/doc/tm.texi (revision 241821) +++ gcc/doc/tm.texi (working copy) @@ -2903,6 +2903,10 @@ This hook defines a class of registers w This hook defines the machine mode to use for the boolean result of conditional store patterns. The ICODE argument is the instruction code for the cstore being performed. Not definiting this hook is the same as accepting the mode encoded into operand 0 of the cstore expander patterns. @end deftypefn +@deftypefn {Target Hook} int TARGET_COMPUTE_PRESSURE_CLASSES (enum reg_class *@var{pressure_classes}) +A target hook which lets a backend compute the set of pressure classes to be used by those optimization passes which take register pressure into account, as opposed to letting IRA compute them. It returns the number of register classes stored in the array @var{pressure_classes}. +@end deftypefn + @node Stack and Calling @section Stack Layout and Calling Conventions @cindex calling conventions Index: gcc/ira.c === --- gcc/ira.c (revision 241821) +++ gcc/ira.c (working copy) @@ -792,78 +792,85 @@ setup_pressure_classes (void) HARD_REG_SET temp_hard_regset2; bool insert_p; - n = 0; - for (cl = 0; cl < N_REG_CLASSES; cl++) -{ - if (ira_class_hard_regs_num[cl] == 0) - continue; - if (ira_class_hard_regs_num[cl] != 1 - /* A register class without subclasses may contain a few - hard registers and movement between them is costly - (e.g. SPARC FPCC registers). We still should consider it - as a candidate for a pressure class. */ - && alloc_reg_class_subclasses[cl][0] < cl) + if (targetm.compute_pressure_classes) +n = targetm.compute_pressure_classes (pressure_classes); + else +{ + n = 0; + for (cl = 0; cl < N_REG_CLASSES; cl++) { - /* Check that the moves between any hard registers of the - current class are not more expensive for a legal mode - than load/store of the hard registers of the current - class. Such class is a potential candidate to be a - register pressure class. */ - for (m = 0; m < NUM_MACHINE_MODES; m++) + if (ira_class_hard_regs_num[cl] == 0) + continue; + if (ira_class_hard_regs_num[cl] != 1 + /* A register class without subclasses may contain a few + hard registers and movement between them is costly + (e.g. SPARC FPCC registers). We still should consider it + as a candidate for a pressure class. */ + && alloc_reg_class_subclasses[cl][0] < cl) { - COPY_HAR
Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)
On 11/04/2016 01:26 PM, Jakub Jelinek wrote: In any case, the second hunk fixes an important DSE bug that just got revealed by the former change. Can you elaborate on that (what situation does it occur in, is it covered by the testsuite)? Bernd
Re: [PATCH 1/2, i386] cmpstrnsi needs string length
ChangeLog for this patch: 2016-11-03 Aaron Sawdey * config/i386/i386.md (cmpstrnsi): New test to bail out if neither string input is a string constant. -- Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com 050-2/C113 (507) 253-7520 home: 507/263-0782 IBM Linux Technology Center - PPC Toolchain
Re: [PATCH 2/2, expand] make expand_builtin_strncmp more general
ChangeLog for this patch: 2016-11-03 Aaron Sawdey * builtins.c (expand_builtin_strncmp): Attempt expansion of strncmp via cmpstrnsi even if neither string is constant. -- Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com 050-2/C113 (507) 253-7520 home: 507/263-0782 IBM Linux Technology Center - PPC Toolchain
Re: [patch, libgfortran] Bug 78123 - Short reads with T edit descriptor not padding correctly
On 11/04/2016 07:11 AM, Christophe Lyon wrote: On 4 November 2016 at 14:48, Christophe Lyon wrote: On 30 October 2016 at 21:40, Thomas Koenig wrote: Hi Jerry, --- snip --- The new test fails at execution on arm and aarch64: gfortran.dg/fmt_t_9.f -O0 execution test gfortran.dg/fmt_t_9.f -O1 execution test gfortran.dg/fmt_t_9.f -O2 execution test gfortran.dg/fmt_t_9.f -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test gfortran.dg/fmt_t_9.f -O3 -g execution test gfortran.dg/fmt_t_9.f -Os execution test seen on gcc-5 and gcc-6 branches. Maybe I missed it on trunk due to larger granularity in validations. Hi Christophe, The patch is very non target specific. All I can think at the moment is its related to buffering and flushing details, or an error in the test case. When I get a chance, I would like to send you a test program to maybe help debug this. Or, is there a machine available I cant ssh into and work on it there? Jerry
[PATCH] Fix nonoverlapping_memrefs_p ICE (PR target/77834, take 2)
Hi! On Wed, Nov 02, 2016 at 10:46:40AM +0100, Richard Biener wrote: > Yeah, plus if a followup test would have disambiguated things (the > dispatch to the tree oracle for example). After discussing this on IRC that the dse.c and sched-deps.c (call (mem ...) ) changes are probably not safe, I'm proposing following patch which should be safe to backport to release branches too, the only occurrences of !DECL_RTL_SET_P where DECL_RTL worked fine in my bootstraps/regtests were FUNCTION_DECLs, so the patch should only turn ICEs into returning the safe return value that the expressions might overlap. If/once this is in, I'm planning to test/submit a patch adding /* If one decl is known to be a function or label in a function and the other is some kind of data, they can't overlap. */ if ((TREE_CODE (exprx) == FUNCTION_DECL || TREE_CODE (exprx) == LABEL_DECL) != (TREE_CODE (expry) == FUNCTION_DECL || TREE_CODE (expry) == LABEL_DECL)) return 1; before that. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-11-04 Jakub Jelinek PR target/77834 * alias.c (nonoverlapping_memrefs_p): Return 0 if exprx or expry doesn't have rtl set. * gcc.dg/pr77834.c: New test. --- gcc/alias.c.jj 2016-10-21 17:06:27.0 +0200 +++ gcc/alias.c 2016-10-31 11:38:29.448031590 +0100 @@ -2755,6 +2755,13 @@ nonoverlapping_memrefs_p (const_rtx x, c || TREE_CODE (expry) == CONST_DECL) return 1; + /* If either of the decls doesn't have DECL_RTL set (e.g. marked as + living in multiple places), we can't tell anything. Exception + are FUNCTION_DECLs for which we can create DECL_RTL on demand. */ + if ((!DECL_RTL_SET_P (exprx) && TREE_CODE (exprx) != FUNCTION_DECL) + || (!DECL_RTL_SET_P (expry) && TREE_CODE (expry) != FUNCTION_DECL)) +return 0; + rtlx = DECL_RTL (exprx); rtly = DECL_RTL (expry); --- gcc/testsuite/gcc.dg/pr77834.c.jj 2016-10-31 11:41:46.290521464 +0100 +++ gcc/testsuite/gcc.dg/pr77834.c 2016-10-31 11:41:24.0 +0100 @@ -0,0 +1,18 @@ +/* PR target/77834 */ +/* { dg-do compile } */ +/* { dg-options "-O -ftree-pre -Wno-psabi" } */ +/* { dg-additional-options "-mstringop-strategy=libcall" { target i?86-*-* x86_64-*-* } } */ + +typedef int V __attribute__ ((vector_size (64))); + +V +foo (V u, V v, int w) +{ + do +{ + if (u[0]) v ^= u[w]; +} + while ((V) { 0, u[w] }[1]); + u = (V) { v[v[0]], u[u[0]] }; + return v + u; +} Jakub
Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)
On Fri, Nov 04, 2016 at 05:18:13PM +0100, Bernd Schmidt wrote: > On 11/04/2016 01:26 PM, Jakub Jelinek wrote: > > > >In any case, the second hunk fixes an important DSE bug that just got > >revealed by the former change. > > Can you elaborate on that (what situation does it occur in, is it covered by > the testsuite)? As discussed earlier, I'm proposing to leave the first hunk out and have bootstrapped/regtested the following on x86_64-linux and i686-linux, ok for trunk? I don't have a testcase except for FAIL: gcc.dg/torture/pr67690.c -Os execution test that fails on i686-linux (or x86_64-linux with -m32) with the other hunk, the details are mentioned in PR77834 comments. We don't remove the store of an argument to sibling constant call during the local DSE phase, and the const sibling call is marked insn_info->frame_read, because of the need of the arguments. During the propagation it is handled right, but then when doing the last step dse_step5 wouldn't call scan_reads on the sibcall, because it didn't have non-NULL insn_info->rec nor insn_info->non_frame_wild_read (the latter is used just for non-const calls that may read anything). scan_reads does pretty much: read_info_t read_info = insn_info->read_rec; if (insn_info->non_frame_wild_read) do_something; if (insn_info->frame_read) do_something2; while (read_info) do_something3; so not calling it for the insn_info->frame_read case is clearly wrong. 2016-11-04 Jakub Jelinek PR target/77834 * dse.c (dse_step5): Call scan_reads even if just insn_info->frame_read. Improve and fix dump file messages. --- gcc/dse.c.jj2016-11-03 15:52:12.521965058 +0100 +++ gcc/dse.c 2016-11-04 09:42:27.640098125 +0100 @@ -3298,12 +3298,19 @@ dse_step5 (void) bitmap_clear (v); } else if (insn_info->read_rec - || insn_info->non_frame_wild_read) + || insn_info->non_frame_wild_read + || insn_info->frame_read) { - if (dump_file && !insn_info->non_frame_wild_read) - fprintf (dump_file, "regular read\n"); - else if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "non-frame wild read\n"); + if (dump_file && (dump_flags & TDF_DETAILS)) + { + if (!insn_info->non_frame_wild_read + && !insn_info->frame_read) + fprintf (dump_file, "regular read\n"); + if (insn_info->non_frame_wild_read) + fprintf (dump_file, "non-frame wild read\n"); + if (insn_info->frame_read) + fprintf (dump_file, "frame read\n"); + } scan_reads (insn_info, v, NULL); } } Jakub
Re: [patch, libgfortran] Bug 78123 - Short reads with T edit descriptor not padding correctly
On 4 November 2016 at 17:28, Jerry DeLisle wrote: > On 11/04/2016 07:11 AM, Christophe Lyon wrote: >> >> On 4 November 2016 at 14:48, Christophe Lyon >> wrote: >>> >>> On 30 October 2016 at 21:40, Thomas Koenig wrote: Hi Jerry, > --- snip --- >>> >>> >>> The new test fails at execution on arm and aarch64: >>> >>> gfortran.dg/fmt_t_9.f -O0 execution test >>> gfortran.dg/fmt_t_9.f -O1 execution test >>> gfortran.dg/fmt_t_9.f -O2 execution test >>> gfortran.dg/fmt_t_9.f -O3 -fomit-frame-pointer -funroll-loops >>> -fpeel-loops -ftracer -finline-functions execution test >>> gfortran.dg/fmt_t_9.f -O3 -g execution test >>> gfortran.dg/fmt_t_9.f -Os execution test >>> >>> seen on gcc-5 and gcc-6 branches. Maybe I missed it on trunk due to >>> larger granularity in validations. > > > Hi Christophe, > > The patch is very non target specific. All I can think at the moment is its > related to buffering and flushing details, or an error in the test case. > > When I get a chance, I would like to send you a test program to maybe help > debug this. Or, is there a machine available I cant ssh into and work on it > there? > I am building cross-compilers, testing with qemu. However, there are aarch64 machines available in the GCC farm, maybe you have access to them? Thanks, Christophe > Jerry >
RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch-likely instructions.
Toma Tabacu writes: > > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > > ow...@gcc.gnu.org] On Behalf Of Toma Tabacu > > Sent: 04 November 2016 15:25 > > To: Matthew Fortune; gcc-patches@gcc.gnu.org > > Cc: catherine_mo...@mentor.com > > Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need > > branch- likely instructions. > > > > > From: Matthew Fortune > > > Sent: 03 November 2016 13:07 > > > To: Toma Tabacu; gcc-patches@gcc.gnu.org > > > Cc: catherine_mo...@mentor.com > > > Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests > > > need branch-likely instructions. > > > > > > Toma Tabacu writes: > > > > The gcc.target/mips/wrap-delay.c test was failing on mips-img-* > > > > toolchains because it was using -mbranch-likely with an R6 target, > > > > and > > > > branch- likely instructions were removed in R6. > > > > > > > > This patch makes the testsuite downgrade to R5 if the > > > > -mbranch-likely option is present and we're targeting R6. > > > > > > > > Tested with mips-img-elf and mips-img-linux-gnu. > > > > > > Hi Toma, > > > > > > Welcome to GCC development, thanks for your first patch. As you can > > > see from Catherine's reply the change looks good. I'll just cover > > > some housekeeping issues... > > > > > > > gcc/testsuite/ > > > > * gcc.target/mips/mips.exp: Add check for -mbranch-likely > in > > > > condition for R5 downgrade. > > > > > > Changelogs are an art form which will take some getting used to. > > > This is almost there but needs to reference the function affected. > > > > > > * gcc.target/mips/mips.exp (mips-dg-options): Downgrade to R5 > > > for -mbranch-likely and infer -mno-branch-likely for R6. > > > > > > I have extended the comment as well as there is an additional change > > > needed for this patch ideally. > > > > > > > diff --git a/gcc/testsuite/gcc.target/mips/mips.exp > > > > b/gcc/testsuite/gcc.target/mips/mips.exp > > > > index 7c24140..382d69c 100644 > > > > --- a/gcc/testsuite/gcc.target/mips/mips.exp > > > > +++ b/gcc/testsuite/gcc.target/mips/mips.exp > > > > @@ -1176,7 +1176,8 @@ proc mips-dg-options { args } { > > > >|| [mips_have_test_option_p options > > > > "-mpaired- single"] > > > >|| [mips_have_test_option_p options "- > > > > mnan=legacy"] > > > >|| [mips_have_test_option_p options "- > > > > mabs=legacy"] > > > > - || [mips_have_test_option_p options > "!HAS_LSA"]) > > > > } { > > > > + || [mips_have_test_option_p options > "!HAS_LSA"] > > > > + || [mips_have_test_option_p options > > > > + "-mbranch- > > > > likely"]) } { > > > > if { $gp_size == 32 } { > > > > mips_make_test_option options "-mips32r5" > > > > } else { > > > > > > Please can you make sure to retain the original patch formatting > > > when posting. I suspect you have copied this out of a putty session > > > or similar and have therefore lost the tabs. > > > > > > The extra change is that in the post-arch option processing we will > > > need to infer -mno-branch-likely for the $isa_rev > 5 case much like > > > we infer - > > > mnan=2008 and -mabs=2008. This is so that when running the testsuite > > > using > > > -mips32r5 or earlier, with -mbranch-likely as part of the > > > user-supplied test flags, then a test which is upgraded to > > > mips32r6 does not attempt to use -mbranch-likely. > > > > > > Hope that wasn't too cryptic! > > > > > > Thanks, > > > Matthew > > > > The updated patch below includes the improved ChangeLog comment, > > correct formatting, and the post-arch enforcing of -mno-branch-likely > for R6. Thanks, committed as r241850. Matthew
[PATCH] Remove a FIXME from verify_type_variant
Hi, now that we removed java, let us check whether the FIXME this removes (and that was added by Honza) is actually true. It only affects checking runs and if it turns out to be false, we can easily revert this. Bootstrapped, LTO-bootstrapped and tested on x86_64-linux and aarch64-linux, OK for trunk? Thanks, Martin 2016-11-03 Martin Jambor * tree.c (verify_type_variant): Use pointer comparison to check that TYPE_SIZE_UNIT match. --- gcc/tree.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/gcc/tree.c b/gcc/tree.c index 56cc653..10fc70d 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -13272,12 +13272,10 @@ verify_type_variant (const_tree t, tree tv) verify_variant_match (TYPE_SIZE); if (TREE_CODE (TYPE_SIZE_UNIT (t)) != PLACEHOLDER_EXPR && TREE_CODE (TYPE_SIZE_UNIT (tv)) != PLACEHOLDER_EXPR - && TYPE_SIZE_UNIT (t) != TYPE_SIZE_UNIT (tv) - /* FIXME: ideally we should compare pointer equality, but java FE -produce variants where size is INTEGER_CST of different type (int -wrt size_type) during libjava biuld. */ - && !operand_equal_p (TYPE_SIZE_UNIT (t), TYPE_SIZE_UNIT (tv), 0)) + && TYPE_SIZE_UNIT (t) != TYPE_SIZE_UNIT (tv)) { + gcc_assert (!operand_equal_p (TYPE_SIZE_UNIT (t), + TYPE_SIZE_UNIT (tv), 0)); error ("type variant has different TYPE_SIZE_UNIT"); debug_tree (tv); error ("type variant's TYPE_SIZE_UNIT"); -- 2.10.1
[PATCH, GCC, wwwdocs] Document new Cortex-M23 and Cortex-M33 processors support in ARM backend
Hi there, This patch document the newly added support in GCC 7 for Cortex-M23 and Cortex-M33 processors [1][2]. [1] http://www.arm.com/products/processors/cortex-m/cortex-m23-processor.php [2] http://www.arm.com/products/processors/cortex-m/cortex-m33-processor.php Is this ok for ? Best regards, Thomas Index: htdocs/gcc-7/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v retrieving revision 1.20 diff -u -r1.20 changes.html --- htdocs/gcc-7/changes.html 25 Oct 2016 02:04:00 - 1.20 +++ htdocs/gcc-7/changes.html 26 Oct 2016 14:05:59 - @@ -316,6 +316,16 @@ -mcpu=cortex-a73 and -mtune=cortex-a73 options as well as the equivalent target attributes and pragmas. + + The ARM Cortex-M23 processor is now supported via the + -mcpu=cortex-m23 and -mtune=cortex-m23 + options. + + + The ARM Cortex-M33 processor is now supported via the + -mcpu=cortex-m33 and -mtune=cortex-m33 + options. + AVR
Re: [PATCH, gcc/ARM, ping] Add support for Cortex-M33
On 02/11/16 10:14, Kyrill Tkachov wrote: On 02/11/16 10:07, Thomas Preudhomme wrote: Ping? Best regards, Thomas On 26/10/16 17:42, Thomas Preudhomme wrote: Hi, This patch adds support for the Cortex-M33 processor launched by ARM [1]. The patch adds support for the name and wires it up to the ARMv8-M Mainline with DSP extensions architecture and arm_v7m_tune tuning parameters for the time being. It also updates documentation to mention this new processor. [1] http://www.arm.com/products/processors/cortex-m/cortex-m33-processor.php ChangeLog entry is as follows: *** gcc/Changelog *** 2016-10-26 Thomas Preud'homme * config/arm/arm-arches.def (armv8-m.main+dsp): Set Cortex-M33 as representative core for this architecture. * config/arm/arm-cores.def (cortex-m33): Define new processor. * config/arm/arm-tables.opt: Regenerate. * config/arm/arm-tune.md: Likewise. * config/arm/bpabi.h (BE8_LINK_SPEC): Add Cortex-M33 to the list of valid -mcpu options. * doc/invoke.texi (ARM Options): Document new Cortex-M33 processor. Tested by building libgcc and libstdc++ for Cortex-M33 and running a hello world compiled for it. Is this ok for master? Ok. Thanks, Kyrill Committed, thanks. Best regards, Thomas
Re: [PATCH, gcc/ARM, ping] Add support for Cortex-M23
On 02/11/16 10:13, Kyrill Tkachov wrote: On 02/11/16 10:07, Thomas Preudhomme wrote: Ping? Best regards, Thomas On 26/10/16 17:42, Thomas Preudhomme wrote: Hi, This patch adds support for the Cortex-M23 processor launched by ARM [1]. The patch adds support for the name and wires it up to the ARMv8-M Baseline architecture and arm_v6m_tune tuning parameters for the time being. It also updates documentation to mention this new processor. [1] http://www.arm.com/products/processors/cortex-m/cortex-m23-processor.php ChangeLog entry is as follows: *** gcc/Changelog *** 2016-10-26 Thomas Preud'homme * config/arm/arm-arches.def (armv8-m.base): Set Cortex-M23 as representative core for this architecture. * config/arm/arm-cores.def (cortex-m23): Define new processor. * config/arm/arm-tables.opt: Regenerate. * config/arm/arm-tune.md: Likewise. * config/arm/arm.c (arm_v6m_tune): Add Cortex-M23 to the list of cores this tuning parameters apply to in the comment. * config/arm/bpabi.h (BE8_LINK_SPEC): Add Cortex-M23 to the list of valid -mcpu options. * doc/invoke.texi (ARM Options): Document new Cortex-M23 processor. Tested by building libgcc and libstdc++ for Cortex-M23 and running a hello world compiled for it. Is this ok for master? Ok. Thanks, Kyrill Committed, thanks. Best regards, Thomas
[PATCH] rtx_writer: avoid printing trailing nils
On Fri, 2016-10-21 at 12:04 +0200, Bernd Schmidt wrote: > On 10/21/2016 02:36 AM, David Malcolm wrote: > > + ASSERT_RTL_DUMP_EQ ("(cjump_insn (set (pc)\n" > > + "(label_ref 0))\n" > > + " (nil))\n", > > + jump_insn); > > } > > I do wonder about the (nil)s and whether we can eliminate them. The following patch does so, by adding a flag param to rtx_writer::print_rtx allowing omitting the (nil) when in compact mode. Successfully bootstrapped®rtested on x86_64-pc-linux-gnu. OK for trunk? gcc/ChangeLog: * print-rtl.c (rtx_writer::print_rtx_operand_code_0): Don't print trailing "(nil)" for NOTE_VAR_LOCATION and for ENTRY_VALUE_EXP. (rtx_writer::print_rtx_operand_code_e): Likewise for CALL_INSN_FUNCTION_USAGE and for REG_NOTES. (rtx_writer::print_rtx_operand_codes_E_and_V): Always print nils within a vec. (rtx_writer::print_rtx): Add param "print_nil" to support not printing some trailing "(nil)" values when in compact mode. Don't print them for PAT_VAR_LOCATION_LOC. (print_inline_rtx): Update for new param. (debug_rtx): Likewise. (rtx_writer::print_rtl): Likewise. (rtx_writer::print_rtl_single_with_indent): Likewise. * print-rtl.h (rtx_writer::print_rtx): Add "print_nil" param. * rtl-tests.c (selftests::test_uncond_jump): Remove trailing "(nil)" from expected output. --- gcc/print-rtl.c | 44 +--- gcc/print-rtl.h | 2 +- gcc/rtl-tests.c | 2 +- 3 files changed, 31 insertions(+), 17 deletions(-) diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c index 341ecdf..643eb57 100644 --- a/gcc/print-rtl.c +++ b/gcc/print-rtl.c @@ -158,7 +158,7 @@ rtx_writer::print_rtx_operand_code_0 (const_rtx in_rtx ATTRIBUTE_UNUSED, case NOTE_INSN_VAR_LOCATION: case NOTE_INSN_CALL_ARG_LOCATION: fputc (' ', m_outfile); - print_rtx (NOTE_VAR_LOCATION (in_rtx)); + print_rtx (NOTE_VAR_LOCATION (in_rtx), false); break; case NOTE_INSN_CFI: @@ -201,7 +201,7 @@ rtx_writer::print_rtx_operand_code_0 (const_rtx in_rtx ATTRIBUTE_UNUSED, m_indent += 2; if (!m_sawclose) fprintf (m_outfile, " "); - print_rtx (ENTRY_VALUE_EXP (in_rtx)); + print_rtx (ENTRY_VALUE_EXP (in_rtx), false); m_indent -= 2; } #endif @@ -224,11 +224,17 @@ rtx_writer::print_rtx_operand_code_e (const_rtx in_rtx, int idx) if (idx == 7 && CALL_P (in_rtx)) { m_in_call_function_usage = true; - print_rtx (XEXP (in_rtx, idx)); + print_rtx (XEXP (in_rtx, idx), false); m_in_call_function_usage = false; } else -print_rtx (XEXP (in_rtx, idx)); +{ + bool print_nil = true; + /* Avoid printing trailing "(nil)" for REG_NOTES. */ + if (INSN_CHAIN_CODE_P (GET_CODE (in_rtx)) && idx == 6) + print_nil = false; + print_rtx (XEXP (in_rtx, idx), print_nil); +} m_indent -= 2; } @@ -252,7 +258,7 @@ rtx_writer::print_rtx_operand_codes_E_and_V (const_rtx in_rtx, int idx) m_sawclose = 1; for (int j = 0; j < XVECLEN (in_rtx, idx); j++) - print_rtx (XVECEXP (in_rtx, idx, j)); + print_rtx (XVECEXP (in_rtx, idx, j), true); m_indent -= 2; } @@ -564,10 +570,15 @@ rtx_writer::print_rtx_operand (const_rtx in_rtx, int idx) } } -/* Print IN_RTX onto m_outfile. This is the recursive part of printing. */ +/* Print IN_RTX onto m_outfile. This is the recursive part of printing. + PRINT_NIL controls the printing of "(nil)" in compact mode. + In compact mode, if IN_RTX is NULL, then "(nil)" is printed if PRINT_NIL + is true, and nothing is printed if PRINT_NIL is false. + In non-compact mode, "(nil)" is always printed for NULL, and PRINT_NIL + is ignored. */ void -rtx_writer::print_rtx (const_rtx in_rtx) +rtx_writer::print_rtx (const_rtx in_rtx, bool print_nil) { int idx = 0; @@ -582,8 +593,11 @@ rtx_writer::print_rtx (const_rtx in_rtx) if (in_rtx == 0) { - fputs ("(nil)", m_outfile); - m_sawclose = 1; + if (print_nil || !m_compact) + { + fputs ("(nil)", m_outfile); + m_sawclose = 1; + } return; } else if (GET_CODE (in_rtx) > NUM_RTX_CODE) @@ -657,7 +671,7 @@ rtx_writer::print_rtx (const_rtx in_rtx) else print_mem_expr (m_outfile, PAT_VAR_LOCATION_DECL (in_rtx)); fputc (' ', m_outfile); - print_rtx (PAT_VAR_LOCATION_LOC (in_rtx)); + print_rtx (PAT_VAR_LOCATION_LOC (in_rtx), false); if (PAT_VAR_LOCATION_STATUS (in_rtx) == VAR_INIT_STATUS_UNINITIALIZED) fprintf (m_outfile, " [uninit]"); @@ -765,7 +779,7 @@ void print_inline_rtx (FILE *outf, const_rtx x, int ind) { rtx_writer w (outf, ind, false, false); -
Re: ubsan PATCH to fix compile-time hog with operator overloading (PR sanitizer/78208)
On Fri, Nov 4, 2016 at 12:16 PM, Jakub Jelinek wrote: > On Fri, Nov 04, 2016 at 05:05:51PM +0100, Marek Polacek wrote: >> This is a similar case to PR sanitizer/70342. Here, we were generating >> expression >> in a quadratic fashion because of the initializer--we create SAVE_EXPR <>, >> then >> UBSAN_NULL >, and then COMPOUND_EXPR of these two and so on. >> >> On this testcase we were instrumention CALL_EXPR that is in fact operator<<. >> I >> think those always return a reference, so it cannot be NULL, so there's no >> point in instrumenting those? > > How do you know what is the return type of a user defined overloaded > operator? Indeed, there are no constraints on the return type of overloaded operator<<. Jason
RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch-likely instructions.
> From: Matthew Fortune > Sent: 04 November 2016 16:49 > To: Toma Tabacu; gcc-patches@gcc.gnu.org > Cc: catherine_mo...@mentor.com > Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch- > likely instructions. > > Toma Tabacu writes: > > > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > > > ow...@gcc.gnu.org] On Behalf Of Toma Tabacu > > > Sent: 04 November 2016 15:25 > > > To: Matthew Fortune; gcc-patches@gcc.gnu.org > > > Cc: catherine_mo...@mentor.com > > > Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need > > > branch- likely instructions. > > > > > > > From: Matthew Fortune > > > > Sent: 03 November 2016 13:07 > > > > To: Toma Tabacu; gcc-patches@gcc.gnu.org > > > > Cc: catherine_mo...@mentor.com > > > > Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests > > > > need branch-likely instructions. > > > > > > > > Toma Tabacu writes: > > > > > The gcc.target/mips/wrap-delay.c test was failing on mips-img-* > > > > > toolchains because it was using -mbranch-likely with an R6 target, > > > > > and > > > > > branch- likely instructions were removed in R6. > > > > > > > > > > This patch makes the testsuite downgrade to R5 if the > > > > > -mbranch-likely option is present and we're targeting R6. > > > > > > > > > > Tested with mips-img-elf and mips-img-linux-gnu. > > > > > > > > Hi Toma, > > > > > > > > Welcome to GCC development, thanks for your first patch. As you can > > > > see from Catherine's reply the change looks good. I'll just cover > > > > some housekeeping issues... > > > > > > > > > gcc/testsuite/ > > > > > * gcc.target/mips/mips.exp: Add check for -mbranch-likely > > in > > > > > condition for R5 downgrade. > > > > > > > > Changelogs are an art form which will take some getting used to. > > > > This is almost there but needs to reference the function affected. > > > > > > > > * gcc.target/mips/mips.exp (mips-dg-options): Downgrade to R5 > > > > for -mbranch-likely and infer -mno-branch-likely for R6. > > > > > > > > I have extended the comment as well as there is an additional change > > > > needed for this patch ideally. > > > > > > > > > diff --git a/gcc/testsuite/gcc.target/mips/mips.exp > > > > > b/gcc/testsuite/gcc.target/mips/mips.exp > > > > > index 7c24140..382d69c 100644 > > > > > --- a/gcc/testsuite/gcc.target/mips/mips.exp > > > > > +++ b/gcc/testsuite/gcc.target/mips/mips.exp > > > > > @@ -1176,7 +1176,8 @@ proc mips-dg-options { args } { > > > > >|| [mips_have_test_option_p options > > > > > "-mpaired- single"] > > > > >|| [mips_have_test_option_p options "- > > > > > mnan=legacy"] > > > > >|| [mips_have_test_option_p options "- > > > > > mabs=legacy"] > > > > > - || [mips_have_test_option_p options > > "!HAS_LSA"]) > > > > > } { > > > > > + || [mips_have_test_option_p options > > "!HAS_LSA"] > > > > > + || [mips_have_test_option_p options > > > > > + "-mbranch- > > > > > likely"]) } { > > > > > if { $gp_size == 32 } { > > > > > mips_make_test_option options "-mips32r5" > > > > > } else { > > > > > > > > Please can you make sure to retain the original patch formatting > > > > when posting. I suspect you have copied this out of a putty session > > > > or similar and have therefore lost the tabs. > > > > > > > > The extra change is that in the post-arch option processing we will > > > > need to infer -mno-branch-likely for the $isa_rev > 5 case much like > > > > we infer - > > > > mnan=2008 and -mabs=2008. This is so that when running the testsuite > > > > using > > > > -mips32r5 or earlier, with -mbranch-likely as part of the > > > > user-supplied test flags, then a test which is upgraded to > > > > mips32r6 does not attempt to use -mbranch-likely. > > > > > > > > Hope that wasn't too cryptic! > > > > > > > > Thanks, > > > > Matthew > > > > > > The updated patch below includes the improved ChangeLog comment, > > > correct formatting, and the post-arch enforcing of -mno-branch-likely > > for R6. > > Thanks, committed as r241850. > > Matthew I've noticed that there is a typo in my surname in the ChangeLog entry (in the name and in the email address). Regards, Toma Tabacu
RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch-likely instructions.
Toma Tabacu writes: > I've noticed that there is a typo in my surname in the ChangeLog entry > (in the name and in the email address). Apologies, corrected. Matthew
Re: [PATCH] rtx_writer: avoid printing trailing nils
On 11/04/2016 06:32 PM, David Malcolm wrote: ASSERT_RTL_DUMP_EQ ("(cjump_insn 1 (set (pc)\n" "(label_ref 0))\n" - " (nil))\n", + " )\n", jump_insn); This looks like the patch doesn't go the whole way - we'd want the space savings from avoiding the newline. You could consider giving the print_nil thing a default argument. Bernd
Privatize copy_blkmode_from_reg
Tested on x86-64/Linux, applied on the mainline as obvious. 2016-11-04 Eric Botcazou * expr.h (copy_blkmode_from_reg): Delete. * expr.c (copy_blkmode_from_reg): Make static. -- Eric BotcazouIndex: expr.c === --- expr.c (revision 241808) +++ expr.c (working copy) @@ -2591,7 +2591,7 @@ maybe_emit_group_store (rtx x, tree type This is used on targets that return BLKmode values in registers. */ -void +static void copy_blkmode_from_reg (rtx target, rtx srcreg, tree type) { unsigned HOST_WIDE_INT bytes = int_size_in_bytes (type); Index: expr.h === --- expr.h (revision 241808) +++ expr.h (working copy) @@ -146,9 +146,6 @@ extern void emit_group_store (rtx, rtx, extern rtx maybe_emit_group_store (rtx, tree); -/* Copy BLKmode object from a set of registers. */ -extern void copy_blkmode_from_reg (rtx, rtx, tree); - /* Mark REG as holding a parameter for the next CALL_INSN. Mode is TYPE_MODE of the non-promoted parameter, or VOIDmode. */ extern void use_reg_mode (rtx *, rtx, machine_mode);
Re: [PATCH, rs6000] Fold vector addition built-ins in GIMPLE
Hi Bill, On Tue, Nov 01, 2016 at 09:05:03PM -0500, Bill Schmidt wrote: > As Jakub suggested in response to my *ahem* ornate patch for overloaded > function built-ins, a much better approach is to use the existing > machinery for overloading and then immediately fold the specific > functions during gimplification. There is a target hook available for > this purpose that we have not previously used. This patch demonstrates > this functionality by implementing the target hook and folding vector > addition built-ins within it. Future patches will fold other such > operations, improving the optimization available for many vector > intrinsics. > > Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no > regressions. Is this ok for trunk? This looks fine, please apply. Thanks, Segher
Re: [PATCH] rtx_writer: avoid printing trailing nils
Here's something simpler. Only very lightly tested, but isn't something like this all that's needed? Could decide whether to apply it to expr_list etc. as well. Index: print-rtl.c === --- print-rtl.c (revision 241233) +++ print-rtl.c (working copy) @@ -697,7 +697,12 @@ print_rtx (const_rtx in_rtx) /* Get the format string and skip the first elements if we have handled them already. */ - for (; idx < GET_RTX_LENGTH (GET_CODE (in_rtx)); idx++) + int limit = GET_RTX_LENGTH (GET_CODE (in_rtx)); + if (flag_compact + && INSN_CHAIN_CODE_P (GET_CODE (in_rtx)) + && XEXP (in_rtx, limit - 1) == NULL_RTX) +limit--; + for (; idx < limit; idx++) print_rtx_operand (in_rtx, idx); switch (GET_CODE (in_rtx)) Bernd
Re: [Patch, fortran] PR64933 - ASSOCIATE on a character variable does not allow substring expressions
On Fri, Nov 04, 2016 at 12:43:37PM +0100, Paul Richard Thomas wrote: > > The associate construct does not readily permit the identification of > the associate name as an array during parsing. However, this can be > done whilst matching rvalues within the associate block. This patch > extends this identification in gfc_match_varspec, whilst excluding > character types from the present test. This latter change prevents > scalar substrings from being pegged as array references, which was the > origin of this bug. > > Bootstraps and regtests on FC21/x86_64. > > OK for trunk? > Looks good to me. -- Steve
Re: [PATCH] Print repeated rtl vector elements in a nicer way
On Thu, Nov 03, 2016 at 05:43:50PM +0100, Bernd Schmidt wrote: > On 11/03/2016 05:35 PM, Martin Jambor wrote: > > > > * print-rtl.c (print_rtx_operand_codes_E_and_V): Print how many times > > an element is repeated istead of printing each repeated element. > > "instead" Will fix. > > > --- > > gcc/print-rtl.c | 15 ++- > > 1 file changed, 14 insertions(+), 1 deletion(-) > > > > diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c > > index 341ecdf..9752c85 100644 > > --- a/gcc/print-rtl.c > > +++ b/gcc/print-rtl.c > > @@ -252,7 +252,20 @@ rtx_writer::print_rtx_operand_codes_E_and_V (const_rtx > > in_rtx, int idx) > > m_sawclose = 1; > > > >for (int j = 0; j < XVECLEN (in_rtx, idx); j++) > > - print_rtx (XVECEXP (in_rtx, idx, j)); > > + { > > + int j1; > > + > > + print_rtx (XVECEXP (in_rtx, idx, j)); > > + for (j1 = j + 1; j1 < XVECLEN (in_rtx, idx); j1++) > > + if (XVECEXP (in_rtx, idx, j) != XVECEXP (in_rtx, idx, j1)) > > + break; > > + > > + if (j1 != j + 1) > > + { > > + fprintf (m_outfile, " repeated %ix", j1 - j); > > + j = j1; > > + } > > + } > > Good idea, but can you give an example of how this looks in > practice? For example, after the patch dumps of constant vectors look like this: (insn 27 26 28 2 (set (reg:V64SI 450) (vec_merge:V64SI (ashift:V64SI (reg/v:V64SI 433 [ workitem_id ]) (const_vector:V64SI [ (const_int 2 [0x2]) repeated 64x ])) (unspec:V64SI [ (const_int 0 [0]) ] UNSPEC_VECTOR) (reg:DI 425 [ exec.1_4 ]))) "kernel.c":35 -1 (nil)) Without it, the very same insn is printed as following: (insn 27 26 28 2 (set (reg:V64SI 450) (vec_merge:V64SI (ashift:V64SI (reg/v:V64SI 433 [ workitem_id ]) (const_vector:V64SI [ (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) (const_int 2 [0x2]) ])) (unspec:V64SI [ (const_int 0 [0]) ] UNSPEC_VECTOR) (reg:DI 425 [ exec.1_4 ]))) "kernel.c":35 -1 (nil)) > Also, > it would be nice (and necessary for David's rtl-testing) to
[PATCH] rtx_writer: avoid printing trailing default values
On Fri, 2016-11-04 at 18:51 +0100, Bernd Schmidt wrote: > Here's something simpler. Only very lightly tested, but isn't > something > like this all that's needed? Could decide whether to apply it to > expr_list etc. as well. > > Index: print-rtl.c > === > --- print-rtl.c (revision 241233) > +++ print-rtl.c (working copy) > @@ -697,7 +697,12 @@ print_rtx (const_rtx in_rtx) > > /* Get the format string and skip the first elements if we have > handled >them already. */ > - for (; idx < GET_RTX_LENGTH (GET_CODE (in_rtx)); idx++) > + int limit = GET_RTX_LENGTH (GET_CODE (in_rtx)); > + if (flag_compact (as of r241586 this is now "m_compact") > + && INSN_CHAIN_CODE_P (GET_CODE (in_rtx)) > + && XEXP (in_rtx, limit - 1) == NULL_RTX) > +limit--; > + for (; idx < limit; idx++) > print_rtx_operand (in_rtx, idx); > > switch (GET_CODE (in_rtx)) Thanks. This seems to assume that the final code in the fmt string can be accessed via XEXP (in_rtx, limit - 1), which if RTL checking is enabled requires that the code be either 'e' or 'u'. This isn't the case for JUMP_INSN, JUMP_TABLE_DATA, BARRIER (all code '0'), CODE_LABEL (code 's') and NOTE (code 'i'). Also, we might want to omit the REG_NOTES from, say a JUMP_INSN, which is the *penultimate* operand - for example, it doesn't omit the trailing (nil) from the cjump_insn in test_uncond_jump. That said, this is much simpler than my patch, so I used it as the basis for the following: it uses the same approach as your patch, but loops backwards from the end of the format string (rather than just once) until it finds a non-default value, using a new function "operand_has_default_value_p" to handle the logic for that. With this, compact dumps omit the trailing (nil) for both regular insns and for JUMP_INSNs, and omits the surplus newline seen in my earlier patch. It also appears removes the trailing (nil) from expr_list. Bootstrap®rtest in progress. OK for trunk if it passes? gcc/ChangeLog: * print-rtl.c (operand_has_default_value_p): New function. (rtx_writer::print_rtx): In compact mode, omit trailing operands that have the default values. * rtl-tests.c (selftest::test_dumping_insns): Remove empty label string from expected dump. (seltest::test_uncond_jump): Remove trailing "(nil)" for REG_NOTES from expected dump. --- gcc/print-rtl.c | 45 - gcc/rtl-tests.c | 5 ++--- 2 files changed, 46 insertions(+), 4 deletions(-) diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c index 341ecdf..493aa68 100644 --- a/gcc/print-rtl.c +++ b/gcc/print-rtl.c @@ -564,6 +564,40 @@ rtx_writer::print_rtx_operand (const_rtx in_rtx, int idx) } } +/* Subroutine of rtx_writer::print_rtx. + In compact mode, determine if operand IDX of IN_RTX is interesting + to dump, or (if in a trailing position) it can be omitted. */ + +static bool +operand_has_default_value_p (const_rtx in_rtx, int idx) +{ + const char *format_ptr = GET_RTX_FORMAT (GET_CODE (in_rtx)); + + switch (format_ptr[idx]) +{ +case 'e': +case 'u': + return XEXP (in_rtx, idx) == NULL_RTX; + +case 's': + return XSTR (in_rtx, idx) == NULL; + +case '0': + switch (GET_CODE (in_rtx)) + { + case JUMP_INSN: + return JUMP_LABEL (in_rtx) == NULL_RTX; + + default: + return false; + + } + +default: + return false; +} +} + /* Print IN_RTX onto m_outfile. This is the recursive part of printing. */ void @@ -681,9 +715,18 @@ rtx_writer::print_rtx (const_rtx in_rtx) fprintf (m_outfile, " %d", INSN_UID (in_rtx)); } + /* Determine which is the final operand to print. + In compact mode, skip trailing operands that have the default values + e.g. trailing "(nil)" values. */ + int limit = GET_RTX_LENGTH (GET_CODE (in_rtx)); + if (m_compact) +while (limit > idx && operand_has_default_value_p (in_rtx, limit - 1)) + limit--; + /* Get the format string and skip the first elements if we have handled them already. */ - for (; idx < GET_RTX_LENGTH (GET_CODE (in_rtx)); idx++) + + for (; idx < limit; idx++) print_rtx_operand (in_rtx, idx); switch (GET_CODE (in_rtx)) diff --git a/gcc/rtl-tests.c b/gcc/rtl-tests.c index cf5239f..228226b 100644 --- a/gcc/rtl-tests.c +++ b/gcc/rtl-tests.c @@ -122,7 +122,7 @@ test_dumping_insns () /* Labels. */ rtx_insn *label = gen_label_rtx (); CODE_LABEL_NUMBER (label) = 42; - ASSERT_RTL_DUMP_EQ ("(clabel 0 42 \"\")\n", label); + ASSERT_RTL_DUMP_EQ ("(clabel 0 42)\n", label); LABEL_NAME (label)= "some_label"; ASSERT_RTL_DUMP_EQ ("(clabel 0 42 (\"some_label\"))\n", label); @@ -176,8 +176,7 @@ test_uncond_jump () ASSERT_TRUE (control_flow_insn_p (jump_insn)); ASSERT_RTL_DUMP_EQ ("(cjump_insn 1 (set (pc)\n" - "
Re: [PATCH] rtx_writer: avoid printing trailing default values
On 11/04/2016 08:14 PM, David Malcolm wrote: With this, compact dumps omit the trailing (nil) for both regular insns and for JUMP_INSNs, and omits the surplus newline seen in my earlier patch. It also appears removes the trailing (nil) from expr_list. Bootstrap®rtest in progress. OK for trunk if it passes? This seems OK, with something left to be addressed perhaps in a followup: + case JUMP_INSN: + return JUMP_LABEL (in_rtx) == NULL_RTX; Weren't we going to omit JUMP_LABELs in compact mode? If so, we'll need a m_compact test here eventually as well. Bernd
Re: [PATCH] Fix nonoverlapping_memrefs_p ICE (PR target/77834, take 2)
On November 4, 2016 5:27:11 PM GMT+01:00, Jakub Jelinek wrote: >Hi! > >On Wed, Nov 02, 2016 at 10:46:40AM +0100, Richard Biener wrote: >> Yeah, plus if a followup test would have disambiguated things (the >> dispatch to the tree oracle for example). > >After discussing this on IRC that the dse.c and sched-deps.c (call (mem >...) ) >changes are probably not safe, I'm proposing following patch which >should be >safe to backport to release branches too, the only occurrences of >!DECL_RTL_SET_P where DECL_RTL worked fine in my bootstraps/regtests >were >FUNCTION_DECLs, so the patch should only turn ICEs into returning the >safe >return value that the expressions might overlap. > >If/once this is in, I'm planning to test/submit a patch adding > /* If one decl is known to be a function or label in a function and > the other is some kind of data, they can't overlap. */ > if ((TREE_CODE (exprx) == FUNCTION_DECL > || TREE_CODE (exprx) == LABEL_DECL) > != (TREE_CODE (expry) == FUNCTION_DECL > || TREE_CODE (expry) == LABEL_DECL)) >return 1; >before that. > >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK for trunk and branches (if appropriate) Richard. >2016-11-04 Jakub Jelinek > > PR target/77834 > * alias.c (nonoverlapping_memrefs_p): Return 0 if exprx or expry > doesn't have rtl set. > > * gcc.dg/pr77834.c: New test. > >--- gcc/alias.c.jj 2016-10-21 17:06:27.0 +0200 >+++ gcc/alias.c2016-10-31 11:38:29.448031590 +0100 >@@ -2755,6 +2755,13 @@ nonoverlapping_memrefs_p (const_rtx x, c > || TREE_CODE (expry) == CONST_DECL) > return 1; > >+ /* If either of the decls doesn't have DECL_RTL set (e.g. marked as >+ living in multiple places), we can't tell anything. Exception >+ are FUNCTION_DECLs for which we can create DECL_RTL on demand. >*/ >+ if ((!DECL_RTL_SET_P (exprx) && TREE_CODE (exprx) != FUNCTION_DECL) >+ || (!DECL_RTL_SET_P (expry) && TREE_CODE (expry) != >FUNCTION_DECL)) >+return 0; >+ > rtlx = DECL_RTL (exprx); > rtly = DECL_RTL (expry); > >--- gcc/testsuite/gcc.dg/pr77834.c.jj 2016-10-31 11:41:46.290521464 >+0100 >+++ gcc/testsuite/gcc.dg/pr77834.c 2016-10-31 11:41:24.0 +0100 >@@ -0,0 +1,18 @@ >+/* PR target/77834 */ >+/* { dg-do compile } */ >+/* { dg-options "-O -ftree-pre -Wno-psabi" } */ >+/* { dg-additional-options "-mstringop-strategy=libcall" { target >i?86-*-* x86_64-*-* } } */ >+ >+typedef int V __attribute__ ((vector_size (64))); >+ >+V >+foo (V u, V v, int w) >+{ >+ do >+{ >+ if (u[0]) v ^= u[w]; >+} >+ while ((V) { 0, u[w] }[1]); >+ u = (V) { v[v[0]], u[u[0]] }; >+ return v + u; >+} > > > Jakub
Re: [PATCH] Add target hook to compute register pressure classes
On 11/04/2016 12:18 PM, Pat Haugen wrote: While working to get -fsched-pressure profitable on PowerPC, one of the things I noticed causing problems was the selection of pressure classes. Understanding that getting the computation of pressure classes correct for all targets in the machine independent code is going to be extremely hard (or impossible), I propose the following target hook which lets the target decide the set of pressure classes to use. I will follow this up with an implementation of the hook for the rs6000 backend, but am currently benchmarking variations of it to see which is best. Bootstrap/regtest on powerpc64le with no new regressions. Ok for trunk? Yes. It is OK, Pat. Thanks for working on powerpc insn-scheduling issues. Note that the ira.c changes are not as complicated as what it looks, I only added a few lines to invoke the target hook, the rest is just indentation changes to move the existing code into the else leg.
Re: [Patch, fortran] PR64933 - ASSOCIATE on a character variable does not allow substring expressions
Hi Steve, Thanks. At least it cannot do any harm :-) I will get on with it. Paul On 4 November 2016 at 19:21, Steve Kargl wrote: > On Fri, Nov 04, 2016 at 12:43:37PM +0100, Paul Richard Thomas wrote: >> >> The associate construct does not readily permit the identification of >> the associate name as an array during parsing. However, this can be >> done whilst matching rvalues within the associate block. This patch >> extends this identification in gfc_match_varspec, whilst excluding >> character types from the present test. This latter change prevents >> scalar substrings from being pegged as array references, which was the >> origin of this bug. >> >> Bootstraps and regtests on FC21/x86_64. >> >> OK for trunk? >> > > Looks good to me. > > -- > Steve -- The difference between genius and stupidity is; genius has its limits. Albert Einstein
Re: [Patch, fortran] PR64933 - ASSOCIATE on a character variable does not allow substring expressions
Committed as revision 241860. Thanks for the review. Paul On 4 November 2016 at 19:21, Steve Kargl wrote: > On Fri, Nov 04, 2016 at 12:43:37PM +0100, Paul Richard Thomas wrote: >> >> The associate construct does not readily permit the identification of >> the associate name as an array during parsing. However, this can be >> done whilst matching rvalues within the associate block. This patch >> extends this identification in gfc_match_varspec, whilst excluding >> character types from the present test. This latter change prevents >> scalar substrings from being pegged as array references, which was the >> origin of this bug. >> >> Bootstraps and regtests on FC21/x86_64. >> >> OK for trunk? >> > > Looks good to me. > > -- > Steve -- The difference between genius and stupidity is; genius has its limits. Albert Einstein
Re: [PATCH] rtx_writer: avoid printing trailing default values
On Fri, 2016-11-04 at 19:53 +0100, Bernd Schmidt wrote: > On 11/04/2016 08:14 PM, David Malcolm wrote: > > > > With this, compact dumps omit the trailing (nil) for both regular > > insns > > and for JUMP_INSNs, and omits the surplus newline seen in my > > earlier patch. > > > > It also appears removes the trailing (nil) from expr_list. > > > > Bootstrap®rtest in progress. > > > > OK for trunk if it passes? > > This seems OK, with something left to be addressed perhaps in a > followup: > > > + case JUMP_INSN: > > + return JUMP_LABEL (in_rtx) == NULL_RTX; > > Weren't we going to omit JUMP_LABELs in compact mode? If so, we'll > need > a m_compact test here eventually as well. We already omit JUMP_LABELs in compact mode, in rtx_writer::print_rtx_operand_code_0, but we need this test to pass so that we can potentially omit earlier (nil) operands in a JUMP_INSN e.g. REG_NOTES. Given that, perhaps this test should read: case JUMP_INSN: /* JUMP_LABELs are always omitted in compact mode, so treat any value here as omittable, so that earlier operands can potentially be omitted also. */ return true; or: return m_compact; if we want to support calling operand_has_default_value_p in non -compact mode. OK with one of those changes? (assuming successful testing)
Re: [PATCH] rtx_writer: avoid printing trailing default values
On 11/04/2016 08:25 PM, David Malcolm wrote: return m_compact; Ok with this one plus a comment. Bernd
Re: [PATCH] enhance buffer overflow warnings (and c/53562)
Attached is an update to the patch that takes into consideration the feedback I got. It goes back to adding just one option, -Wstringop-overflow, as in the original, while keeping the Object Size type as an argument. It uses type-1 as the default setting for string functions (strcpy et al.) and, unconditionally, type-0 for raw memory functions (memcpy, etc.) I retested Binutils 2.27 and the Linux kernel again with this patch and also added Glibc, and it doesn't complain about anything (both Binutils and the kernel also build cleanly with an unpatched GCC with_FORTIFY_SOURCE=2 or its rough equivalent for the kernel). The emit-rtl.c warning (bug 78174) has also been suppressed by the change to bos type-0 for memcpy. While the patch doesn't trigger any false positives (AFAIK) it is subject to a fair number of false negatives due to the limitations of the tree-object-size pass, and due to transformations done by other passes that prevent it from detecting some otherwise obvious overflows. Although unfortunate, I believe the warnings that are emitted are useful as the first line of defense in software that doesn't use _FORTIFY_SOURCE (such as GCC itself). And this can of course be improved if some of the limitations are removed over time. Martin PR c/53562 - Add -Werror= support for -D_FORTIFY_SOURCE / __builtin___memcpy_chk PR middle-end/78149 - missing warning on strncpy buffer overflow due to an excessive bound PR middle-end/78138 - missing warnings on buffer overflow with non-constant source length gcc/c-family/ChangeLog: PR c/53562 * c.opt (-Wstringop-overflow): New option. gcc/ChangeLog: PR c/53562 PR middle-end/78149 PR middle-end/78138 * builtins.c (expand_builtin_strcat, expand_builtin_strncat): New functions. (compute_dest_size, get_size_range, check_sizes, check_strncat_sizes) (check_memop_sizes): Same. (expand_builtin_memcpy): Call check memop_sizes. (expand_builtin_mempcpy): Same. (expand_builtin_memset): Same, (expand_builtin_bzero): Same. (expand_builtin_memory_chk): Call check_sizes. (expand_builtin_strcpy): Same. (expand_builtin_strncpy): Same. (maybe_emit_sprintf_chk_warning): Same. (expand_builtin): Handle strcat and strncat. (fini_object_sizes): Reset pointers. (compute_object_size): New function. * doc/invoke.texi (Warning Options): Document -Wstringop-overflow. gcc/testsuite/ChangeLog: PR c/53562 PR middle-end/78149 PR middle-end/78138 * c-c++-common/Wsizeof-pointer-memaccess2.c: Adjust expected diagnostic. * g++.dg/ext/builtin-object-size3.C (bar): Same. * g++.dg/ext/strncpy-chk1.C: Same. * g++.dg/opt/memcpy1.C: Same. * g++.dg/torture/Wsizeof-pointer-memaccess1.C: Same. * gcc.c-torture/compile/pr55569.c: Disable -Wstringop-overflow. * gcc.dg/Wobjsize-1.c: Adjust expected diagnostic. * gcc.dg/attr-alloc_size.c: Same. * gcc.dg/builtin-stringop-chk-1.c: Adjust expected diagnostic. * gcc.dg/builtin-stringop-chk-2.c: Same. * gcc.dg/builtin-stringop-chk-4.c: New test. * gcc.dg/builtin-strncat-chk-1.c: Adjust expected diagnostic. * gcc.dg/memcpy-2.c: Same. * gcc.dg/pr40340-1.c: Same. * gcc.dg/pr40340-2.c (main): Same. * gcc.dg/pr40340-5.c (main): Same. * gcc.dg/torture/Wsizeof-pointer-memaccess1.c: Same. * gcc.dg/torture/pr71132.c: Disable -Wstringop-overflow. * gcc.dg/fstack-protector-strong.c: Add expected diagnostic. diff --git a/gcc/builtins.c b/gcc/builtins.c index cc711a0..e2cf20a 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -67,7 +67,7 @@ along with GCC; see the file COPYING3. If not see #include "internal-fn.h" #include "case-cfn-macros.h" #include "gimple-fold.h" - +#include "intl.h" struct target_builtins default_target_builtins; #if SWITCHABLE_TARGET @@ -125,9 +125,11 @@ static rtx expand_builtin_mempcpy (tree, rtx, machine_mode); static rtx expand_builtin_mempcpy_with_bounds (tree, rtx, machine_mode); static rtx expand_builtin_mempcpy_args (tree, tree, tree, rtx, machine_mode, int, tree); +static rtx expand_builtin_strcat (tree, rtx); static rtx expand_builtin_strcpy (tree, rtx); static rtx expand_builtin_strcpy_args (tree, tree, rtx); static rtx expand_builtin_stpcpy (tree, rtx, machine_mode); +static rtx expand_builtin_strncat (tree, rtx); static rtx expand_builtin_strncpy (tree, rtx); static rtx builtin_memset_gen_str (void *, HOST_WIDE_INT, machine_mode); static rtx expand_builtin_memset (tree, rtx, machine_mode); @@ -3011,6 +3013,292 @@ expand_builtin_memcpy_args (tree dest, tree src, tree len, rtx target, tree exp) return dest_addr; } +/* Fill the 2-element RANGE array with the minimum and maximum values + EXP is known to have and return true, otherwise null and return + false. */ + +static bool +get_size_range (tree exp, tree range[2]) +{ + if (tree_fits_uhwi_p (exp)) +{ + range[0] = range[1] = exp; + return true; +} + + if (TREE_CODE (exp) == SSA_NAME) +{ + wide_int min, max; + enum value_range_type range_type = get_range_info (exp, &min, &max); +
Simplify X / X, 0 / X and X % X
Hello, since we were discussing this recently... The condition is copied from the existing 0 % X case, visible in the context of the diff. As far as I understand, the main case where we do not want to optimize is during constexpr evaluation in the C++ front-end (it wants to detect the undefined behavior), and with late folding I think this means we only need to care about an explicit 0/0, not about X/X where X would become 0 after the simplification. And later, if we do have something like X/0, we could handle it the same way as we currently handle *(char*)0, insert a trap after that instruction and clear the following code, which likely gives better code than replacing 0/0 with 1. Bootstrap+regtest on powerpc64le-unknown-linux-gnu. 2016-11-07 Marc Glisse gcc/ * match.pd (0 / X, X / X, X % X): New simplifications. gcc/testsuite/ * gcc.dg/tree-ssa/divide-5.c: New file. -- Marc GlisseIndex: gcc/match.pd === --- gcc/match.pd (revision 241856) +++ gcc/match.pd (working copy) @@ -133,33 +133,47 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (non_lvalue @0))) /* Transform x * -1.0 into -x. */ (simplify (mult @0 real_minus_onep) (if (!HONOR_SNANS (type) && (!HONOR_SIGNED_ZEROS (type) || !COMPLEX_FLOAT_TYPE_P (type))) (negate @0))) -/* Make sure to preserve divisions by zero. This is the reason why - we don't simplify x / x to 1 or 0 / x to 0. */ +/* X * 1, X / 1 -> X. */ (for op (mult trunc_div ceil_div floor_div round_div exact_div) (simplify (op @0 integer_onep) (non_lvalue @0))) +/* Preserve explicit divisions by 0: the C++ front-end wants to detect + undefined behavior in constexpr evaluation, and assuming that the division + traps enables better optimizations than these anyway. */ (for div (trunc_div ceil_div floor_div round_div exact_div) + /* 0 / X is always zero. */ + (simplify + (div integer_zerop@0 @1) + /* But not for 0 / 0 so that we can get the proper warnings and errors. */ + (if (!integer_zerop (@1)) + @0)) /* X / -1 is -X. */ (simplify (div @0 integer_minus_onep@1) (if (!TYPE_UNSIGNED (type)) (negate @0))) + /* X / X is one. */ + (simplify + (div @0 @0) + /* But not for 0 / 0 so that we can get the proper warnings and errors. */ + (if (!integer_zerop (@0)) + { build_one_cst (type); })) /* X / abs (X) is X < 0 ? -1 : 1. */ (simplify (div:C @0 (abs @0)) (if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_UNDEFINED (type)) (cond (lt @0 { build_zero_cst (type); }) { build_minus_one_cst (type); } { build_one_cst (type); }))) /* X / -X is -1. */ (simplify (div:C @0 (negate @0)) @@ -269,38 +283,42 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) && !real_zerop (@1)) (with { tree tem = const_binop (RDIV_EXPR, type, build_one_cst (type), @1); } (if (tem) (mult @0 { tem; } ))) (if (cst != COMPLEX_CST) (with { tree inverse = exact_inverse (type, @1); } (if (inverse) (mult @0 { inverse; } -/* Same applies to modulo operations, but fold is inconsistent here - and simplifies 0 % x to 0, only preserving literal 0 % 0. */ (for mod (ceil_mod floor_mod round_mod trunc_mod) /* 0 % X is always zero. */ (simplify (mod integer_zerop@0 @1) /* But not for 0 % 0 so that we can get the proper warnings and errors. */ (if (!integer_zerop (@1)) @0)) /* X % 1 is always zero. */ (simplify (mod @0 integer_onep) { build_zero_cst (type); }) /* X % -1 is zero. */ (simplify (mod @0 integer_minus_onep@1) (if (!TYPE_UNSIGNED (type)) { build_zero_cst (type); })) + /* X % X is zero. */ + (simplify + (mod @0 @0) + /* But not for 0 % 0 so that we can get the proper warnings and errors. */ + (if (!integer_zerop (@0)) + { build_zero_cst (type); })) /* (X % Y) % Y is just X % Y. */ (simplify (mod (mod@2 @0 @1) @1) @2) /* From extract_muldiv_1: (X * C1) % C2 is zero if C1 is a multiple of C2. */ (simplify (mod (mult @0 INTEGER_CST@1) INTEGER_CST@2) (if (ANY_INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_UNDEFINED (type) && wi::multiple_of_p (@1, @2, TYPE_SIGN (type))) Index: gcc/testsuite/gcc.dg/tree-ssa/divide-5.c === --- gcc/testsuite/gcc.dg/tree-ssa/divide-5.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/divide-5.c (working copy) @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-optimized" } */ + +int f(int x){ + int y = x; + int z = 0; + return x / y - x % y + z / y; +} + +/* { dg-final { scan-tree-dump "return 1;" "optimized"} } */
Re: [patch,testsuite] Support dg-require-effective-target label_offsets.
On Nov 4, 2016, at 2:35 AM, Georg-Johann Lay wrote: > >> .word .L3-(.L2) >> >> which seems wrong, given all the other code-gen. I think this has to be >> gs(.L3)-(gs(.L2)), no? I tried to get binutils to do that for me directly >> with a .word and it seemed resistant; which is unfortunate. > Using gs, the expression would be gs(.L3)-(gs(.L2)) as gs implements word > addresses, not byte addresses. As I said, I tried that and could not get it to work. >> The compile only test cases work fine now, so nothing needs to be done for >> them. > > ICE is not a nice approach I never saw an ICE with current release binutils and top of tree gcc. If you want me to see an ICE and comment on it, I would need the command line apparently. > , IMO. I'd prefer some more graceful error using code locations of the > labels (which are not available). My preference is code-gen. :-) Anyway, the change I posted creates nice graceful errors, if one wants.
Re: [PATCH, GCC, wwwdocs] Document new Cortex-M23 and Cortex-M33 processors support in ARM backend
On Fri, 4 Nov 2016, Thomas Preudhomme wrote: This patch document the newly added support in GCC 7 for Cortex-M23 and Cortex-M33 processors [1][2]. : Is this ok for ? Surely so for me. Gerald
PR 78188 & 71848
Something in cgraph appears to be relying on COMDAT support if DECL_ONE_ONLY is available. This is the cause of the AIX bootstrap failure due to recent tree-vrp.c change, earlier tree-ssa-sccvn.c change. Also a recent C++ front-end change caused new libstdc++ testsuite failures that seems to have the same root cause. This patch disables the ipa-comdats pass on targets that do not support COMDAT groups. Pre-approved by Richi. PR bootstrap/78188 PR c++/71848 * ipa-comdats.c (pass_ipa_comdats::gate): Require HAVE_COMDAT_GROUP. Index: ipa-comdats.c === --- ipa-comdats.c (revision 241862) +++ ipa-comdats.c (working copy) @@ -416,7 +416,7 @@ class pass_ipa_comdats : public ipa_opt_pass_d bool pass_ipa_comdats::gate (function *) { - return optimize; + return HAVE_COMDAT_GROUP && optimize; } } // anon namespace