[PR71408] - Fix wrong code at -Os and above
Hi All,

For the testcase in PR71408 zero_one_operation seems still broken. In
handling NEGATE_EXPR, as part of undistribute_ops_list, in
zero_one_operation, we are doing
propagate_op_to_single_use (op, stmt, def);

This results in:

-  _14 = _5 * _12;
   _15 = (int) _11;
   _16 = ~_15;
   _17 = (unsigned int) _16;
-  _18 = -_5;
-  _19 = _17 * _18;
-  _20 = _14 + _19;
-  _24 = _5 & _20;
+  _19 = _5 * _17;
+  _35 = _19 + _12;
+  _34 = _35 * _5;
+  _20 = _34;
+  _24 = _20 & _5;

We should instead propagate (-1) as "op" is the one which gets factored
out. With the attached patch we now have:

-  _14 = _5 * _12;
   _15 = (int) _11;
   _16 = ~_15;
   _17 = (unsigned int) _16;
-  _18 = -_5;
-  _19 = _17 * _18;
-  _20 = _14 + _19;
-  _24 = _5 & _20;
+  _32 = _17;
+  _19 = -_32;
+  _34 = _19 + _12;
+  _33 = _34 * _5;
+  _20 = _33;
+  _24 = _20 & _5;

Regression tested and bootstrapped on x86-64-linux-gnu with no new
regression.

Is this OK for trunk?

Thanks,
Kugan

gcc/ChangeLog:

2016-06-05  Kugan Vivekanandarajah

        PR middle-end/71408
        * tree-ssa-reassoc.c (zero_one_operation): Fix NEGATE_EXPR operand
        for propagate_op_to_single_use.

gcc/testsuite/ChangeLog:

2016-06-05  Kugan Vivekanandarajah

        PR middle-end/71408
        * gcc.dg/tree-ssa/pr71408.c: New test.
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71408.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71408.c
index e69de29..896a428 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr71408.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71408.c
@@ -0,0 +1,30 @@
+/* PR middle-end/71408 */
+/* { dg-do run } */
+/* { dg-options "-Os" } */
+unsigned a, b;
+
+struct S0
+{
+  int f1:18;
+  unsigned f3:4;
+};
+
+void fn1 ()
+{
+  struct S0 c = { 7, 0 };
+  if (c.f1)
+    c.f3 = 3;
+  a = -~c.f3;
+  c.f3 = ~(c.f1 && c.f1);
+  c.f1 = c.f3 * (c.f1 - (c.f1 - a % c.f1)) + ~c.f3 * -a;
+  b = ~(c.f1 & a);
+  if (b >= 4294967295)
+    __builtin_abort ();
+}
+
+int
+main ()
+{
+  fn1 ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 1973077..7865df0 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -1203,7 +1203,8 @@ zero_one_operation (tree *def, enum tree_code opcode, tree op)
         {
           if (gimple_assign_rhs1 (stmt) == op)
             {
-              propagate_op_to_single_use (op, stmt, def);
+              tree cst = build_minus_one_cst (TREE_TYPE (op));
+              propagate_op_to_single_use (cst, stmt, def);
               return;
             }
           else if (integer_minus_onep (op)
@@ -1251,7 +1252,8 @@ zero_one_operation (tree *def, enum tree_code opcode, tree op)
         {
           if (gimple_assign_rhs1 (stmt2) == op)
             {
-              propagate_op_to_single_use (op, stmt2, def);
+              tree cst = build_minus_one_cst (TREE_TYPE (op));
+              propagate_op_to_single_use (cst, stmt2, def);
               return;
             }
           else if (integer_minus_onep (op)
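For readers following the GIMPLE dumps in this patch mail, the three statement sequences can be written out as plain unsigned arithmetic. This is only an illustrative sketch and not GCC code: x, y and z stand in for _5, _12 and _17, and the function names are invented here. It suggests why propagating op (rather than -1) changes the value.

```cpp
#include <cassert>

// x, y, z stand in for the SSA names _5, _12 and _17 in the dumps.
// All function names are illustrative only; none of this is GCC code.

// Before reassociation: _20 = _5 * _12 + _17 * (-_5)
unsigned original_expr(unsigned x, unsigned y, unsigned z) {
    return x * y + z * (0u - x);
}

// Broken rewrite: the negate statement was replaced by op itself,
// giving _34 = (_5 * _17 + _12) * _5
unsigned broken_rewrite(unsigned x, unsigned y, unsigned z) {
    return (x * z + y) * x;
}

// Fixed rewrite: the negate statement is replaced by -1,
// giving _33 = (-_17 + _12) * _5
unsigned fixed_rewrite(unsigned x, unsigned y, unsigned z) {
    return ((0u - z) + y) * x;
}

// Returns true when the fixed form agrees with the original on a small grid.
bool fixed_matches_original() {
    for (unsigned x = 0; x < 8; ++x)
        for (unsigned y = 0; y < 8; ++y)
            for (unsigned z = 0; z < 8; ++z)
                if (fixed_rewrite(x, y, z) != original_expr(x, y, z))
                    return false;
    return true;
}
```

With x = 2, y = 3, z = 1 the original and fixed forms both give 4, while the broken form gives 10.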
Re: [PATCH][2/3] Vectorize inductions that are live after the loop
Alan Hayward writes:

> * gcc.dg/vect/vect-live-2.c: New test.

This test fails on powerpc64 (with -m64, but not with -m32):

$ grep 'vectorized.*loops' ./vect-live-2.c.149t.vect
../gcc/testsuite/gcc.dg/vect/vect-live-2.c:10:1: note: vectorized 0 loops in function.
../gcc/testsuite/gcc.dg/vect/vect-live-2.c:29:1: note: vectorized 0 loops in function.

Andreas.

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: [PATCH] Selftest framework (v7)
On 06/03/2016 09:12 PM, David Malcolm wrote:
> It's not clear to me if these approvals still hold.

I was willing to go with it; I had a look through some of these patches
and didn't spot anything untoward.

To make it clear, this patch is OK, with one tweak if possible: extend
the namespace selftest to cover the various helper functions (some of
these have names like from_int which ideally we wouldn't leak into the
rest of the compiler). As far as I can tell this just involves moving
the start of namespace selftest upwards a bit in the files where we have
tests.

A few other minor things...

> +  tree bind_expr =
> +    build3 (BIND_EXPR, void_type_node, NULL, stmt_list, block);

Operators go at the start of the line.

> +  tree fn_type = build_function_type_array (integer_type_node, /* return_type */

The line is too long, and we don't do /* arg name */ anyway.

> +static void
> +assert_loceq (const char *exp_filename,
> +              int exp_linenum,
> +              int exp_colnum,
> +              location_t loc)

> +static layout_range
> +make_range (int start_line, int start_col,
> +            int end_line, int end_col)

These lines are too short :) Could save some vertical space here.

For the future - I found the single merged patch easier to deal with
than the 16- or 21-patch series. Split ups are often good when modifying
the same code in multiple logically independent steps (keeping in mind
that bugfixes to newly added code shouldn't be split out either). This
is a different situation where the patches weren't truly independent,
and the merged patch is essentially just a concatenation, so splitting
it up does not really make the review any easier (potentially harder if
you have to switch between mails rather than just hitting PgUp/Dn).

Bernd
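The namespace tweak requested in this review can be pictured with a small sketch. It is hypothetical throughout: the body of from_int, the test function and run_example_tests are placeholders invented here and are not taken from the selftest patch. The point is only that opening the namespace above the helpers keeps short names such as from_int out of the rest of the compiler.

```cpp
#include <cassert>

namespace selftest {

// Helper used only by the tests.  Because the namespace starts above it,
// the short name from_int is not visible outside selftest.
// The body is a placeholder for illustration.
static int
from_int (int value)
{
  return value * 2;
}

// A hypothetical test that uses the helper.
static bool
test_from_int ()
{
  return from_int (21) == 42;
}

// Hypothetical entry point that runs the tests in this file.
bool
run_example_tests ()
{
  return test_from_int ();
}

} // namespace selftest
```

Code outside the namespace would have to write selftest::run_example_tests (), and could not pick up from_int by accident.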
Re: Ping! [fortran, patch, pr69659, v1] [6/7 Regression] ICE on using option -frepack-arrays, in gfc_conv_descriptor_data_get
Ping!

On Sun, 22 May 2016 18:51:22 +0200
Andre Vehreschild wrote:

> Hi all,
>
> attached patch fixes a regression that occurred on some testcases when
> doing a validation run with -frepack-arrays. The issue here was that
> for class arrays the array descriptor is in the _data component and not
> directly at the address of the class_array. The patch fixes this issue
> for pr69659 on trunk 7 and gcc-6-branch.
>
> Ok for trunk and gcc-6?
>
> Bootstrapped and regtested on x86_64-linux-gnu.
>
> Regards,
> Andre

--
Andre Vehreschild * Email: vehre ad gmx dot de
Re: Ping! [fortran, patch, pr69659, v1] [6/7 Regression] ICE on using option -frepack-arrays, in gfc_conv_descriptor_data_get
Hi Andre,

That's verging on 'obvious' and even does the job :-)

OK for trunk and 6-branch.

Thanks for the patch

Paul

On 5 June 2016 at 16:13, Andre Vehreschild wrote:
> Hi Paul,
>
> now with attachment.
>
> - Andre
>
> On Sun, 5 Jun 2016 16:06:09 +0200
> Paul Richard Thomas wrote:
>
>> Hi Andre,
>>
>> There's no attachment. Get it posted tonight and I will take a look at it.
>>
>> Cheers
>>
>> Paul
>>
>> On 5 June 2016 at 14:04, Andre Vehreschild wrote:
>> > Ping!
>> >
>> > On Sun, 22 May 2016 18:51:22 +0200
>> > Andre Vehreschild wrote:
>> >
>> >> Hi all,
>> >>
>> >> attached patch fixes a regression that occurred on some testcases when
>> >> doing a validation run with -frepack-arrays. The issue here was that
>> >> for class arrays the array descriptor is in the _data component and not
>> >> directly at the address of the class_array. The patch fixes this issue
>> >> for pr69659 on trunk 7 and gcc-6-branch.
>> >>
>> >> Ok for trunk and gcc-6?
>> >>
>> >> Bootstrapped and regtested on x86_64-linux-gnu.
>> >>
>> >> Regards,
>> >> Andre
>> >
>> >
>> > --
>> > Andre Vehreschild * Email: vehre ad gmx dot de
>>
>>
>>
>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de

--
The difference between genius and stupidity is; genius has its limits.

Albert Einstein
[PING] [PATCH] Make basic asm implicitly clobber memory
Ping...

I think we all agreed on the general direction of this patch.

The patch is basically unchanged from the previous version, except one
line in doc/extend.texi has been updated.

So I would like to ask if it is OK for trunk.

Thanks
Bernd.

gcc/
2016-05-05  Bernd Edlinger

        PR c/24414
        * cfgexpand.c (expand_asm_loc): Remove handling for ADDR_EXPR.
        Implicitly clobber memory for basic asm with non-empty assembler
        string.  Use targetm.md_asm_adjust also here.
        * compare-elim.c (arithmetic_flags_clobber_p): Use asm_noperands here.
        * final.c (final_scan_insn): Handle basic asm in PARALLEL block.
        * gimple.c (gimple_asm_clobbers_memory_p): Handle basic asm with
        non-empty assembler string.
        * ira.c (compute_regs_asm_clobbered): Use asm_noperands here.
        * recog.c (asm_noperands): Handle basic asm in PARALLEL block.
        (decode_asm_operands): Handle basic asm in PARALLEL block.
        (extract_insn): Handle basic asm in PARALLEL block.
        * doc/extend.texi: Mention new behavior of basic asm.
        * config/ia64/ia64.c (rtx_needs_barrier): Handle ASM_INPUT here.
        * config/pa/pa.c (branch_to_delay_slot_p, branch_needs_nop_p,
        branch_needs_nop_p): Use asm_noperands.

gcc/testsuite/
2016-05-05  Bernd Edlinger

        PR c/24414
        * gcc.target/i386/pr24414.c: New test.

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c (revision 231412)
+++ gcc/cfgexpand.c (working copy)
@@ -2655,9 +2655,6 @@ expand_asm_loc (tree string, int vol, location_t l
 {
   rtx body;
 
-  if (TREE_CODE (string) == ADDR_EXPR)
-    string = TREE_OPERAND (string, 0);
-
   body = gen_rtx_ASM_INPUT_loc (VOIDmode,
                                 ggc_strdup (TREE_STRING_POINTER (string)),
                                 locus);
@@ -2664,6 +2661,34 @@ expand_asm_loc (tree string, int vol, location_t l
 
   MEM_VOLATILE_P (body) = vol;
 
+  /* Non-empty basic ASM implicitly clobbers memory.  */
+  if (TREE_STRING_LENGTH (string) != 0)
+    {
+      rtx asm_op, clob;
+      unsigned i, nclobbers;
+      auto_vec input_rvec, output_rvec;
+      auto_vec constraints;
+      auto_vec clobber_rvec;
+      HARD_REG_SET clobbered_regs;
+      CLEAR_HARD_REG_SET (clobbered_regs);
+
+      clob = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (VOIDmode));
+      clobber_rvec.safe_push (clob);
+
+      if (targetm.md_asm_adjust)
+        targetm.md_asm_adjust (output_rvec, input_rvec,
+                               constraints, clobber_rvec,
+                               clobbered_regs);
+
+      asm_op = body;
+      nclobbers = clobber_rvec.length ();
+      body = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (1 + nclobbers));
+
+      XVECEXP (body, 0, 0) = asm_op;
+      for (i = 0; i < nclobbers; i++)
+        XVECEXP (body, 0, i + 1) = gen_rtx_CLOBBER (VOIDmode, clobber_rvec[i]);
+    }
+
   emit_insn (body);
 }
 
Index: gcc/compare-elim.c
===================================================================
--- gcc/compare-elim.c (revision 231412)
+++ gcc/compare-elim.c (working copy)
@@ -162,7 +162,7 @@ arithmetic_flags_clobber_p (rtx_insn *insn)
   if (!NONJUMP_INSN_P (insn))
     return false;
   pat = PATTERN (insn);
-  if (extract_asm_operands (pat))
+  if (asm_noperands (pat) >= 0)
     return false;
 
   if (GET_CODE (pat) == PARALLEL && XVECLEN (pat, 0) == 2)
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi (revision 231412)
+++ gcc/doc/extend.texi (working copy)
@@ -7508,7 +7508,7 @@
 inside them.  GCC has no visibility of symbols in the @code{asm} and may
 discard them as unreferenced.  It also does not know about side effects of
 the assembler code, such as modifications to memory or registers.  Unlike
-some compilers, GCC assumes that no changes to either memory or registers
+some compilers, GCC assumes that no changes to general purpose registers
 occur.  This assumption may change in a future release.
 
 To avoid complications from future changes to the semantics and the
@@ -7442,6 +7442,10 @@ all basic @code{asm} blocks use the assembler dial
 Basic @code{asm} provides no mechanism to provide different assembler strings
 for different dialects.
 
+For basic @code{asm} with non-empty assembler string GCC assumes
+the assembler block does not change any general purpose registers,
+but it may read or write any globally accessible variable.
+
 Here is an example of basic @code{asm} for i386:
 
 @example
Index: gcc/final.c
===================================================================
--- gcc/final.c (revision 231412)
+++ gcc/final.c (working copy)
@@ -2565,6 +2565,10 @@ final_scan_insn (rtx_insn *insn, FILE *file, int o
             (*debug_hooks->source_line) (last_linenum, last_filename,
                                          last_discriminator, is_stmt);
 
+        if (GET_CODE (body) == PARALLEL
+            && GET_CODE (XVECEXP (body, 0, 0)) == ASM_INPUT)
+          body = XVECEXP (body, 0, 0);
+
         if (GET_CODE (body) == ASM_INPUT)
           {
             const char *string = XSTR (body,
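As a side note for code that should not depend on the implicit behaviour proposed in this mail: the memory effect can be spelled out today with extended asm and an explicit "memory" clobber. The following is a minimal sketch, not part of the patch. It assumes a GCC-compatible compiler, and shared_flag and store_then_reload are names invented for illustration. The empty template keeps it target independent.

```cpp
#include <cassert>

// Illustrative global; not taken from the patch or its testcase.
int shared_flag = 0;

// Stores a value, issues a compiler-level memory barrier, then reloads.
// The "memory" clobber tells the compiler the asm may read or write any
// memory, so the store must be emitted before the asm and the load after it.
int store_then_reload(int value) {
    shared_flag = value;
    __asm__ __volatile__("" : : : "memory");
    return shared_flag;
}
```

A basic asm with a non-empty string would get the same treatment implicitly under the proposed change; writing the clobber explicitly avoids relying on that.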
Silence bogus mismatched profile messages
Hi,
calls to functions that will not return cause the profile to look locally
incorrect (because the sum of outgoing frequencies does not match the BB
frequency). Do not report that. This makes it easier to analyze C++
sources. On tramp3d we now have about 20 mismatches caused by loop
unrolling and about 1200 caused by jump threading.

Bootstrapped/regtested x86_64-linux.

Honza

        * cfg.c (check_bb_profile): Do not report mismatched profiles when
        only edges out of BB are EH edges.

Index: cfg.c
===================================================================
--- cfg.c (revision 237097)
+++ cfg.c (working copy)
@@ -412,20 +412,31 @@ check_bb_profile (basic_block bb, FILE *
 
   if (bb != EXIT_BLOCK_PTR_FOR_FN (fun))
     {
+      bool found = false;
       FOR_EACH_EDGE (e, ei, bb->succs)
-        sum += e->probability;
-      if (EDGE_COUNT (bb->succs) && abs (sum - REG_BR_PROB_BASE) > 100)
-        fprintf (file, "%s%sInvalid sum of outgoing probabilities %.1f%%\n",
-                 (flags & TDF_COMMENT) ? ";; " : "", s_indent,
-                 sum * 100.0 / REG_BR_PROB_BASE);
-      lsum = 0;
-      FOR_EACH_EDGE (e, ei, bb->succs)
-        lsum += e->count;
-      if (EDGE_COUNT (bb->succs)
-          && (lsum - bb->count > 100 || lsum - bb->count < -100))
-        fprintf (file, "%s%sInvalid sum of outgoing counts %i, should be %i\n",
-                 (flags & TDF_COMMENT) ? ";; " : "", s_indent,
-                 (int) lsum, (int) bb->count);
+        {
+          if (!(e->flags & EDGE_EH))
+            found = true;
+          sum += e->probability;
+        }
+      /* Only report mismatches for non-EH control flow. If there are only EH
+         edges it means that the BB ends by noreturn call.  Here the control
+         flow may just terminate.  */
+      if (found)
+        {
+          if (EDGE_COUNT (bb->succs) && abs (sum - REG_BR_PROB_BASE) > 100)
+            fprintf (file, "%s%sInvalid sum of outgoing probabilities %.1f%%\n",
+                     (flags & TDF_COMMENT) ? ";; " : "", s_indent,
+                     sum * 100.0 / REG_BR_PROB_BASE);
+          lsum = 0;
+          FOR_EACH_EDGE (e, ei, bb->succs)
+            lsum += e->count;
+          if (EDGE_COUNT (bb->succs)
+              && (lsum - bb->count > 100 || lsum - bb->count < -100))
+            fprintf (file, "%s%sInvalid sum of outgoing counts %i, should be %i\n",
+                     (flags & TDF_COMMENT) ? ";; " : "", s_indent,
+                     (int) lsum, (int) bb->count);
+        }
     }
   if (bb != ENTRY_BLOCK_PTR_FOR_FN (fun))
     {
Avoid multiple predictions out of loop predictors
Hi,
this patch makes loop predictors run smoother on exits which quit
multiple loops. Currently we place multiple predictors on each such
exit, which does not really work as expected: PRED_LOOP_ITERATIONS
predictors do not really expect the exit to be inside an inner loop
and will give a false probability.

Bootstrapped/regtested x86_64-linux, committed.

Honza

        * predict.c (predicted_by_loop_heuristics_p): New function.
        (predict_iv_comparison): Use it.
        (predict_loops): Walk from innermost loops; do not predict edges
        leaving multiple loops multiple times; implement
        PRED_LOOP_ITERATIONS_MAX heuristics.
        * predict.def (PRED_LOOP_ITERATIONS_MAX): New predictor.

        * gcc.dg/predict-9.c: Update template.

Index: predict.c
===================================================================
--- predict.c (revision 237101)
+++ predict.c (working copy)
@@ -1215,6 +1215,27 @@ expr_coherent_p (tree t1, tree t2)
   return false;
 }
 
+/* Return true if E is predicted by one of loop heuristics.  */
+
+static bool
+predicted_by_loop_heuristics_p (basic_block bb)
+{
+  struct edge_prediction *i;
+  edge_prediction **preds = bb_predictions->get (bb);
+
+  if (!preds)
+    return false;
+
+  for (i = *preds; i; i = i->ep_next)
+    if (i->ep_predictor == PRED_LOOP_ITERATIONS_GUESSED
+        || i->ep_predictor == PRED_LOOP_ITERATIONS_MAX
+        || i->ep_predictor == PRED_LOOP_ITERATIONS
+        || i->ep_predictor == PRED_LOOP_EXIT
+        || i->ep_predictor == PRED_LOOP_EXTRA_EXIT)
+      return true;
+  return false;
+}
+
 /* Predict branch probability of BB when BB contains a branch that compares
    an induction variable in LOOP with LOOP_IV_BASE_VAR to LOOP_BOUND_VAR.  The
    loop exit is compared using LOOP_BOUND_CODE, with step of LOOP_BOUND_STEP.
@@ -1243,10 +1264,7 @@ predict_iv_comparison (struct loop *loop
   edge then_edge;
   edge_iterator ei;
 
-  if (predicted_by_p (bb, PRED_LOOP_ITERATIONS_GUESSED)
-      || predicted_by_p (bb, PRED_LOOP_ITERATIONS)
-      || predicted_by_p (bb, PRED_LOOP_EXIT)
-      || predicted_by_p (bb, PRED_LOOP_EXTRA_EXIT))
+  if (predicted_by_loop_heuristics_p (bb))
     return;
 
   stmt = last_stmt (bb);
@@ -1493,10 +1512,10 @@ predict_loops (void)
 
   /* Try to predict out blocks in a loop that are not part of a
      natural loop.  */
-  FOR_EACH_LOOP (loop, 0)
+  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
     {
       basic_block bb, *bbs;
-      unsigned j, n_exits;
+      unsigned j, n_exits = 0;
       vec exits;
       struct tree_niter_desc niter_desc;
       edge ex;
@@ -1508,7 +1527,9 @@ predict_loops (void)
       gcond *stmt = NULL;
 
       exits = get_loop_exit_edges (loop);
-      n_exits = exits.length ();
+      FOR_EACH_VEC_ELT (exits, j, ex)
+        if (!(ex->flags & (EDGE_EH | EDGE_ABNORMAL_CALL | EDGE_FAKE)))
+          n_exits ++;
       if (!n_exits)
         {
           exits.release ();
@@ -1522,7 +1543,14 @@ predict_loops (void)
           int max = PARAM_VALUE (PARAM_MAX_PREDICTED_ITERATIONS);
           int probability;
           enum br_predictor predictor;
+          widest_int nit;
 
+          if (ex->flags & (EDGE_EH | EDGE_ABNORMAL_CALL | EDGE_FAKE))
+            continue;
+          /* Loop heuristics do not expect exit conditional to be inside
+             inner loop.  We predict from innermost to outermost loop.  */
+          if (predicted_by_loop_heuristics_p (ex->src))
+            continue;
           predict_extra_loop_exits (ex);
 
           if (number_of_iterations_exit (loop, ex, &niter_desc, false, false))
@@ -1543,25 +1571,34 @@ predict_loops (void)
           /* If we have just one exit and we can derive some information about
              the number of iterations of the loop from the statements inside
              the loop, use it to predict this exit.  */
-          else if (n_exits == 1)
+          else if (n_exits == 1
+                   && estimated_stmt_executions (loop, &nit))
             {
-              nitercst = estimated_stmt_executions_int (loop);
-              if (nitercst < 0)
-                continue;
-              if (nitercst > max)
+              if (wi::gtu_p (nit, max))
                 nitercst = max;
-
+              else
+                nitercst = nit.to_shwi ();
               predictor = PRED_LOOP_ITERATIONS_GUESSED;
             }
+          /* If we have likely upper bound, trust it for very small iteration
+             counts.  Such loops would otherwise get mispredicted by standard
+             LOOP_EXIT heuristics.  */
+          else if (n_exits == 1
+                   && likely_max_stmt_executions (loop, &nit)
+                   && wi::ltu_p (nit,
+                                 RDIV (REG_BR_PROB_BASE,
+                                       REG_BR_PROB_BASE
+                                         - predictor_info
+                                                 [PRED_LOOP_EXIT].hitrate)))
+            {
+              nitercst = nit.to_shwi ();
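The small-iteration-count condition in the last hunk of this patch can be illustrated with a little arithmetic. This is a sketch under stated assumptions, not the committed code: rdiv is modelled on my understanding of GCC's RDIV macro (rounded division), and the probability base of 10000 and the hit rate of 9000 used in the note after the block are hypothetical values chosen only to make the numbers easy to follow.

```cpp
#include <cassert>

// Rounded integer division, modelled on GCC's RDIV macro
// (assumed here to be ((x) + (y) / 2) / (y)).
constexpr int rdiv(int x, int y) { return (x + y / 2) / y; }

// prob_base plays the role of REG_BR_PROB_BASE and exit_hitrate the role
// of predictor_info[PRED_LOOP_EXIT].hitrate; both are parameters here so
// that no real GCC value is implied.
constexpr int small_loop_threshold(int prob_base, int exit_hitrate) {
    return rdiv(prob_base, prob_base - exit_hitrate);
}

// True when a likely upper bound on the iteration count is small enough
// that, per the patch comment, the generic loop-exit heuristic would
// otherwise mispredict the loop.
constexpr bool trust_likely_upper_bound(unsigned likely_max_iterations,
                                        int prob_base, int exit_hitrate) {
    return likely_max_iterations
           < static_cast<unsigned>(small_loop_threshold(prob_base, exit_hitrate));
}
```

With the hypothetical values above the threshold is 10, so a loop whose likely bound is 5 iterations would use the bound, while one with a bound of 10 would fall through to the later heuristics.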
[v3 PATCH] Support allocators in tuples of zero size.
All in all this is a bit inane, but the spec requires a zero-sized tuple
to provide allocator overloads for constructors, even though they do
absolutely nothing (they completely ignore the allocator). Presumably
these are also useful for some metaprogrammers I haven't heard of.
This takes us one small step further to pass libc++'s testsuite for tuple.

Tested on Linux-X64.

2016-06-05  Ville Voutilainen

        Support allocators in tuples of zero size.
        * include/std/tuple (tuple<>::tuple(),
        tuple<>::tuple(allocator_arg_t, const _Alloc&),
        tuple<>::tuple(allocator_arg_t, const _Alloc&, const tuple&),
        tuple<>::tuple(allocator_arg_t, const _Alloc&, tuple&&)): New.
        * testsuite/20_util/tuple/cons/allocators.cc: Adjust.

diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple
index 17c8204..f7805ca 100644
--- a/libstdc++-v3/include/std/tuple
+++ b/libstdc++-v3/include/std/tuple
@@ -876,6 +876,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     {
     public:
       void swap(tuple&) noexcept { /* no-op */ }
+      // We need the default since we're going to define no-op
+      // allocator constructors.
+      tuple() = default;
+      // No-op allocator constructors.
+      template<typename _Alloc>
+        tuple(allocator_arg_t __tag, const _Alloc& __a) { }
+      template<typename _Alloc>
+        tuple(allocator_arg_t __tag, const _Alloc& __a, const tuple& __in) { }
+      template<typename _Alloc>
+        tuple(allocator_arg_t __tag, const _Alloc& __a, tuple&& __in) { }
     };
 
   /// Partial specialization, 2-element tuple.
diff --git a/libstdc++-v3/testsuite/20_util/tuple/cons/allocators.cc b/libstdc++-v3/testsuite/20_util/tuple/cons/allocators.cc
index 052b79f..bc45780 100644
--- a/libstdc++-v3/testsuite/20_util/tuple/cons/allocators.cc
+++ b/libstdc++-v3/testsuite/20_util/tuple/cons/allocators.cc
@@ -162,8 +162,30 @@ void test01()
 
 }
 
+void test02()
+{
+  bool test __attribute__((unused)) = true;
+  using std::allocator_arg;
+  using std::tuple;
+  using std::make_tuple;
+
+  typedef tuple<> test_type;
+
+  MyAlloc a;
+
+  // default construction
+  test_type t1(allocator_arg, a);
+  // copy construction
+  test_type t2(allocator_arg, a, t1);
+  // move construction
+  test_type t3(allocator_arg, a, std::move(t1));
+  // make_tuple
+  test_type empty = make_tuple();
+}
+
 int main()
 {
   test01();
+  test02();
   return 0;
 }
Re: Ping! [fortran, patch, pr69659, v1] [6/7 Regression] ICE on using option -frepack-arrays, in gfc_conv_descriptor_data_get
Hi Paul,

thanks for the quick review. Committed as r237105 for trunk and r237107
for branch-6.

Thanks again and regards,
Andre

On Sun, 5 Jun 2016 17:44:19 +0200
Paul Richard Thomas wrote:

> Hi Andre,
>
> That's verging on 'obvious' and even does the job :-)
>
> OK for trunk and 6-branch.
>
> Thanks for the patch
>
> Paul
>
> On 5 June 2016 at 16:13, Andre Vehreschild wrote:
> > Hi Paul,
> >
> > now with attachment.
> >
> > - Andre
> >
> > On Sun, 5 Jun 2016 16:06:09 +0200
> > Paul Richard Thomas wrote:
> >
> >> Hi Andre,
> >>
> >> There's no attachment. Get it posted tonight and I will take a look at it.
> >>
> >> Cheers
> >>
> >> Paul
> >>
> >> On 5 June 2016 at 14:04, Andre Vehreschild wrote:
> >> > Ping!
> >> >
> >> > On Sun, 22 May 2016 18:51:22 +0200
> >> > Andre Vehreschild wrote:
> >> >
> >> >> Hi all,
> >> >>
> >> >> attached patch fixes a regression that occurred on some testcases when
> >> >> doing a validation run with -frepack-arrays. The issue here was that
> >> >> for class arrays the array descriptor is in the _data component and not
> >> >> directly at the address of the class_array. The patch fixes this issue
> >> >> for pr69659 on trunk 7 and gcc-6-branch.
> >> >>
> >> >> Ok for trunk and gcc-6?
> >> >>
> >> >> Bootstrapped and regtested on x86_64-linux-gnu.
> >> >>
> >> >> Regards,
> >> >> Andre
> >> >
> >> >
> >> > --
> >> > Andre Vehreschild * Email: vehre ad gmx dot de
> >>
> >>
> >>
> >
> >
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de
>
>
>

--
Andre Vehreschild * Email: vehre ad gmx dot de

Index: gcc/testsuite/gfortran.dg/class_array_22.f03
===================================================================
--- gcc/testsuite/gfortran.dg/class_array_22.f03 (nonexistent)
+++ gcc/testsuite/gfortran.dg/class_array_22.f03 (revision 237105)
@@ -0,0 +1,25 @@
+! { dg-do compile }
+! { dg-options "-frepack-arrays " }
+!
+! Original class_array_11.f03 but with -frepack-arrays a new
+! ICE was produced reported in
+! PR fortran/69659
+!
+! Original testcase by Ian Harvey
+! Reduced by Janus Weil
+
+  IMPLICIT NONE
+
+  TYPE :: ParentVector
+    INTEGER :: a
+  END TYPE ParentVector
+
+CONTAINS
+
+  SUBROUTINE vector_operation(pvec)
+    CLASS(ParentVector), INTENT(INOUT) :: pvec(:)
+    print *,pvec(1)%a
+  END SUBROUTINE
+
+END
+
Index: gcc/testsuite/ChangeLog
===================================================================
--- gcc/testsuite/ChangeLog (revision 237104)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2016-06-05  Andre Vehreschild
+
+        PR fortran/69659
+        * gfortran.dg/class_array_22.f03: New test.
+
 2016-06-05  Jan Hubicka
 
         * gcc.dg/tree-prof/peel-1.c: Fix testcase.
Index: gcc/fortran/trans-array.c
===================================================================
--- gcc/fortran/trans-array.c (revision 237104)
+++ gcc/fortran/trans-array.c (working copy)
@@ -6386,7 +6386,12 @@
       stmtCleanup = gfc_finish_block (&cleanup);
 
       /* Only do the cleanup if the array was repacked.  */
-      tmp = build_fold_indirect_ref_loc (input_location, dumdesc);
+      if (is_classarray)
+        /* For a class array the dummy array descriptor is in the _class
+           component.  */
+        tmp = gfc_class_data_get (dumdesc);
+      else
+        tmp = build_fold_indirect_ref_loc (input_location, dumdesc);
       tmp = gfc_conv_descriptor_data_get (tmp);
       tmp = fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
                              tmp, tmpdesc);
Index: gcc/fortran/ChangeLog
===================================================================
--- gcc/fortran/ChangeLog (revision 237104)
+++ gcc/fortran/ChangeLog (working copy)
@@ -1,3 +1,10 @@
+2016-06-05  Andre Vehreschild
+
+        PR fortran/69659
+        * trans-array.c (gfc_trans_dummy_array_bias): For class arrays use
+        the address of the _data component to reference the arrays data
+        component.
+
 2016-06-03  Chung-Lin Tang
 
         * trans-openmp.c (gfc_trans_omp_reduction_list): Add mark_addressable
Index: gcc/fortran/ChangeLog
===================================================================
--- gcc/fortran/ChangeLog (revision 237101)
+++ gcc/fortran/ChangeLog (working copy)
@@ -1,3 +1,10 @@
+2016-06-05  Andre Vehreschild
+
+        PR fortran/69659
+        * trans-array.c (gfc_trans_dummy_array_bias): For class arrays use
+        the address of the _data component to reference the arrays data
+        component.
+
 2016-06-01  Paul Thomas
 
         PR fortran/71156
Index: gcc/fortran/trans-array.c
===================================================================
--- gcc/fortran/trans-array.c (revision 237101)
+++ gcc/fortran/trans-array.c (working copy)
@@ -6376,7 +6376,12 @@
       stmtCleanup = gfc_finish_block (&cleanup);
 
       /* Only do the cleanup if the array was repacked.  */
-      tmp = build_fold_indirect_ref_loc (input_location, dumdesc);
+      if (is_classarray)
+        /* For a class array the dummy array descriptor is in the _class
+           component.  */
+        tmp = gfc_class_data_get (dumde
Re: [v3 PATCH] Support allocators in tuples of zero size.
On 5 June 2016 at 20:59, Ville Voutilainen wrote:
> All in all this is a bit inane, but the spec requires a zero-sized tuple
> to provide allocator overloads for constructors, even though they do
> absolutely nothing (they completely ignore the allocator). Presumably
> these are also useful for some metaprogrammers I haven't heard of.
> This takes us one small step further to pass libc++'s testsuite for tuple.
>
> Tested on Linux-X64.
>
> 2016-06-05  Ville Voutilainen
>
>     Support allocators in tuples of zero size.
>     * include/std/tuple (tuple<>::tuple(),
>     tuple<>::tuple(allocator_arg_t, const _Alloc&),
>     tuple<>::tuple(allocator_arg_t, const _Alloc&, const tuple&),
>     tuple<>::tuple(allocator_arg_t, const _Alloc&, tuple&&)): New.
>     * testsuite/20_util/tuple/cons/allocators.cc: Adjust.

Uh, perhaps we shouldn't add a useless move constructor.

2016-06-05  Ville Voutilainen

        Support allocators in tuples of zero size.
        * include/std/tuple (tuple<>::tuple(),
        tuple<>::tuple(allocator_arg_t, const _Alloc&),
        tuple<>::tuple(allocator_arg_t, const _Alloc&, const tuple&)): New.
        * testsuite/20_util/tuple/cons/allocators.cc: Adjust.

diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple
index 17c8204..7570883 100644
--- a/libstdc++-v3/include/std/tuple
+++ b/libstdc++-v3/include/std/tuple
@@ -876,6 +876,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     {
     public:
       void swap(tuple&) noexcept { /* no-op */ }
+      // We need the default since we're going to define no-op
+      // allocator constructors.
+      tuple() = default;
+      // No-op allocator constructors.
+      template<typename _Alloc>
+        tuple(allocator_arg_t __tag, const _Alloc& __a) { }
+      template<typename _Alloc>
+        tuple(allocator_arg_t __tag, const _Alloc& __a, const tuple& __in) { }
     };
 
   /// Partial specialization, 2-element tuple.
diff --git a/libstdc++-v3/testsuite/20_util/tuple/cons/allocators.cc b/libstdc++-v3/testsuite/20_util/tuple/cons/allocators.cc
index 052b79f..bc45780 100644
--- a/libstdc++-v3/testsuite/20_util/tuple/cons/allocators.cc
+++ b/libstdc++-v3/testsuite/20_util/tuple/cons/allocators.cc
@@ -162,8 +162,30 @@ void test01()
 
 }
 
+void test02()
+{
+  bool test __attribute__((unused)) = true;
+  using std::allocator_arg;
+  using std::tuple;
+  using std::make_tuple;
+
+  typedef tuple<> test_type;
+
+  MyAlloc a;
+
+  // default construction
+  test_type t1(allocator_arg, a);
+  // copy construction
+  test_type t2(allocator_arg, a, t1);
+  // move construction
+  test_type t3(allocator_arg, a, std::move(t1));
+  // make_tuple
+  test_type empty = make_tuple();
+}
+
 int main()
 {
   test01();
+  test02();
   return 0;
 }
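From the user's side, the effect of this patch can be sketched as follows. The example is illustrative only and is not the libstdc++ test: it uses std::allocator<int> instead of the testsuite's MyAlloc, and the function name is invented. It also shows why dropping the dedicated move constructor is harmless, since an rvalue tuple<> binds to the const tuple& overload.

```cpp
#include <cassert>
#include <memory>
#include <tuple>
#include <utility>

// Constructs empty tuples through the allocator-extended constructors.
// The allocator is ignored, as there is nothing to allocate.
bool empty_tuple_accepts_allocator() {
    std::allocator<int> alloc;
    std::tuple<> t1(std::allocator_arg, alloc);                 // "default"
    std::tuple<> t2(std::allocator_arg, alloc, t1);             // copy
    std::tuple<> t3(std::allocator_arg, alloc, std::move(t1));  // from rvalue
    return t2 == t3 && std::tuple_size<std::tuple<>>::value == 0;
}
```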
loop-ch tweak
Hi,
both loop-ch and loop-ivcanon want to throttle down the heuristics on
paths containing a call. Testing for the presence of GIMPLE_CALL is wrong
for internal calls and cheap builtins that are expanded inline.

Bootstrapped/regtested x86_64-linux, OK?

Honza

        * gimple.c: Include builtins.h.
        (gimple_real_call_p): New function.
        * gimple.h (gimple_real_call_p): Declare.
        * tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Use it.
        * tree-ssa-loop-ivcanon.c (tree_estimate_loop_size): Likewise.

Index: gimple.c
===================================================================
--- gimple.c (revision 237101)
+++ gimple.c (working copy)
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.
 #include "gimple-walk.h"
 #include "gimplify.h"
 #include "target.h"
+#include "builtins.h"
 
 
 /* All the tuples have their operand vector (if present) at the very bottom
@@ -3018,3 +3019,20 @@ maybe_remove_unused_call_args (struct fu
       update_stmt_fn (fn, stmt);
     }
 }
+
+/* Return true if STMT will likely expand to real call statment.  */
+
+bool
+gimple_real_call_p (gimple *stmt)
+{
+  if (gimple_code (stmt) != GIMPLE_CALL)
+    return false;
+  if (gimple_call_internal_p (stmt))
+    return false;
+  tree decl = gimple_call_fndecl (stmt);
+  if (decl && DECL_IS_BUILTIN (decl)
+      && (is_simple_builtin (decl)
+          || is_inexpensive_builtin (decl)))
+    return false;
+  return true;
+}
Index: gimple.h
===================================================================
--- gimple.h (revision 237101)
+++ gimple.h (working copy)
@@ -1525,6 +1525,7 @@ extern void preprocess_case_label_vec_fo
 extern void gimple_seq_set_location (gimple_seq, location_t);
 extern void gimple_seq_discard (gimple_seq);
 extern void maybe_remove_unused_call_args (struct function *, gimple *);
+extern bool gimple_real_call_p (gimple *);
 
 /* Formal (expression) temporary table handling: multiple occurrences of
    the same scalar expression are evaluated into the same temporary.  */
Index: tree-ssa-loop-ch.c
===================================================================
--- tree-ssa-loop-ch.c (revision 237101)
+++ tree-ssa-loop-ch.c (working copy)
@@ -118,7 +118,7 @@ should_duplicate_loop_header_p (basic_bl
       if (is_gimple_debug (last))
         continue;
 
-      if (is_gimple_call (last))
+      if (gimple_real_call_p (last))
         {
           if (dump_file && (dump_flags & TDF_DETAILS))
             fprintf (dump_file,
Index: tree-ssa-loop-ivcanon.c
===================================================================
--- tree-ssa-loop-ivcanon.c (revision 237101)
+++ tree-ssa-loop-ivcanon.c (working copy)
@@ -339,15 +339,11 @@ tree_estimate_loop_size (struct loop *lo
       for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
         {
           gimple *stmt = gsi_stmt (gsi);
-          if (gimple_code (stmt) == GIMPLE_CALL)
+          if (gimple_code (stmt) == GIMPLE_CALL
+              && gimple_real_call_p (stmt))
             {
               int flags = gimple_call_flags (stmt);
-              tree decl = gimple_call_fndecl (stmt);
-
-              if (decl && DECL_IS_BUILTIN (decl)
-                  && is_inexpensive_builtin (decl))
-                ;
-              else if (flags & (ECF_PURE | ECF_CONST))
+              if (flags & (ECF_PURE | ECF_CONST))
                 size->num_pure_calls_on_hot_path++;
               else
                 size->num_non_pure_calls_on_hot_path++;
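The decision made by the proposed gimple_real_call_p can be modelled outside GCC with a toy statement record. Everything in this sketch is a stand-in: ToyStmt and its boolean fields are invented here to mirror the checks in the patch (statement code, internal call, simple or inexpensive builtin) and are not GCC's data structures.

```cpp
#include <cassert>

// Toy stand-in for a GIMPLE statement; the fields mirror the questions the
// patch asks of a real statement and are purely illustrative.
struct ToyStmt {
    bool is_call;               // gimple_code (stmt) == GIMPLE_CALL
    bool is_internal_call;      // gimple_call_internal_p (stmt)
    bool is_builtin;            // decl && DECL_IS_BUILTIN (decl)
    bool is_cheap_builtin;      // is_simple_builtin || is_inexpensive_builtin
};

// Returns true when the statement is expected to end up as a real call,
// following the order of checks in the patch.
bool toy_real_call_p(const ToyStmt& stmt) {
    if (!stmt.is_call)
        return false;
    if (stmt.is_internal_call)
        return false;
    if (stmt.is_builtin && stmt.is_cheap_builtin)
        return false;
    return true;
}
```

Only the last kind of statement, an ordinary call or an expensive builtin, would then throttle the loop-header copying and unrolling heuristics.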
[committed] Fix unused argument error in expr.c
The attached change fixes a build error on hppa-unknown-linux-gnu.

Committed as obvious.

Dave
--
John David Anglin  dave.ang...@bell.net

2016-06-05  John David Anglin

        * expr.c (move_by_pieces_d::generate): Mark mode parameter with
        ATTRIBUTE_UNUSED.

Index: expr.c
===================================================================
--- expr.c (revision 237097)
+++ expr.c (working copy)
@@ -1143,7 +1143,8 @@
    gen function that should be used to generate the mode.  */
 
 void
-move_by_pieces_d::generate (rtx op0, rtx op1, machine_mode mode)
+move_by_pieces_d::generate (rtx op0, rtx op1,
+                            machine_mode mode ATTRIBUTE_UNUSED)
 {
 #ifdef PUSH_ROUNDING
   if (op0 == NULL_RTX)
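The kind of build error fixed here typically comes from a parameter that is only used under a target-specific #ifdef. A minimal sketch of the pattern, assuming a GCC-compatible compiler: EXAMPLE_ATTRIBUTE_UNUSED, EXAMPLE_USE_MODE and copy_word are names made up for this illustration, with the macro playing the role that ATTRIBUTE_UNUSED plays in GCC's own sources.

```cpp
#include <cassert>

// Stand-in for GCC's ATTRIBUTE_UNUSED; the name is invented for this sketch.
#define EXAMPLE_ATTRIBUTE_UNUSED __attribute__((unused))

// The mode parameter is only read when EXAMPLE_USE_MODE is defined, much as
// a parameter might only be read under a target macro.  Marking it unused
// keeps unused-parameter warnings (errors under -Werror) quiet on
// configurations where the macro is not defined.
int copy_word(int dest, int src, int mode EXAMPLE_ATTRIBUTE_UNUSED)
{
#ifdef EXAMPLE_USE_MODE
  return dest + src + mode;
#else
  return dest + src;
#endif
}
```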
Re: [patch, fortran] PR52393 I/O: "READ format" statement with parenthesized default-char-expr
On 06/03/2016 12:40 PM, H.J. Lu wrote:
> On Wed, Jun 1, 2016 at 9:28 AM, Jerry DeLisle wrote:
>> On 06/01/2016 12:25 AM, FX wrote:
>>>> 2016-05-30  Jerry DeLisle
>>>>
>>>>     PR fortran/52393
>>>>     * io.c (match_io): For READ, try to match a default character
>>>>     expression. If found, set the dt format expression to this,
>>>>     otherwise go back and try control list.
>>>
>>> OK. Maybe you could add some “negative” tests too? To be sure we still
>>> catch malformed parenthesized formats?
>>>
>>> FX
>>>
>>
>> Thanks for review! yes I will add some tests.
>>
>
> It caused:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71404
>

Patch committed.

Author: jvdelisle
Date: Sun Jun 5 19:49:59 2016
New Revision: 237108

URL: https://gcc.gnu.org/viewcvs?rev=237108&root=gcc&view=rev
Log:
2016-06-05  Jerry DeLisle

        PR fortran/71404
        * io.c (match_io): For READ, commit in pending symbols in the current
        statement before trying to match an expression so that if the match
        fails and we undo symbols we dont toss good symbols.

Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/io.c
[PATCH, i386]: Fix PR 71389, ICE on trunk gcc on ivybridge target (df_refs_verify)
Hello! As explained in the PR [1] Comment #4, this is a target problem with invalid RTL sharing. Invalid sharing is created by the misaligned expansion code in i386.c, when subregs are involved. vec_extract_hi_v32qi pattern is generated in loop2_invariant pass when misaligned V8SI move is generated, and later cprop3 pass propagates a register inside a subreg. The pass updates both instances of (reg:V8SI 181) to (reg:V8SI 175) in (insn 197) and (insn 198). However, since just renamed (reg 175) doesn't trigger rescan of (insn 198) in the substitution loop, we miss a rescan of (insn 198). The solution is to avoid invalid sharing by copying RTXes when subregs are created. 2016-06-06 Uros Bizjak PR target/71389 * config/i386/i386.c (ix86_avx256_split_vector_move_misalign): Copy op1 RTX to avoid invalid sharing. (ix86_expand_vector_move_misalign): Ditto. testsuite/ChangeLog: 2016-06-06 Uros Bizjak PR target/71389 * g++.dg/pr71389.C: New test. Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Committed to mainline SVN, will be backported to release branches. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71389 Uros. 
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 237110)
+++ config/i386/i386.c (working copy)
@@ -19552,7 +19552,7 @@ ix86_avx256_split_vector_move_misalign (rtx op0, r
           m = adjust_address (op0, mode, 0);
           emit_insn (extract (m, op1, const0_rtx));
           m = adjust_address (op0, mode, 16);
-          emit_insn (extract (m, op1, const1_rtx));
+          emit_insn (extract (m, copy_rtx (op1), const1_rtx));
         }
       else
         gcc_unreachable ();
@@ -19724,7 +19724,7 @@ ix86_expand_vector_move_misalign (machine_mode mod
           m = adjust_address (op0, V2SFmode, 0);
           emit_insn (gen_sse_storelps (m, op1));
           m = adjust_address (op0, V2SFmode, 8);
-          emit_insn (gen_sse_storehps (m, op1));
+          emit_insn (gen_sse_storehps (m, copy_rtx (op1)));
         }
     }
   else
Index: testsuite/g++.dg/pr71389.C
===================================================================
--- testsuite/g++.dg/pr71389.C (nonexistent)
+++ testsuite/g++.dg/pr71389.C (working copy)
@@ -0,0 +1,23 @@
+// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-options "-std=c++11 -O3 -march=ivybridge" }
+
+#include 
+
+extern int le_s6, le_s9, le_s11;
+long foo_v14[16][16];
+
+void fn1() {
+  std::array, 16> v13;
+  for (; le_s6;)
+    for (int k1 = 2; k1 < 4; k1 = k1 + 1) {
+      for (int n1 = 0; n1 < le_s9; n1 = 8) {
+        *foo_v14[6] = 20923310;
+        for (int i2 = n1; i2 < n1 + 8; i2 = i2 + 1)
+          v13.at(5).at(i2 + 6 - n1) = 306146921;
+      }
+
+      for (int l2 = 0; l2 < le_s11; l2 = l2 + 1)
+        *(l2 + v13.at(5).begin()) = 306146921;
+    }
+  v13.at(le_s6 - 4);
+}
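To illustrate why sharing one operand object between two instructions is hazardous, here is a toy model in C. It is a deliberately simplified sketch and does not use GCC's RTL or dataflow structures; the types and functions (expr, insn, copy_expr, scan_insn, substitute, second_insn_consistent) are hypothetical stand-ins. An in-place substitution through one instruction silently changes the operand of the other, whose recorded scan information then goes stale unless the operand was copied first.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an operand expression.  */
struct expr
{
  int regno;
};

/* Hypothetical stand-in for an instruction plus its scanned info.  */
struct insn
{
  struct expr *op;
  int cached_regno;   /* what the last scan of this insn recorded */
};

static struct expr *
copy_expr (const struct expr *e)
{
  struct expr *c = malloc (sizeof *c);
  assert (c != NULL);
  *c = *e;
  return c;
}

static void
scan_insn (struct insn *i)
{
  i->cached_regno = i->op->regno;
}

/* Rewrite the register in I's operand in place and rescan only I.  */
static void
substitute (struct insn *i, int new_regno)
{
  i->op->regno = new_regno;
  scan_insn (i);
}

/* Returns 1 if, after a substitution through the first insn, the
   second insn's recorded info still matches its operand.  */
static int
second_insn_consistent (int copy_operand)
{
  struct expr *op1 = malloc (sizeof *op1);
  assert (op1 != NULL);
  op1->regno = 181;

  struct insn a = { op1, 0 };
  struct insn b = { copy_operand ? copy_expr (op1) : op1, 0 };
  scan_insn (&a);
  scan_insn (&b);

  substitute (&a, 175);

  int ok = b.cached_regno == b.op->regno;
  if (b.op != op1)
    free (b.op);
  free (op1);
  return ok;
}
```

With a shared operand the second instruction's cached information no longer matches what the instruction contains; with a copy, each instruction is changed and rescanned independently.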
[patch committed FT32] use setup_incoming_varargs
The FT32 target now uses SETUP_INCOMING_VARARGS to handle varargs. This saves 24 bytes per stack frame, because the caller no longer needs to allocate space on the stack for r0-r5 in case the callee is a varargs function.

    * config/ft32/ft32.c (ft32_setup_incoming_varargs,
    ft32_expand_prologue, ft32_expand_epilogue): Handle pretend_args.
    * config/ft32/ft32.h: Remove OUTGOING_REG_PARM_STACK_SPACE.
    * config/ft32/ft32.md: Add pretend_returner.

Index: gcc/config/ft32/ft32.h
===================================================================
--- gcc/config/ft32/ft32.h (revision 237115)
+++ gcc/config/ft32/ft32.h (working copy)
@@ -256,15 +256,6 @@ enum reg_class
    be allocated.  */
 #define STARTING_FRAME_OFFSET 0
 
-/* Define this if the above stack space is to be considered part of the
-   space allocated by the caller.  */
-#define OUTGOING_REG_PARM_STACK_SPACE(FNTYPE) 1
-/* #define STACK_PARMS_IN_REG_PARM_AREA */
-
-/* Define this if it is the responsibility of the caller to allocate
-   the area reserved for arguments passed in registers.  */
-#define REG_PARM_STACK_SPACE(FNDECL) (6 * UNITS_PER_WORD)
-
 /* Offset from the argument pointer register to the first argument's
    address.  On some machines it may depend on the data type of the
    function.  */
Index: gcc/config/ft32/ft32.md
===================================================================
--- gcc/config/ft32/ft32.md (revision 237115)
+++ gcc/config/ft32/ft32.md (working copy)
@@ -929,6 +929,14 @@
   "reload_completed"
   "return")
 
+(define_insn "pretend_returner"
+  [(set (reg:SI SP_REG)
+        (plus:SI (reg:SI SP_REG)
+                 (match_operand:SI 0)))
+   (return)]
+  "reload_completed"
+  "pop.l $cc\;add.l $sp,$sp,%0\;jmpi $cc")
+
 (define_insn "returner24"
   [
    (set (reg:SI SP_REG)
Index: gcc/config/ft32/ft32.c
===================================================================
--- gcc/config/ft32/ft32.c (revision 237115)
+++ gcc/config/ft32/ft32.c (working copy)
@@ -409,7 +409,7 @@ ft32_compute_frame (void)
       cfun->machine->callee_saved_reg_size += 4;
 
   cfun->machine->size_for_adjusting_sp =
-    crtl->args.pretend_args_size
+    0 // crtl->args.pretend_args_size
     + cfun->machine->local_vars_size
     + (ACCUMULATE_OUTGOING_ARGS ?
        crtl->outgoing_args_size : 0);
 }
@@ -434,15 +434,32 @@ ft32_expand_prologue (void)
 
   ft32_compute_frame ();
 
+  int args_to_push = crtl->args.pretend_args_size;
+  if (args_to_push)
+    {
+      int i;
+
+      insn = emit_insn (gen_movsi_pop ((gen_rtx_REG (Pmode, FT32_R29))));
+
+      for (i = 0; i < (args_to_push / 4); i++)
+        {
+          insn =
+            emit_insn (gen_movsi_push ((gen_rtx_REG (Pmode, FT32_R5 - i))));
+          RTX_FRAME_RELATED_P (insn) = 1;
+        }
+
+      insn = emit_insn (gen_movsi_push ((gen_rtx_REG (Pmode, FT32_R29))));
+    }
+
   if (flag_stack_usage_info)
     current_function_static_stack_size = cfun->machine->size_for_adjusting_sp;
 
   if (!must_link () && (cfun->machine->callee_saved_reg_size == 4))
     {
       insn =
-emit_insn (gen_link
- (gen_rtx_REG (Pmode, FT32_R13),
-GEN_INT (-cfun->machine->size_for_adjusting_sp)));
+        emit_insn (gen_link
+                   (gen_rtx_REG (Pmode, FT32_R13),
+                    GEN_INT (-cfun->machine->size_for_adjusting_sp)));
       RTX_FRAME_RELATED_P (insn) = 1;
       return;
     }
@@ -450,27 +467,27 @@ ft32_expand_prologue (void)
   if (optimize_size)
     {
       for (regno = FIRST_PSEUDO_REGISTER; regno-- > 0;)
-{
- if (!fixed_regs[regno] && !call_used_regs[regno]
- && df_regs_ever_live_p (regno))
-{
- rtx preg = gen_rtx_REG (Pmode, regno);
- emit_insn (gen_call_prolog (preg));
- break;
-}
-}
+        {
+          if (!fixed_regs[regno] && !call_used_regs[regno]
+              && df_regs_ever_live_p (regno))
+            {
+              rtx preg = gen_rtx_REG (Pmode, regno);
+              emit_insn (gen_call_prolog (preg));
+              break;
+            }
+        }
     }
   else
     {
       for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-{
- if (!fixed_regs[regno] && df_regs_ever_live_p (regno)
- && !call_used_regs[regno])
-{
- insn = emit_insn (gen_movsi_push (gen_rtx_REG (Pmode, regno)));
- RTX_FRAME_RELATED_P (insn) = 1;
-}
-}
+        {
+          if (!fixed_regs[regno] && df_regs_ever_live_p (regno)
+              && !call_used_regs[regno])
+            {
+              insn = emit_insn (gen_movsi_push (gen_rtx_REG (Pmode, regno)));
+              RTX_FRAME_RELATED_P (insn) = 1;
+            }
+        }
     }
 
   if (65536 <= cfun->machine->size_for_adjusting_sp)
@@ -481,17 +498,17 @@ ft32_expand_prologue (void)
   if (must_link ())
     {
       insn =
-emit_insn (gen_
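The pretend-args part of the new prologue (pop into r29, push registers downward from r5, push r29 back) can be pictured with a small simulation. The sketch below is a toy model in C, not FT32 code: the machine structure, the word-indexed stack, the register values and all function names are hypothetical, and it assumes the word on top of the stack at function entry is the return address, as the pop into r29 in the patch suggests.

```c
#include <assert.h>

enum { STACK_WORDS = 32, NUM_ARG_REGS = 6 };

/* Hypothetical machine state; the stack grows toward lower indices.  */
struct machine
{
  int stack[STACK_WORDS];
  int sp;                  /* index of the current top-of-stack word */
  int r[NUM_ARG_REGS];     /* argument registers r0..r5 */
  int r29;                 /* scratch register */
};

static void
push (struct machine *m, int v)
{
  assert (m->sp > 0);
  m->stack[--m->sp] = v;
}

static int
pop (struct machine *m)
{
  assert (m->sp < STACK_WORDS);
  return m->stack[m->sp++];
}

/* Mirrors the shape of the loop in the patch: ARGS_TO_PUSH is a byte
   count and each register is one 4-byte word.  */
static void
prologue_push_pretend_args (struct machine *m, int args_to_push)
{
  if (!args_to_push)
    return;
  m->r29 = pop (m);                           /* save the top word */
  for (int i = 0; i < args_to_push / 4; i++)
    push (m, m->r[NUM_ARG_REGS - 1 - i]);     /* r5, r4, ... */
  push (m, m->r29);                           /* restore the top word */
}

/* Build a frame with one stack-passed argument (16) and a return
   address (999), run the prologue, and return the word OFFSET slots
   above the final stack pointer.  Registers r0..r5 hold 10..15.  */
static int
frame_word (int args_to_push, int offset)
{
  struct machine m = { { 0 }, STACK_WORDS, { 10, 11, 12, 13, 14, 15 }, 0 };

  push (&m, 16);
  push (&m, 999);
  prologue_push_pretend_args (&m, args_to_push);
  assert (m.sp + offset < STACK_WORDS);
  return m.stack[m.sp + offset];
}
```

In this model the pushed registers end up in ascending order directly below the stack-passed arguments, so a varargs callee can walk register and stack arguments as one contiguous block. Pushing all six registers takes 6 * 4 = 24 bytes, which matches the per-frame saving mentioned in the message.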
[PATCH] Fix PR71398
Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-06-06  Richard Biener

    PR tree-optimization/71398
    * tree-ssa-loop-ivcanon.c (unloop_loops): First unloop, then
    remove edges.

    * gcc.dg/torture/pr71398.c: New testcase.

Index: gcc/tree-ssa-loop-ivcanon.c
===================================================================
*** gcc/tree-ssa-loop-ivcanon.c (revision 237053)
--- gcc/tree-ssa-loop-ivcanon.c (working copy)
*************** static void
*** 615,630 ****
  unloop_loops (bitmap loop_closed_ssa_invalidated,
                bool *irred_invalidated)
  {
-   /* First remove edges in peeled copies.  */
-   unsigned i;
-   edge e;
-   FOR_EACH_VEC_ELT (edges_to_remove, i, e)
-     {
-       bool ok = remove_path (e);
-       gcc_assert (ok);
-     }
-   edges_to_remove.release ();
- 
    while (loops_to_unloop.length ())
      {
        struct loop *loop = loops_to_unloop.pop ();
--- 637,642 ----
*************** unloop_loops (bitmap loop_closed_ssa_inv
*** 660,665 ****
--- 672,687 ----
      }
    loops_to_unloop.release ();
    loops_to_unloop_nunroll.release ();
+ 
+   /* Remove edges in peeled copies.  */
+   unsigned i;
+   edge e;
+   FOR_EACH_VEC_ELT (edges_to_remove, i, e)
+     {
+       bool ok = remove_path (e);
+       gcc_assert (ok);
+     }
+   edges_to_remove.release ();
  }
  
  /* Tries to unroll LOOP completely, i.e. NITER times.
Index: gcc/testsuite/gcc.dg/torture/pr71398.c
===================================================================
*** gcc/testsuite/gcc.dg/torture/pr71398.c (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr71398.c (working copy)
***************
*** 0 ****
--- 1,17 ----
+ /* { dg-do compile } */
+ 
+ unsigned a, b, c[1];
+ void __assert_fail() __attribute__((__noreturn__));
+ void fn1()
+ {
+   int d;
+   unsigned e;
+   for (;;)
+     {
+       d = 0;
+       for (; d <= 6; d++)
+         c[d] || a ? 0 : __assert_fail();
+       for (; e <= 5; e++)
+         a = b;
+     }
+ }