[PATCH, aarch64 1/4] aarch64: Add movprfx alternatives for unpredicated patterns
* config/aarch64/aarch64.md (movprfx): New attr. (length): Default movprfx to 8. * config/aarch64/aarch64-sve.md (*mul3): Add movprfx alt. (*madd, *msubmul3_highpart): Likewise. (*3): Likewise. (*v3): Likewise. (*3): Likewise. (*3): Likewise. (*fma4, *fnma4): Likewise. (*fms4, *fnms4): Likewise. (*div4): Likewise. --- gcc/config/aarch64/aarch64-sve.md | 184 ++ gcc/config/aarch64/aarch64.md | 11 +- 2 files changed, 116 insertions(+), 79 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index 8e2433385a8..3dee6a4376d 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -937,47 +937,53 @@ ;; to gain much and would make the instruction seem less uniform to the ;; register allocator. (define_insn "*mul3" - [(set (match_operand:SVE_I 0 "register_operand" "=w, w") + [(set (match_operand:SVE_I 0 "register_operand" "=w, w, ?&w") (unspec:SVE_I - [(match_operand: 1 "register_operand" "Upl, Upl") + [(match_operand: 1 "register_operand" "Upl, Upl, Upl") (mult:SVE_I -(match_operand:SVE_I 2 "register_operand" "%0, 0") -(match_operand:SVE_I 3 "aarch64_sve_mul_operand" "vsm, w"))] +(match_operand:SVE_I 2 "register_operand" "%0, 0, w") +(match_operand:SVE_I 3 "aarch64_sve_mul_operand" "vsm, w, w"))] UNSPEC_MERGE_PTRUE))] "TARGET_SVE" "@ mul\t%0., %0., #%3 - mul\t%0., %1/m, %0., %3." + mul\t%0., %1/m, %0., %3. + movprfx\t%0, %2\;mul\t%0., %1/m, %0., %3." + [(set_attr "movprfx" "*,*,yes")] ) (define_insn "*madd" - [(set (match_operand:SVE_I 0 "register_operand" "=w, w") + [(set (match_operand:SVE_I 0 "register_operand" "=w, w, ?&w") (plus:SVE_I (unspec:SVE_I - [(match_operand: 1 "register_operand" "Upl, Upl") -(mult:SVE_I (match_operand:SVE_I 2 "register_operand" "%0, w") -(match_operand:SVE_I 3 "register_operand" "w, w"))] + [(match_operand: 1 "register_operand" "Upl, Upl, Upl") +(mult:SVE_I (match_operand:SVE_I 2 "register_operand" "%0, w, w") +(match_operand:SVE_I 3 "register_operand" "w, w, w"))] UNSPEC_MERGE_PTRUE) - (match_operand:SVE_I 4 "register_operand" "w, 0")))] + (match_operand:SVE_I 4 "register_operand" "w, 0, w")))] "TARGET_SVE" "@ mad\t%0., %1/m, %3., %4. - mla\t%0., %1/m, %2., %3." + mla\t%0., %1/m, %2., %3. + movprfx\t%0, %4\;mla\t%0., %1/m, %2., %3." + [(set_attr "movprfx" "*,*,yes")] ) (define_insn "*msub3" - [(set (match_operand:SVE_I 0 "register_operand" "=w, w") + [(set (match_operand:SVE_I 0 "register_operand" "=w, w, ?&w") (minus:SVE_I - (match_operand:SVE_I 4 "register_operand" "w, 0") + (match_operand:SVE_I 4 "register_operand" "w, 0, w") (unspec:SVE_I - [(match_operand: 1 "register_operand" "Upl, Upl") -(mult:SVE_I (match_operand:SVE_I 2 "register_operand" "%0, w") -(match_operand:SVE_I 3 "register_operand" "w, w"))] + [(match_operand: 1 "register_operand" "Upl, Upl, Upl") +(mult:SVE_I (match_operand:SVE_I 2 "register_operand" "%0, w, w") +(match_operand:SVE_I 3 "register_operand" "w, w, w"))] UNSPEC_MERGE_PTRUE)))] "TARGET_SVE" "@ msb\t%0., %1/m, %3., %4. - mls\t%0., %1/m, %2., %3." + mls\t%0., %1/m, %2., %3. + movprfx\t%0, %4\;mls\t%0., %1/m, %2., %3." + [(set_attr "movprfx" "*,*,yes")] ) ;; Unpredicated highpart multiplication. @@ -997,15 +1003,18 @@ ;; Predicated highpart multiplication. 
(define_insn "*mul3_highpart" - [(set (match_operand:SVE_I 0 "register_operand" "=w") + [(set (match_operand:SVE_I 0 "register_operand" "=w, ?&w") (unspec:SVE_I - [(match_operand: 1 "register_operand" "Upl") - (unspec:SVE_I [(match_operand:SVE_I 2 "register_operand" "%0") - (match_operand:SVE_I 3 "register_operand" "w")] + [(match_operand: 1 "register_operand" "Upl, Upl") + (unspec:SVE_I [(match_operand:SVE_I 2 "register_operand" "%0, w") + (match_operand:SVE_I 3 "register_operand" "w, w")] MUL_HIGHPART)] UNSPEC_MERGE_PTRUE))] "TARGET_SVE" - "mulh\t%0., %1/m, %0., %3." + "@ + mulh\t%0., %1/m, %0., %3. + movprfx\t%0, %2\;mulh\t%0., %1/m, %0., %3." + [(set_attr "movprfx" "*,yes")] ) ;; Unpredicated division. @@ -1025,17 +1034,19 @@ ;; Division predicated with a PTRUE. (define_insn "*3" - [(set (match_operand:SVE_SDI 0 "register_operand" "=w, w") + [(set (match_operand:SVE_SDI 0 "register_operand" "=w, w, ?&w") (unspec:SVE_SDI - [(match_operand: 1 "register_operand" "Upl, Upl") + [(match_operand: 1 "regis
[PATCH, aarch64 0/4] Add movprfx patterns and alternatives
These don't fire very often, but at least a few times within the testsuite.
Enough to test my qemu implementation of the insns.

r~

Richard Henderson (4):
  aarch64: Add movprfx alternatives for unpredicated patterns
  aarch64: Remove predicate from inside SVE_COND_FP_BINARY
  aarch64: Add movprfx alternatives for predicate patterns
  aarch64: Add movprfx patterns for zero and unmatched select

 gcc/config/aarch64/aarch64-protos.h |   1 -
 gcc/config/aarch64/aarch64.c        |  48 ---
 gcc/config/aarch64/aarch64-sve.md   | 488
 gcc/config/aarch64/aarch64.md       |  11 +-
 gcc/config/aarch64/iterators.md     |  26 +-
 gcc/config/aarch64/predicates.md    |   3 +
 6 files changed, 386 insertions(+), 191 deletions(-)

-- 
2.17.1
[PATCH, aarch64 2/4] aarch64: Remove predicate from inside SVE_COND_FP_BINARY
The predicate is present within the containing UNSPEC_SEL; there is no need to duplicate it. * config/aarch64/aarch64-sve.md (cond_): Remove match_dup 1 from the inner unspec. (*cond_): Likewise. --- gcc/config/aarch64/aarch64-sve.md | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index 3dee6a4376d..2aceef65c80 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -2677,8 +2677,7 @@ (unspec:SVE_F [(match_operand: 1 "register_operand") (unspec:SVE_F -[(match_dup 1) - (match_operand:SVE_F 2 "register_operand") +[(match_operand:SVE_F 2 "register_operand") (match_operand:SVE_F 3 "register_operand")] SVE_COND_FP_BINARY) (match_operand:SVE_F 4 "register_operand")] @@ -2694,8 +2693,7 @@ (unspec:SVE_F [(match_operand: 1 "register_operand" "Upl") (unspec:SVE_F -[(match_dup 1) - (match_operand:SVE_F 2 "register_operand" "0") +[(match_operand:SVE_F 2 "register_operand" "0") (match_operand:SVE_F 3 "register_operand" "w")] SVE_COND_FP_BINARY) (match_dup 2)] @@ -2710,8 +2708,7 @@ (unspec:SVE_F [(match_operand: 1 "register_operand" "Upl") (unspec:SVE_F -[(match_dup 1) - (match_operand:SVE_F 2 "register_operand" "w") +[(match_operand:SVE_F 2 "register_operand" "w") (match_operand:SVE_F 3 "register_operand" "0")] SVE_COND_FP_BINARY) (match_dup 3)] -- 2.17.1
[PATCH, aarch64 4/4] aarch64: Add movprfx patterns for zero and unmatched select
* config/aarch64/aarch64-protos.h, config/aarch64/aarch64.c (aarch64_sve_prepare_conditional_op): Remove. * config/aarch64/aarch64-sve.md (cond_): Allow aarch64_simd_reg_or_zero as select operand; remove the aarch64_sve_prepare_conditional_op call. (cond_): Likewise. (cond_): Likewise. (*cond__z): New pattern. (*cond__z): New pattern. (*cond__z): New pattern. (*cond__any): New pattern. (*cond__any): New pattern. (*cond__any): New pattern and a splitters to match all of the *_any patterns. * config/aarch64/predicates.md (aarch64_sve_any_binary_operator): New. --- gcc/config/aarch64/aarch64-protos.h | 1 - gcc/config/aarch64/aarch64.c| 54 -- gcc/config/aarch64/aarch64-sve.md | 154 gcc/config/aarch64/predicates.md| 3 + 4 files changed, 136 insertions(+), 76 deletions(-) diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 87c6ae20278..514ddc457ca 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -513,7 +513,6 @@ bool aarch64_gen_adjusted_ldpstp (rtx *, bool, scalar_mode, RTX_CODE); void aarch64_expand_sve_vec_cmp_int (rtx, rtx_code, rtx, rtx); bool aarch64_expand_sve_vec_cmp_float (rtx, rtx_code, rtx, rtx, bool); void aarch64_expand_sve_vcond (machine_mode, machine_mode, rtx *); -void aarch64_sve_prepare_conditional_op (rtx *, unsigned int, bool); #endif /* RTX_CODE */ void aarch64_init_builtins (void); diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 3af7e98e166..d75d45f4b8b 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -16058,60 +16058,6 @@ aarch64_expand_sve_vcond (machine_mode data_mode, machine_mode cmp_mode, emit_set_insn (ops[0], gen_rtx_UNSPEC (data_mode, vec, UNSPEC_SEL)); } -/* Prepare a cond_ operation that has the operands - given by OPERANDS, where: - - - operand 0 is the destination - - operand 1 is a predicate - - operands 2 to NOPS - 2 are the operands to an operation that is - performed for active lanes - - operand NOPS - 1 specifies the values to use for inactive lanes. - - COMMUTATIVE_P is true if operands 2 and 3 are commutative. In that case, - no pattern is provided for a tie between operands 3 and NOPS - 1. */ - -void -aarch64_sve_prepare_conditional_op (rtx *operands, unsigned int nops, - bool commutative_p) -{ - /* We can do the operation directly if the "else" value matches one - of the other inputs. */ - for (unsigned int i = 2; i < nops - 1; ++i) -if (rtx_equal_p (operands[i], operands[nops - 1])) - { - if (i == 3 && commutative_p) - std::swap (operands[2], operands[3]); - return; - } - - /* If the "else" value is different from the other operands, we have - the choice of doing a SEL on the output or a SEL on an input. - Neither choice is better in all cases, but one advantage of - selecting the input is that it can avoid a move when the output - needs to be distinct from the inputs. E.g. if operand N maps to - register N, selecting the output would give: - - MOVPRFX Z0.S, Z2.S - ADD Z0.S, P1/M, Z0.S, Z3.S - SEL Z0.S, P1, Z0.S, Z4.S - - whereas selecting the input avoids the MOVPRFX: - - SEL Z0.S, P1, Z2.S, Z4.S - ADD Z0.S, P1/M, Z0.S, Z3.S. - - ??? 
Matching the other input can produce - - MOVPRFX Z4.S, P1/M, Z2.S - ADD Z4.S, P1/M, Z4.S, Z3.S - */ - machine_mode mode = GET_MODE (operands[0]); - rtx temp = gen_reg_rtx (mode); - rtvec vec = gen_rtvec (3, operands[1], operands[2], operands[nops - 1]); - emit_set_insn (temp, gen_rtx_UNSPEC (mode, vec, UNSPEC_SEL)); - operands[2] = operands[nops - 1] = temp; -} - /* Implement TARGET_MODES_TIEABLE_P. In principle we should always return true. However due to issues with register allocation it is preferable to avoid tieing integer scalar and FP scalar modes. Executing integer diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index db16affc093..b16d0455159 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -1817,13 +1817,10 @@ (SVE_INT_BINARY:SVE_I (match_operand:SVE_I 2 "register_operand") (match_operand:SVE_I 3 "register_operand")) - (match_operand:SVE_I 4 "register_operand")] + (match_operand:SVE_I 4 "aarch64_simd_reg_or_zero")] UNSPEC_SEL))] "TARGET_SVE" -{ - bool commutative_p = (GET_RTX_CLASS () == RTX_COMM_ARITH); - aarch64_sve_prepare_conditional_op (operands, 5, commutative_p); -}) +) (define_expand "cond_" [(set (match_operand:SVE_SDI 0 "register_operand") @@ -1832,19 +1829,12 @@ (SVE_INT_BINARY_SD:SVE_SDI (match_operand:SVE_SDI 2 "register
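To make the new zero-select handling concrete, here is a hedged,
source-level illustration (my own example and function name, not from the
patch): a conditional operation whose inactive lanes must be zero no longer
needs the SEL-based preparation that aarch64_sve_prepare_conditional_op
used to do, because the expander now accepts the zero operand directly and
the new _z patterns can use a zeroing MOVPRFX.

/* Illustrative only: a conditional add whose "else" value is zero.
   With this patch the cond_* expander takes the zero operand as-is,
   and the matching _z pattern can emit a sequence along the lines of
       movprfx  z0.s, p0/z, z1.s
       add      z0.s, p0/m, z0.s, z2.s
   rather than first materialising a temporary with SEL.  */
void
cond_add_zero (int *__restrict a, int *__restrict b, int *__restrict c, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = b[i] > 0 ? b[i] + c[i] : 0;
}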
[PATCH, aarch64 3/4] aarch64: Add movprfx alternatives for predicate patterns
* config/aarch64/iterators.md (SVE_INT_BINARY_REV): Remove. (SVE_COND_FP_BINARY_REV): Remove. (sve_int_op_rev, sve_fp_op_rev): New. * config/aarch64/aarch64-sve.md (*cond__0): New. (*cond__0): New. (*cond__0): New. (*cond__2): Rename, add movprfx alternative. (*cond__2): Similarly. (*cond__2): Similarly. (*cond__3): Similarly; use sve_int_op_rev. (*cond__3): Similarly. (*cond__3): Similarly; use sve_fp_op_rev. --- gcc/config/aarch64/aarch64.c | 8 +- gcc/config/aarch64/aarch64-sve.md | 163 ++ gcc/config/aarch64/iterators.md | 26 - 3 files changed, 149 insertions(+), 48 deletions(-) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index b88e7cac27a..3af7e98e166 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -16098,7 +16098,13 @@ aarch64_sve_prepare_conditional_op (rtx *operands, unsigned int nops, whereas selecting the input avoids the MOVPRFX: SEL Z0.S, P1, Z2.S, Z4.S - ADD Z0.S, P1/M, Z0.S, Z3.S. */ + ADD Z0.S, P1/M, Z0.S, Z3.S. + + ??? Matching the other input can produce + + MOVPRFX Z4.S, P1/M, Z2.S + ADD Z4.S, P1/M, Z4.S, Z3.S + */ machine_mode mode = GET_MODE (operands[0]); rtx temp = gen_reg_rtx (mode); rtvec vec = gen_rtvec (3, operands[1], operands[2], operands[nops - 1]); diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index 2aceef65c80..db16affc093 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -1841,57 +1841,108 @@ }) ;; Predicated integer operations. -(define_insn "*cond_" - [(set (match_operand:SVE_I 0 "register_operand" "=w") +;; All other things being equal, prefer the patterns for which the +;; destination matches the select input, as that gives us the most +;; freedom to swap the other operands. + +(define_insn "*cond__0" + [(set (match_operand:SVE_I 0 "register_operand" "+w, w, ?&w") (unspec:SVE_I - [(match_operand: 1 "register_operand" "Upl") + [(match_operand: 1 "register_operand" "Upl, Upl, Upl") (SVE_INT_BINARY:SVE_I -(match_operand:SVE_I 2 "register_operand" "0") -(match_operand:SVE_I 3 "register_operand" "w")) - (match_dup 2)] +(match_operand:SVE_I 2 "register_operand" "0, w, w") +(match_operand:SVE_I 3 "register_operand" "w, 0, w")) + (match_dup 0)] UNSPEC_SEL))] "TARGET_SVE" - "\t%0., %1/m, %0., %3." + "@ + \t%0., %1/m, %0., %3. + \t%0., %1/m, %0., %2. + movprfx\t%0, %1/m, %2\;\t%0., %1/m, %0., %3." + [(set_attr "movprfx" "*,*,yes")] ) -(define_insn "*cond_" - [(set (match_operand:SVE_SDI 0 "register_operand" "=w") +(define_insn "*cond__0" + [(set (match_operand:SVE_SDI 0 "register_operand" "+w, w, ?&w") (unspec:SVE_SDI - [(match_operand: 1 "register_operand" "Upl") + [(match_operand: 1 "register_operand" "Upl, Upl, Upl") (SVE_INT_BINARY_SD:SVE_SDI -(match_operand:SVE_SDI 2 "register_operand" "0") -(match_operand:SVE_SDI 3 "register_operand" "w")) - (match_dup 2)] +(match_operand:SVE_SDI 2 "register_operand" "0, w, w") +(match_operand:SVE_SDI 3 "register_operand" "w, 0, w")) + (match_dup 0)] UNSPEC_SEL))] "TARGET_SVE" - "\t%0., %1/m, %0., %3." + "@ + \t%0., %1/m, %0., %3. + \t%0., %1/m, %0., %2. + movprfx\t%0, %1/m, %2\;\t%0., %1/m, %0., %3." + [(set_attr "movprfx" "*,*,yes")] ) -;; Predicated integer operations with the operands reversed. -(define_insn "*cond_" - [(set (match_operand:SVE_I 0 "register_operand" "=w") +;; Predicated integer operations with select matching the first operand. 
+(define_insn "*cond__2" + [(set (match_operand:SVE_I 0 "register_operand" "=w, ?&w") (unspec:SVE_I - [(match_operand: 1 "register_operand" "Upl") - (SVE_INT_BINARY_REV:SVE_I -(match_operand:SVE_I 2 "register_operand" "w") -(match_operand:SVE_I 3 "register_operand" "0")) - (match_dup 3)] + [(match_operand: 1 "register_operand" "Upl, Upl") + (SVE_INT_BINARY:SVE_I +(match_operand:SVE_I 2 "register_operand" "0, w") +(match_operand:SVE_I 3 "register_operand" "w, w")) + (match_dup 2)] UNSPEC_SEL))] "TARGET_SVE" - "r\t%0., %1/m, %0., %2." + "@ + \t%0., %1/m, %0., %3. + movprfx\t%0, %2\;\t%0., %1/m, %0., %3." + [(set_attr "movprfx" "*,yes")] ) -(define_insn "*cond_" - [(set (match_operand:SVE_SDI 0 "register_operand" "=w") +(define_insn "*cond__2" + [(set (match_operand:SVE_SDI 0 "register_operand" "=w, ?&w") (unspec:SVE_SDI - [(match_operand: 1 "register_operand" "Upl") + [(match_operand: 1 "register_operand" "Upl, Upl") (SVE_INT_BINARY_SD:SVE_SDI -
[committed] Reinstate dump_generic_expr_loc
In r262149 ("Introduce dump_location_t", aka c309657f69df19eaa590b6650acf4d3bea9ac9e6), I removed dump_generic_expr_loc from dumpfile.h/c as it was unused in the source tree. It looks like this was overzealous of me, as Richard wants to use it in "Re: [14/n] PR85694: Rework overwidening detection" https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01872.html (which also reinstates the function, porting it to dump_location_t). Sorry about the breakage. This is a minimal patch to reinstate it, as I have followup patches to dumpfile.c which need to touch it (and the other dump_* functions), towards being able to save optimization records to a file. I believe it's a subset of Richard's patch above. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Committed to trunk as r262295 under the "obvious" rule. gcc/ChangeLog: * dumpfile.c (dump_generic_expr_loc): Undo removal of this function in r262149, changing "loc" param from source_location to const dump_location_t &. * dumpfile.h (dump_generic_expr_loc): Undo removal of this declaration, as above. --- gcc/dumpfile.c | 22 ++ gcc/dumpfile.h | 2 ++ 2 files changed, 24 insertions(+) diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c index 93bc651..5f69f9b 100644 --- a/gcc/dumpfile.c +++ b/gcc/dumpfile.c @@ -506,6 +506,28 @@ dump_generic_expr (dump_flags_t dump_kind, dump_flags_t extra_dump_flags, print_generic_expr (alt_dump_file, t, dump_flags | extra_dump_flags); } + +/* Similar to dump_generic_expr, except additionally print the source + location. */ + +void +dump_generic_expr_loc (dump_flags_t dump_kind, const dump_location_t &loc, + dump_flags_t extra_dump_flags, tree t) +{ + location_t srcloc = loc.get_location_t (); + if (dump_file && (dump_kind & pflags)) +{ + dump_loc (dump_kind, dump_file, srcloc); + print_generic_expr (dump_file, t, dump_flags | extra_dump_flags); +} + + if (alt_dump_file && (dump_kind & alt_flags)) +{ + dump_loc (dump_kind, alt_dump_file, srcloc); + print_generic_expr (alt_dump_file, t, dump_flags | extra_dump_flags); +} +} + /* Output a formatted message using FORMAT on appropriate dump streams. */ void diff --git a/gcc/dumpfile.h b/gcc/dumpfile.h index 9828a3f..0e588a6 100644 --- a/gcc/dumpfile.h +++ b/gcc/dumpfile.h @@ -425,6 +425,8 @@ extern void dump_printf_loc (dump_flags_t, const dump_location_t &, const char *, ...) ATTRIBUTE_PRINTF_3; extern void dump_function (int phase, tree fn); extern void dump_basic_block (dump_flags_t, basic_block, int); +extern void dump_generic_expr_loc (dump_flags_t, const dump_location_t &, + dump_flags_t, tree); extern void dump_generic_expr (dump_flags_t, dump_flags_t, tree); extern void dump_gimple_stmt_loc (dump_flags_t, const dump_location_t &, dump_flags_t, gimple *, int); -- 1.8.5.3
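For reference, a hypothetical call site for the reinstated function (not
part of the patch, and using only the signature shown above together with
the usual MSG_NOTE/TDF_SLIM dump flags) might look like:

/* Sketch of a caller in some pass, assuming `loc' is a dump_location_t
   for the statement being reported and `t' is the tree of interest;
   this prints `t' to the active dump streams prefixed with the source
   location, analogously to dump_printf_loc.  */
if (dump_enabled_p ())
  {
    dump_generic_expr_loc (MSG_NOTE, loc, TDF_SLIM, t);
    dump_printf (MSG_NOTE, "\n");
  }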
abstract ABS_EXPR code for ranges into separate function
Boy those extract_range_from_*_expr functions are huge. OK to move the ABS_EXPR code into its own function? Tested on x86-64 Linux. Aldy commit 1e0dd52b909722e9387a34ef546fc308c68dac23 Author: Aldy Hernandez Date: Fri Jun 29 20:12:36 2018 +0200 * tree-vrp.c (extract_range_from_unary_expr): Abstract ABS_EXPR code... (extract_range_from_abs_expr): ...here. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index d1bb6f54d4a..ddbbdd1253e 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2018-06-29 Aldy Hernandez + + * tree-vrp.c (extract_range_from_unary_expr): Abstract ABS_EXPR + code... + (extract_range_from_abs_expr): ...here. + 2018-06-29 Aldy Hernandez * tree-vrp.c (extract_range_from_binary_expr_1): Abstract a lot of the diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index ee112bb1826..c966334acbc 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -2365,6 +2365,85 @@ extract_range_from_binary_expr_1 (value_range *vr, set_value_range (vr, type, min, max, NULL); } +/* Calculates the absolute value of a range and puts the result in VR. + VR0 is the input range. TYPE is the type of the resulting + range. */ + +static void +extract_range_from_abs_expr (value_range &vr, tree type, value_range &vr0) +{ + /* Pass through vr0 in the easy cases. */ + if (TYPE_UNSIGNED (type) + || value_range_nonnegative_p (&vr0)) +{ + copy_value_range (&vr, &vr0); + return; +} + + /* For the remaining varying or symbolic ranges we can't do anything + useful. */ + if (vr0.type == VR_VARYING + || symbolic_range_p (&vr0)) +{ + set_value_range_to_varying (&vr); + return; +} + + /* -TYPE_MIN_VALUE = TYPE_MIN_VALUE with flag_wrapv so we can't get a + useful range. */ + if (!TYPE_OVERFLOW_UNDEFINED (type) + && ((vr0.type == VR_RANGE + && vrp_val_is_min (vr0.min)) + || (vr0.type == VR_ANTI_RANGE + && !vrp_val_is_min (vr0.min +{ + set_value_range_to_varying (&vr); + return; +} + + /* ABS_EXPR may flip the range around, if the original range + included negative values. */ + tree min, max; + if (!vrp_val_is_min (vr0.min)) +min = fold_unary_to_constant (ABS_EXPR, type, vr0.min); + else +min = TYPE_MAX_VALUE (type); + + if (!vrp_val_is_min (vr0.max)) +max = fold_unary_to_constant (ABS_EXPR, type, vr0.max); + else +max = TYPE_MAX_VALUE (type); + + int cmp = compare_values (min, max); + gcc_assert (vr0.type != VR_ANTI_RANGE); + + /* If the range contains zero then we know that the minimum value in the + range will be zero. */ + if (range_includes_zero_p (vr0.min, vr0.max) == 1) +{ + if (cmp == 1) + max = min; + min = build_int_cst (type, 0); +} + else +{ + /* If the range was reversed, swap MIN and MAX. */ + if (cmp == 1) + std::swap (min, max); +} + + cmp = compare_values (min, max); + if (cmp == -2 || cmp == 1) +{ + /* If the new range has its limits swapped around (MIN > MAX), + then the operation caused one of them to wrap around, mark + the new range VARYING. */ + set_value_range_to_varying (&vr); +} + else +set_value_range (&vr, vr0.type, min, max, NULL); +} + /* Extract range information from a unary operation CODE based on the range of its operand *VR0 with type OP0_TYPE with resulting type TYPE. The resulting range is stored in *VR. */ @@ -2495,117 +2574,7 @@ extract_range_from_unary_expr (value_range *vr, return; } else if (code == ABS_EXPR) -{ - tree min, max; - int cmp; - - /* Pass through vr0 in the easy cases. */ - if (TYPE_UNSIGNED (type) - || value_range_nonnegative_p (&vr0)) - { - copy_value_range (vr, &vr0); - return; - } - - /* For the remaining varying or symbolic ranges we can't do anything - useful. 
*/ - if (vr0.type == VR_VARYING - || symbolic_range_p (&vr0)) - { - set_value_range_to_varying (vr); - return; - } - - /* -TYPE_MIN_VALUE = TYPE_MIN_VALUE with flag_wrapv so we can't get a - useful range. */ - if (!TYPE_OVERFLOW_UNDEFINED (type) - && ((vr0.type == VR_RANGE - && vrp_val_is_min (vr0.min)) - || (vr0.type == VR_ANTI_RANGE - && !vrp_val_is_min (vr0.min - { - set_value_range_to_varying (vr); - return; - } - - /* ABS_EXPR may flip the range around, if the original range - included negative values. */ - if (!vrp_val_is_min (vr0.min)) - min = fold_unary_to_constant (code, type, vr0.min); - else - min = TYPE_MAX_VALUE (type); - - if (!vrp_val_is_min (vr0.max)) - max = fold_unary_to_constant (code, type, vr0.max); - else - max = TYPE_MAX_VALUE (type); - - cmp = compare_values (min, max); - - /* If a VR_ANTI_RANGEs contains zero, then we have - ~[-INF, min(MIN, MAX)]. */ - if (vr0.type == VR_ANTI_RANGE) - { - if (range_includes_zero_p (vr0.min, vr0.max) == 1) - { - /* Take the l
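To make the "flip the range around" behaviour concrete, here is a tiny
standalone sketch of the same arithmetic on plain integers rather than
GCC's value_range API (my own code, ignoring the anti-range and
overflow-wrapping cases the real function handles):

#include <stdlib.h>

struct irange { long min, max; };

/* abs of [min, max]: if the range straddles zero the result starts at
   zero, otherwise both bounds are folded through labs and reordered.
   E.g. [-5, 3] -> [0, 5] and [-7, -2] -> [2, 7].  */
static struct irange
abs_irange (struct irange r)
{
  long lo = labs (r.min), hi = labs (r.max);
  struct irange res;
  if (r.min <= 0 && 0 <= r.max)
    {
      res.min = 0;
      res.max = lo > hi ? lo : hi;
    }
  else
    {
      res.min = lo < hi ? lo : hi;
      res.max = lo > hi ? lo : hi;
    }
  return res;
}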
Re: [patch] jump threading multiple paths that start from the same BB
On 06/29/2018 02:50 PM, Jeff Law wrote:

[ Returning to another old patch... ]

On 11/07/2017 10:33 AM, Aldy Hernandez wrote:

[One more time, but without rejected HTML mail, because apparently this is
my first post to gcc-patches *ever* ;-)].

Howdy!

While poking around in the backwards threader I noticed that we bail if we
have already seen a starting BB.

       /* Do not jump-thread twice from the same block. */
       if (bitmap_bit_p (threaded_blocks, entry->src->index)

This limitation discards paths that are sub-paths of paths that have
already been threaded.

The following patch scans the remaining to-be-threaded paths to identify
if any of them start from the same point, and are thus sub-paths of the
just-threaded path.  By removing the common prefix of blocks in upcoming
threadable paths, and then rewiring the first non-common block
appropriately, we expose new threading opportunities, since we are no
longer starting from the same BB.  We also simplify the would-be threaded
paths, because we don't duplicate already duplicated paths.

This sounds confusing, but the documentation for the entry point to my
patch (adjust_paths_after_duplication) shows an actual example:

+/* After an FSM path has been jump threaded, adjust the remaining FSM
+   paths that are subsets of this path, so these paths can be safely
+   threaded within the context of the new threaded path.
+
+   For example, suppose we have just threaded:
+
+   5 -> 6 -> 7 -> 8 -> 12   =>   5 -> 6' -> 7' -> 8' -> 12'
+
+   And we have an upcoming threading candidate:
+   5 -> 6 -> 7 -> 8 -> 15 -> 20
+
+   This function adjusts the upcoming path into:
+   8' -> 15 -> 20

Notice that we will no longer have two paths that start from the same BB.
One will start with bb5, while the adjusted path will start with bb8'.

With this we kill two birds-- we are able to thread more paths, and these
paths will avoid duplicating a whole mess of things that have already been
threaded.

The attached patch is a subset of some upcoming work that can live on its
own.  It bootstraps and regtests.  Also, by running it on a handful of .ii
files, I can see that we are able to thread sub-paths that we previously
dropped on the floor.  More is better, right? :)

To test this, I stole Jeff's method of using cachegrind to benchmark
instruction counts and conditional branches
(https://gcc.gnu.org/ml/gcc-patches/2013-11/msg02434.html).  Basically, I
bootstrapped two compilers, with and without improvements, and used each
to build a stage1 trunk.  Each of these stage1-trunks were used on 246 .ii
GCC files I have lying around from a bootstrap at some random point last
year.  I used no special flags on builds apart from
--enable-languages=c,c++.

Although I would've wished for a larger improvement, this work comes for
free, as it's just a subset of other work I'm hacking on.  Without further
ado, here are my monumental, earth shattering improvements:

Conditional branches
  Without patch: 411846839709
  With patch:    411831330316
  %changed:      -0.0037660%

Number of instructions
  Without patch: 2271844650634
  With patch:    2271761153578
  %changed:      -0.0036754%

OK for trunk?
Aldy

p.s. There is a lot of debugging/dumping code in my patch, which I can
gladly remove if/when approved.  It helps keep my head straight while
looking at this spaghetti :).

curr.patch

gcc/

	* tree-ssa-threadupdate.c (mark_threaded_blocks): Avoid
	dereferencing path[] beyond its length.
	(debug_path): New.
	(debug_all_paths): New.
	(rewire_first_differing_edge): New.
	(adjust_paths_after_duplication): New.
	(duplicate_thread_path): Call adjust_paths_after_duplication.
	Add new argument.
	(thread_through_all_blocks): Add new argument to
	duplicate_thread_path.

This is fine for the trunk.  I'd keep the dumping code as-is.  It'll be
useful in the future :-)

Retested against current trunk and committed.

Thanks.
Aldy
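Background for readers new to this pass: the backwards (FSM) threader that
the patch extends typically fires on state-machine loops like the
following purely illustrative example (not from the patch), where the
value stored into `state' in one iteration determines the switch target of
the next, so duplicated paths through the switch -- and, with this change,
their common-prefix sub-paths as well -- become threadable:

/* Illustrative FSM loop: each case fixes the next iteration's switch
   target, which is the shape the backwards threader duplicates paths
   for.  */
int
run (const unsigned char *in, int n)
{
  int state = 0, count = 0;
  for (int i = 0; i < n; i++)
    switch (state)
      {
      case 0: state = in[i] ? 1 : 0; break;
      case 1: state = in[i] ? 2 : 0; count++; break;
      default: state = 0; count += 2; break;
      }
  return count;
}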
[testsuite, committed] Fix get-absolute-line error handling
Hi, in factoring out get-absolute-line, I made a typo in a variable name used in an error message, which causes a tcl ERROR when the error condition is triggered. Fixed by this patch. This tcl ERROR didn't show up unless the error message was triggered by an error condition, which normally doesn't happen, so I've added a test in gcc.dg-selftests for that. Committed as obvious. Thanks, - Tom [testsuite] Fix get-absolute-line error handling 2018-07-01 Tom de Vries * gcc.dg-selftests/dg-final.exp (verify_call_1): Factor out of ... (verify_call): ... here. Move to toplevel. (verify_call_np, dg_final_directive_check_utils): New proc. (toplevel): Call dg_final_directive_check_utils. * lib/gcc-dg.exp (get-absolute-line): Fix typo in variable reference. --- gcc/testsuite/gcc.dg-selftests/dg-final.exp | 66 - gcc/testsuite/lib/gcc-dg.exp| 2 +- 2 files changed, 47 insertions(+), 21 deletions(-) diff --git a/gcc/testsuite/gcc.dg-selftests/dg-final.exp b/gcc/testsuite/gcc.dg-selftests/dg-final.exp index 79ffeb3b14f..1d98666e137 100644 --- a/gcc/testsuite/gcc.dg-selftests/dg-final.exp +++ b/gcc/testsuite/gcc.dg-selftests/dg-final.exp @@ -25,29 +25,47 @@ load_lib "scanasm.exp" load_lib "scanwpaipa.exp" load_lib "scanltranstree.exp" load_lib "scanoffloadtree.exp" +load_lib "gcc-dg.exp" -proc dg_final_directive_check_num_args {} { -proc verify_call { args } { - set call_name [lindex $args 0] - set call_args [lindex $args 1] - set expected_error [lindex $args 2] - - set errMsg "" - catch { - eval $call_name $call_args - } errMsg - - if { "$errMsg" != "$call_name: $expected_error" } { - send_log "For call $call_name $call_args\n" - send_log "expected: $call_name: $expected_error\n" - send_log "but got: $errMsg\n" - fail "$call_name: $expected_error" - return - } else { - pass "$call_name: $expected_error" - } +proc verify_call_1 { args } { +set call_name [lindex $args 0] +set call_args [lindex $args 1] +set expected_error [lindex $args 2] +set testid [lindex $args 3] + +set errMsg "" +catch { + eval $call_name $call_args +} errMsg + +if { "$errMsg" != "$expected_error" } { + send_log "For call $call_name $call_args\n" + send_log "expected: $expected_error\n" + send_log "but got: $errMsg\n" + fail "$testid" + return +} else { + pass "$testid" } +} + +proc verify_call { args } { +set call_name [lindex $args 0] +set call_args [lindex $args 1] +set expected_error [lindex $args 2] +verify_call_1 $call_name $call_args "$call_name: $expected_error" \ + "$call_name: $expected_error" +} + +proc verify_call_np { args } { +set call_name [lindex $args 0] +set call_args [lindex $args 1] +set expected_error [lindex $args 2] +verify_call_1 $call_name $call_args "$expected_error" \ + "$call_name: $expected_error" +} +proc dg_final_directive_check_num_args {} { proc verify_args { args } { set proc_name [lindex $args 0] set min [lindex $args 1] @@ -98,9 +116,17 @@ proc dg_final_directive_check_num_args {} { unset testname_with_flags } +proc dg_final_directive_check_utils {} { +verify_call_np get-absolute-line [list "" bla] \ + "dg-line var bla used, but not defined" +verify_call_np get-absolute-line [list 1 bla] \ + "dg-line var bla used at line 1, but not defined" +} + if ![gcc_parallel_test_run_p dg-final] { return } gcc_parallel_test_enable 0 dg_final_directive_check_num_args +dg_final_directive_check_utils gcc_parallel_test_enable 1 diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp index 6f88ce2213e..4f1796348c3 100644 --- a/gcc/testsuite/lib/gcc-dg.exp +++ b/gcc/testsuite/lib/gcc-dg.exp @@ -1085,7 
+1085,7 @@ proc get-absolute-line { useline line } { eval set var_defined [info exists $varname] if { ! $var_defined } { if { "$useline" != "" } { - error "dg-line var $org_varname used at line $uselinenr, but not defined" + error "dg-line var $org_varname used at line $useline, but not defined" } else { error "dg-line var $org_varname used, but not defined" }
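For context, get-absolute-line is what resolves the line argument of
directives such as gdb-test, so the "used, but not defined" error fires
when a testcase references a dg-line variable that was never defined.  A
purely illustrative testcase fragment (the variable name `label' and the
function are mine, not from the patch):

/* Illustrative only: `label' is defined by dg-line and later used in
   place of an absolute line number; misspelling it in the gdb-test
   directive would trigger the error message whose variable reference
   this patch fixes.  */
int
f1 (int i)
{
  int x = i + 1;	/* { dg-line label } */
  return x;		/* { dg-final { gdb-test label "i" "5" } } */
}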
[testsuite/guality, committed] Use relative line numbers in vla-1.c
Hi, this patch replaces absolute with relative line numbers in guality/vla-1.c. The test-case has a line: ... $ cat -n gcc/testsuite/gcc.dg/guality/vla-1.c 17return a[0]; /* { dg-final { gdb-test 17 "sizeof (a)" "6" } } */ ... which corresponds to relative line number '.'. This was not supported in get-absolute-line, so this patch adds support for that as well. Committed as obvious. Thanks, - Tom [testsuite/guality] Use relative line numbers in vla-1.c 2018-07-01 Tom de Vries * lib/gcc-dg.exp (get-absolute-line): Handle '.'. * gcc.dg/guality/vla-1.c: Use relative line numbers. --- gcc/testsuite/gcc.dg/guality/vla-1.c | 8 gcc/testsuite/lib/gcc-dg.exp | 4 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/gcc/testsuite/gcc.dg/guality/vla-1.c b/gcc/testsuite/gcc.dg/guality/vla-1.c index 651465428ac..264b9f3f92b 100644 --- a/gcc/testsuite/gcc.dg/guality/vla-1.c +++ b/gcc/testsuite/gcc.dg/guality/vla-1.c @@ -13,15 +13,15 @@ int __attribute__((noinline)) f1 (int i) { char a[i + 1]; - a[0] = 5;/* { dg-final { gdb-test 17 "i" "5" } } */ - return a[0]; /* { dg-final { gdb-test 17 "sizeof (a)" "6" } } */ + a[0] = 5;/* { dg-final { gdb-test .+1 "i" "5" } } */ + return a[0]; /* { dg-final { gdb-test . "sizeof (a)" "6" } } */ } int __attribute__((noinline)) f2 (int i) { - short a[i * 2 + 7]; /* { dg-final { gdb-test 24 "i" "5" } } */ - bar (a); /* { dg-final { gdb-test 24 "sizeof (a)" "17 * sizeof (short)" } } */ + short a[i * 2 + 7]; /* { dg-final { gdb-test .+1 "i" "5" } } */ + bar (a); /* { dg-final { gdb-test . "sizeof (a)" "17 * sizeof (short)" } } */ return a[i + 4]; } diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp index 4f1796348c3..9e0b3f4ef95 100644 --- a/gcc/testsuite/lib/gcc-dg.exp +++ b/gcc/testsuite/lib/gcc-dg.exp @@ -1066,6 +1066,10 @@ proc dg-line { linenr varname } { # Argument 1 is the relative line number or line number variable reference # proc get-absolute-line { useline line } { +if { "$line" == "." } { + return $useline +} + if { [regsub "^\.\[+-\](\[0-9\]+)$" $line "\\1" num] && $useline != "" } { # Handle relative line specification, .+1 or .-1 etc. set num [expr $useline [string index $line 1] $num]
[testsuite/guality, committed] Prevent optimization of local in vla-1.c
Hi, Atm vla-1.c has the following failures: ... FAIL: gcc.dg/guality/vla-1.c -O1 -DPREVENT_OPTIMIZATION line 17 sizeof (a) == 6 FAIL: gcc.dg/guality/vla-1.c -O2 -DPREVENT_OPTIMIZATION line 17 sizeof (a) == 6 FAIL: gcc.dg/guality/vla-1.c -O3 -g -DPREVENT_OPTIMIZATION line 17 sizeof (a) == 6 FAIL: gcc.dg/guality/vla-1.c -Os -DPREVENT_OPTIMIZATION line 17 sizeof (a) == 6 FAIL: gcc.dg/guality/vla-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none -DPREVENT_OPTIMIZATION line 17 sizeof (a) == 6 FAIL: gcc.dg/guality/vla-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none -DPREVENT_OPTIMIZATION line 24 sizeof (a) == 17 * sizeof (short) FAIL: gcc.dg/guality/vla-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects -DPREVENT_OPTIMIZATION line 17 sizeof (a) == 6 FAIL: gcc.dg/guality/vla-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects -DPREVENT_OPTIMIZATION line 24 sizeof (a) == 17 * sizeof (short) ... For the non-lto failures, the relevant bit is: ... int __attribute__((noinline)) f1 (int i) { char a[i + 1]; a[0] = 5; /* { dg-final { gdb-test .+1 "i" "5" } } */ return a[0]; /* { dg-final { gdb-test . "sizeof (a)" "6" } } */ } ... which at -O1 already ends up as: ... # vla-1.c:14:1 .loc 1 14 1 is_stmt 1 .cfi_startproc # DEBUG i => di # vla-1.c:15:3 .loc 1 15 3 # DEBUG => sxn(di+0x1)-0x1 # vla-1.c:16:3 .loc 1 16 3 # vla-1.c:17:3 .loc 1 17 3 # vla-1.c:18:1 .loc 1 18 1 is_stmt 0 movl$5, %eax ret ... So, the local vla a is optimized away. This patch adds VOLATILE to 'a', which prevents it from being optimized away, and fixes the non-lto failures. Committed as obvious. Thanks, - Tom [testsuite/guality] Prevent optimization of local in vla-1.c 2018-07-01 Tom de Vries * gcc.dg/guality/prevent-optimization.h (VOLATILE): Define. * gcc.dg/guality/vla-1.c (f1): Mark local vla a as VOLATILE. --- gcc/testsuite/gcc.dg/guality/prevent-optimization.h | 2 ++ gcc/testsuite/gcc.dg/guality/vla-1.c| 4 +++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/guality/prevent-optimization.h b/gcc/testsuite/gcc.dg/guality/prevent-optimization.h index 0ef84a3d9c8..57e945cafb4 100644 --- a/gcc/testsuite/gcc.dg/guality/prevent-optimization.h +++ b/gcc/testsuite/gcc.dg/guality/prevent-optimization.h @@ -21,8 +21,10 @@ along with GCC; see the file COPYING3. If not see #ifdef PREVENT_OPTIMIZATION #define ATTRIBUTE_USED __attribute__((used)) +#define VOLATILE volatile #else #define ATTRIBUTE_USED +#define VOLATILE #endif #endif diff --git a/gcc/testsuite/gcc.dg/guality/vla-1.c b/gcc/testsuite/gcc.dg/guality/vla-1.c index 264b9f3f92b..d281185c18c 100644 --- a/gcc/testsuite/gcc.dg/guality/vla-1.c +++ b/gcc/testsuite/gcc.dg/guality/vla-1.c @@ -2,6 +2,8 @@ /* { dg-do run } */ /* { dg-options "-g" } */ +#include "prevent-optimization.h" + void __attribute__((noinline)) bar (short *p) { @@ -12,7 +14,7 @@ bar (short *p) int __attribute__((noinline)) f1 (int i) { - char a[i + 1]; + VOLATILE char a[i + 1]; a[0] = 5;/* { dg-final { gdb-test .+1 "i" "5" } } */ return a[0]; /* { dg-final { gdb-test . "sizeof (a)" "6" } } */ }
Re: [testsuite/guality, committed] Prevent optimization of local in vla-1.c
On Sun, Jul 01, 2018 at 06:19:20PM +0200, Tom de Vries wrote:
> So, the local vla a is optimized away.
>
> This patch adds VOLATILE to 'a', which prevents it from being optimized away,
> and fixes the non-lto failures.
>
> Committed as obvious.

That isn't obvious, it is just wrong.
The intent of the testcase is to test how debugging of optimized code with
VLAs works.  With your change we don't test that anymore.  Please revert.

> [testsuite/guality] Prevent optimization of local in vla-1.c
>
> 2018-07-01  Tom de Vries
>
> 	* gcc.dg/guality/prevent-optimization.h (VOLATILE): Define.
> 	* gcc.dg/guality/vla-1.c (f1): Mark local vla a as VOLATILE.

	Jakub
[PATCH, committed] Add -mgnu-asm to pdp11 target, change -mdec-asm
The pdp11 target has long had -mdec-asm which was documented to generate DEC compatible assembly language output but actually produces GNU assembler output. This patch adds -mgnu-asm to do what -mdec-asm used to do, and -mdec-asm now does produces output acceptable to DEC Macro-11. Committed. paul ChangeLog: 2018-07-01 Paul Koning * common/config/pdp11/pdp11-common.c (pdp11_handle_option): Handle -munit-asm, -mgnu-asm, -mdec-asm. * config/pdp11/pdp11-protos.h (pdp11_gen_int_label): New. (pdp11_output_labelref): New. (pdp11_output_def): New. (pdp11_output_addr_vec_elt): New. * config/pdp11/pdp11.c: Use tab between opcode and operands. Use %# and %@ format codes. (pdp11_option_override): New. (TARGET_ASM_FILE_START_FILE_DIRECTIVE): Define. (pdp11_output_ident): New. (pdp11_asm_named_section): New. (pdp11_asm_init_sections): New. (pdp11_file_start): New. (pdp11_file_end): New. (output_ascii): Use .ascii/.asciz for -mdec-asm. (pdp11_asm_print_operand): Update %# and %$ for -mdec-asm. Add %o, like %c but octal. (pdp11_option_override): New. * config/pdp11/pdp11.h (TEXT_SECTION_ASM_OP): Update for -mdec-asm. (DATA_SECTION_ASM_OP): Ditto. (READONLY_DATA_SECTION_ASM_OP): New. (IS_ASM_LOGICAL_LINE_SEPARATOR): New. (ASM_GENERATE_INTERNAL_LABEL): Use new function. (ASM_OUTPUT_LABELREF): Ditto. (ASM_OUTPUT_DEF): Ditto. (ASM_OUTPUT_EXTERNAL): New. (ASM_OUTPUT_SOURCE_FILENAME): New. (ASM_OUTPUT_ADDR_VEC_ELT): Use new function. (ASM_OUTPUT_SKIP): Update for -mdec-asm. * config/pdp11/pdp11.md: Use tab between opcode and operands. Use %# and %@ format codes. * config/pdp11/pdp11.opt (mgnu-asm): New. (mdec-asm): Conflicts with -mgnu-asm and -munix-asm. (munix-asm): Conflicts with -mdec-asm and -mgnu-asm. * doc/invoke.txt (PDP-11 Options): Add -mgnu-asm. Index: common/config/pdp11/pdp11-common.c === --- common/config/pdp11/pdp11-common.c (revision 262287) +++ common/config/pdp11/pdp11-common.c (working copy) @@ -59,7 +59,16 @@ pdp11_handle_option (struct gcc_options *opts, opts->x_target_flags &= ~MASK_40; opts->x_target_flags |= MASK_45; return true; - + +case OPT_munix_asm: +case OPT_mgnu_asm: + targetm_common.have_named_sections = false; + return true; + +case OPT_mdec_asm: + targetm_common.have_named_sections = true; + return true; + default: return true; } Index: config/pdp11/pdp11-protos.h === --- config/pdp11/pdp11-protos.h (revision 262287) +++ config/pdp11/pdp11-protos.h (working copy) @@ -51,3 +51,7 @@ extern void pdp11_asm_output_var (FILE *, const ch extern void pdp11_expand_prologue (void); extern void pdp11_expand_epilogue (void); extern poly_int64 pdp11_push_rounding (poly_int64); +extern void pdp11_gen_int_label (char *, const char *, int); +extern void pdp11_output_labelref (FILE *, const char *); +extern void pdp11_output_def (FILE *, const char *, const char *); +extern void pdp11_output_addr_vec_elt (FILE *, int); Index: config/pdp11/pdp11.c === --- config/pdp11/pdp11.c(revision 262287) +++ config/pdp11/pdp11.c(working copy) @@ -212,7 +212,7 @@ static bool pdp11_scalar_mode_supported_p (scalar_ #undef TARGET_PREFERRED_OUTPUT_RELOAD_CLASS #define TARGET_PREFERRED_OUTPUT_RELOAD_CLASS pdp11_preferred_output_reload_class -#undef TARGET_LRA_P +#undef TARGET_LRA_P #define TARGET_LRA_P hook_bool_void_false #undef TARGET_LEGITIMATE_ADDRESS_P @@ -221,9 +221,30 @@ static bool pdp11_scalar_mode_supported_p (scalar_ #undef TARGET_CONDITIONAL_REGISTER_USAGE #define TARGET_CONDITIONAL_REGISTER_USAGE pdp11_conditional_register_usage +#undef TARGET_OPTION_OVERRIDE +#define TARGET_OPTION_OVERRIDE 
pdp11_option_override + +#undef TARGET_ASM_FILE_START_FILE_DIRECTIVE +#define TARGET_ASM_FILE_START_FILE_DIRECTIVE true + +#undef TARGET_ASM_OUTPUT_IDENT +#define TARGET_ASM_OUTPUT_IDENT pdp11_output_ident + #undef TARGET_ASM_FUNCTION_SECTION #define TARGET_ASM_FUNCTION_SECTION pdp11_function_section +#undef TARGET_ASM_NAMED_SECTION +#defineTARGET_ASM_NAMED_SECTION pdp11_asm_named_section + +#undef TARGET_ASM_INIT_SECTIONS +#define TARGET_ASM_INIT_SECTIONS pdp11_asm_init_sections + +#undef TARGET_ASM_FILE_START +#define TARGET_ASM_FILE_START pdp11_file_start + +#undef TARGET_ASM_FILE_END +#define TARGET_ASM_FILE_END pdp11_file_end + #undef TARGET_PRINT_OPERAND #define TARGET_PRINT_OPERAND pdp11_asm_print_operand @@ -238,6 +259,7 @@ static bool pdp11_scalar_mode_supported_p (scalar_ #undef
Re: Limit Debug mode impact: overload __niter_base
Here is a new proposal between yours and mine. It is still adding a function to wrap what __niter_base unwrap, I called it __nwrap_iter for this reason. But it takes advantage of knowing that __niter_base will only unwrap random access iterator to use an expression to that will do the right thing, no matter the original iterator type. I eventually found no issue in the testsuite, despite the std::equal change. I might have had a local invalid test. Ok to commit ? François On 27/06/2018 22:02, Jonathan Wakely wrote: On 27/06/18 21:25 +0200, François Dumont wrote: On 27/06/2018 02:13, Jonathan Wakely wrote: On 26/06/18 17:03 +0100, Jonathan Wakely wrote: On 18/06/18 23:01 +0200, François Dumont wrote: Hi I abandon the idea of providing Debug algos, it would be too much code to add and maintain. However I haven't quit on reducing Debug mode performance impact. So this patch make use of the existing std::__niter_base to get rid of the debug layer around __gnu_debug::vector<>::iterator so that __builtin_memmove replacement can take place. As _Safe_iterator<> do not have a constructor taking a pointer I change algos implementation so that we do not try to instantiate the iterator type ourselves but rather rely on its operators + or -. The small drawback is that for std::equal algo where we can't use the __glibcxx_can_increment we need to keep the debug layer to make sure we don't reach past-the-end iterator. So I had to remove usage of __niter_base when in Debug mode, doing so it also disable potential usage of __builtin_memcmp when calling std::equal on std::__cxx1998::vector<> iterators. A rather rare situation I think. We don't need to give up all checks in std::equal, we can do this: @@ -1044,7 +1085,13 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO typename iterator_traits<_II1>::value_type, typename iterator_traits<_II2>::value_type>) __glibcxx_requires_valid_range(__first1, __last1); - +#ifdef _GLIBCXX_DEBUG + typedef typename iterator_traits<_II1>::iterator_category _Cat1; + typedef typename iterator_traits<_II2>::iterator_category _Cat2; + if (!__are_same<_Cat1, input_iterator_tag>::__value + && __are_same<_Cat2, random_access_iterator_tag>::__value) + __glibcxx_requires_can_increment_range(__first1, __last1, __first2); +#endif return std::__equal_aux(std::__niter_base(__first1), std::__niter_base(__last1), std::__niter_base(__first2)); We don't give up any check. We do for my version of the patch, because with the new __niter_base overload the debug layer gets removed. Your patch kept the checking by not removing the debug layer, but loses the memcmp optimization. My suggestion above removes the debug layer so uses the optimization, but adds a separate range check, so we still get checking as well. We only give up on using the __builtin_memcmp when std::equal is being used while Debug mode is active and std::equal is called directly with std::__cxx1998::vector. Moreover this check is wrong and will introduce regression when running testsuite in Debug mode (this is why I know). It will assert in the following case: vector v1 { 0, 0, 0 }; vector v2 { 0, 1 }; equal(v1.begin(), v1.end(), v2.begin()); We should assert for that test, it's undefined behaviour. Note that I don't know how to test that __builtin_memmove has been indeed used. So I've been through some debug sessions to check that. The attached patch (not fully tested) seems to be a much simpler way to achieve the same thing. 
Instead of modifying all the helper structs, just define a new function to re-wrap the result into the desired iterator type. diff --git a/libstdc++-v3/include/debug/stl_iterator.h b/libstdc++-v3/include/debug/stl_iterator.h index a6a2a76..eca7203 100644 --- a/libstdc++-v3/include/debug/stl_iterator.h +++ b/libstdc++-v3/include/debug/stl_iterator.h @@ -120,4 +120,19 @@ namespace __gnu_debug #endif } +#if __cplusplus >= 201103L +namespace std +{ +_GLIBCXX_BEGIN_NAMESPACE_VERSION + +template_Sequence> + _Iterator + __niter_base(const __gnu_debug::_Safe_iterator< + __gnu_cxx::__normal_iterator<_Iterator, _Container>, + _Sequence>&); + +_GLIBCXX_END_NAMESPACE_VERSION +} +#endif Why is this overload only defined for C++11 and later? I defined it unconditionally in the attached patch. What do you think? Here's a complete patch that passes all tests in normal mode and causes no regressions in debug mode (we already have some debug test failing). I wondered whether we need another overload of __wrap_iter for handling move_iterator, but I think the first overload works OK. I don't think we need it neither. Algos that handle move iterator are already doing it. OK great. diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h index d429943..003a
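As a concrete, illustrative motivating case (my example, not from the
thread): the point of the new __niter_base overload is that a call like
the one below keeps hitting the __builtin_memmove fast path even when
_GLIBCXX_DEBUG wraps the vector iterators in _Safe_iterator.

// Illustrative only: with -D_GLIBCXX_DEBUG, src.begin() is a
// _Safe_iterator wrapping a __normal_iterator wrapping int*; the new
// overload peels off the debug layer so std::copy can still lower the
// copy to a memmove on the underlying pointers.
#include <algorithm>
#include <vector>

void
copy_ints (const std::vector<int> &src, std::vector<int> &dst)
{
  dst.resize (src.size ());
  std::copy (src.begin (), src.end (), dst.begin ());
}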
Re: [testsuite/guality, committed] Prevent optimization of local in vla-1.c
On 07/01/2018 09:11 PM, Jakub Jelinek wrote:
> On Sun, Jul 01, 2018 at 06:19:20PM +0200, Tom de Vries wrote:
>> So, the local vla a is optimized away.
>>
>> This patch adds VOLATILE to 'a', which prevents it from being optimized away,
>> and fixes the non-lto failures.
>>
>> Committed as obvious.
>
> That isn't obvious, it is just wrong.
> The intent of the testcase is to test how debugging of optimized code with
> VLAs works.  With your change we don't test that anymore.  Please revert.
>

Hi,

Sure, but ... if this is wrong, then for my understanding can you explain
to me how the fail should be addressed?

[ FWIW, I considered this obvious, given the ok for "[testsuite] Fix
guality/pr45882.c for flto"
( https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01304.html ) which seemed
similar to me. ]

Thanks,
- Tom

>> [testsuite/guality] Prevent optimization of local in vla-1.c
>>
>> 2018-07-01  Tom de Vries
>>
>> 	* gcc.dg/guality/prevent-optimization.h (VOLATILE): Define.
>> 	* gcc.dg/guality/vla-1.c (f1): Mark local vla a as VOLATILE.
>
> Jakub
>
[testsuite/guality, committed] Use @main as bp loc instead of line nrs in const-volatile.c
Hi, this patch replaces the absolute line numbers used in gdb-test in guality testcase const-volatile.c. First there's line number 50, which used to point at the start of main: ... 47 int 48 main (int argc, char **argv) 49 { 50score as = argc; ... but has drifted in time to before f: ... 48 score s; 49 const score cs; 50 51 static __attribute__((noclone, noinline)) int 52 f (const char *progname, volatile struct foo *dummy, const score s) 53 { ... Then there's line number 58, which already failed to set a breakpoint at the initial commit, so I'm classifying that as a typo. I could have used a dg-line at the start of main to introduce variable bp, and then use that variable everywhere, but I've added a mechanism that allows us to set breakpoints at function addresses, which means we don't need the dg-line, and can just use '@main'. Committed as obvious. Thanks, - Tom [testsuite/guality] Use @main as bp loc instead of line nrs in const-volatile.c 2018-07-01 Tom de Vries * lib/gcc-gdb-test.exp (gdb-test): Handle '@' prefix in line number argument. * gcc.dg/guality/const-volatile.c: Replace gdb-test line nrs 50 and 58 with @main. --- gcc/testsuite/gcc.dg/guality/const-volatile.c | 44 +-- gcc/testsuite/lib/gcc-gdb-test.exp| 7 - 2 files changed, 28 insertions(+), 23 deletions(-) diff --git a/gcc/testsuite/gcc.dg/guality/const-volatile.c b/gcc/testsuite/gcc.dg/guality/const-volatile.c index d657f48079c..3bfca0d14d3 100644 --- a/gcc/testsuite/gcc.dg/guality/const-volatile.c +++ b/gcc/testsuite/gcc.dg/guality/const-volatile.c @@ -62,33 +62,33 @@ main (int argc, char **argv) return f (argv[0], &dummy, as) - 1; } -/* { dg-final { gdb-test 50 "type:main" "int (int, char **)" } } */ +/* { dg-final { gdb-test "@main" "type:main" "int (int, char **)" } } */ -/* { dg-final { gdb-test 50 "type:i" "int" } } */ -/* { dg-final { gdb-test 50 "type:ci" "const int" } } */ -/* { dg-final { gdb-test 50 "type:vi" "volatile int" } } */ -/* { dg-final { gdb-test 50 "type:cvi" "const volatile int" } } */ +/* { dg-final { gdb-test "@main" "type:i" "int" } } */ +/* { dg-final { gdb-test "@main" "type:ci" "const int" } } */ +/* { dg-final { gdb-test "@main" "type:vi" "volatile int" } } */ +/* { dg-final { gdb-test "@main" "type:cvi" "const volatile int" } } */ -/* { dg-final { gdb-test 50 "type:pi" "int *" } } */ -/* { dg-final { gdb-test 50 "type:pci" "const int *" } } */ -/* { dg-final { gdb-test 50 "type:pvi" "volatile int *" } } */ -/* { dg-final { gdb-test 50 "type:pcvi" "const volatile int *" } } */ +/* { dg-final { gdb-test "@main" "type:pi" "int *" } } */ +/* { dg-final { gdb-test "@main" "type:pci" "const int *" } } */ +/* { dg-final { gdb-test "@main" "type:pvi" "volatile int *" } } */ +/* { dg-final { gdb-test "@main" "type:pcvi" "const volatile int *" } } */ -/* { dg-final { gdb-test 50 "type:cip" "int * const" } } */ -/* { dg-final { gdb-test 50 "type:vip" "int * volatile" } } */ -/* { dg-final { gdb-test 50 "type:cvip" "int * const volatile" } } */ +/* { dg-final { gdb-test "@main" "type:cip" "int * const" } } */ +/* { dg-final { gdb-test "@main" "type:vip" "int * volatile" } } */ +/* { dg-final { gdb-test "@main" "type:cvip" "int * const volatile" } } */ -/* { dg-final { gdb-test 50 "type:vs" "volatile struct { const long cli; const signed char csc; }" } } */ +/* { dg-final { gdb-test "@main" "type:vs" "volatile struct { const long cli; const signed char csc; }" } } */ -/* { dg-final { gdb-test 50 "type:cvip" "int * const volatile" } } */ +/* { dg-final { gdb-test "@main" "type:cvip" "int * const volatile" 
} } */ -/* { dg-final { gdb-test 50 "type:bar" "struct bar { short s; const short cs; volatile short vs; const volatile short cvs; volatile long long vll; }" } } */ -/* { dg-final { gdb-test 50 "type:foo" "struct foo { const long cli; const signed char csc; }" } } */ -/* { dg-final { gdb-test 50 "type:cfoo" "const struct foo { const long cli; const signed char csc; }" } } */ -/* { dg-final { gdb-test 50 "type:vfoo" "volatile struct foo { const long cli; const signed char csc; }" } } */ -/* { dg-final { gdb-test 50 "type:cvfoo" "const volatile struct foo { const long cli; const signed char csc; }" } } */ +/* { dg-final { gdb-test "@main" "type:bar" "struct bar { short s; const short cs; volatile short vs; const volatile short cvs; volatile long long vll; }" } } */ +/* { dg-final { gdb-test "@main" "type:foo" "struct foo { const long cli; const signed char csc; }" } } */ +/* { dg-final { gdb-test "@main" "type:cfoo" "const struct foo { const long cli; const signed char csc; }" } } */ +/* { dg-final { gdb-test "@main" "type:vfoo" "volatile struct foo { const long cli; const signed char csc; }" } } */ +/* { dg-final { gdb-test "@main" "type:cvfoo" "const volatile struct foo { const long cli; const signed char csc; }" } } */ -/* { dg-final { gdb-test 58
Re: [PATCH 3/3][POPCOUNT] Remove unnecessary if condition in phiopt
Hi Richard, On 29 June 2018 at 18:45, Richard Biener wrote: > On Wed, Jun 27, 2018 at 7:09 AM Kugan Vivekanandarajah > wrote: >> >> Hi Richard, >> >> Thanks for the review, >> >> On 25 June 2018 at 20:20, Richard Biener wrote: >> > On Fri, Jun 22, 2018 at 11:16 AM Kugan Vivekanandarajah >> > wrote: >> >> >> >> gcc/ChangeLog: >> > >> > @@ -1516,6 +1521,114 @@ minmax_replacement (basic_block cond_bb, >> > basic_block middle_bb, >> > >> >return true; >> > } >> > +/* Convert >> > + >> > + >> > + if (b_4(D) != 0) >> > + goto >> > >> > vertical space before the comment. >> Done. >> >> > >> > + edge e0 ATTRIBUTE_UNUSED, edge e1 >> > ATTRIBUTE_UNUSED, >> > >> > why pass them if they are unused? >> Removed. >> >> > >> > + if (stmt_count != 2) >> > +return false; >> > >> > so what about the case when there is no conversion? >> Done. >> >> > >> > + /* Check that we have a popcount builtin. */ >> > + if (!is_gimple_call (popcount) >> > + || !gimple_call_builtin_p (popcount, BUILT_IN_NORMAL)) >> > +return false; >> > + tree fndecl = gimple_call_fndecl (popcount); >> > + if ((DECL_FUNCTION_CODE (fndecl) != BUILT_IN_POPCOUNT) >> > + && (DECL_FUNCTION_CODE (fndecl) != BUILT_IN_POPCOUNTL) >> > + && (DECL_FUNCTION_CODE (fndecl) != BUILT_IN_POPCOUNTLL)) >> > +return false; >> > >> > look at popcount handling in tree-vrp.c how to properly also handle >> > IFN_POPCOUNT. >> > (CASE_CFN_POPCOUNT) >> Done. >> > >> > + /* Cond_bb has a check for b_4 != 0 before calling the popcount >> > + builtin. */ >> > + if (gimple_code (cond) != GIMPLE_COND >> > + || gimple_cond_code (cond) != NE_EXPR >> > + || TREE_CODE (gimple_cond_lhs (cond)) != SSA_NAME >> > + || rhs != gimple_cond_lhs (cond)) >> > +return false; >> > >> > The check for SSA_NAME is redundant. >> > You fail to check that gimple_cond_rhs is zero. >> Done. >> >> > >> > + /* Remove the popcount builtin and cast stmt. */ >> > + gsi = gsi_for_stmt (popcount); >> > + gsi_remove (&gsi, true); >> > + gsi = gsi_for_stmt (cast); >> > + gsi_remove (&gsi, true); >> > + >> > + /* And insert the popcount builtin and cast stmt before the cond_bb. */ >> > + gsi = gsi_last_bb (cond_bb); >> > + gsi_insert_before (&gsi, popcount, GSI_NEW_STMT); >> > + gsi_insert_before (&gsi, cast, GSI_NEW_STMT); >> > >> > use gsi_move_before (). You need to reset flow sensitive info on the >> > LHS of the popcount call as well as on the LHS of the cast. >> Done. >> >> > >> > You fail to check the PHI operand on the false edge. Consider >> > >> > if (b != 0) >> >res = __builtin_popcount (b); >> > else >> >res = 1; >> > >> > You fail to check the PHI operand on the true edge. Consider >> > >> > res = 0; >> > if (b != 0) >> >{ >> > __builtin_popcount (b); >> > res = 2; >> >} >> > >> > and using -fno-tree-dce and whatever you need to keep the >> > popcount call in the IL. A gimple testcase for phiopt will do. >> > >> > Your testcase relies on popcount detection. Please write it >> > using __builtin_popcount instead. Write one with a cast and >> > one without. >> Added the testcases. >> >> Is this OK now. > > + for (gsi = gsi_start_bb (middle_bb); !gsi_end_p (gsi); gsi_next (&gsi)) > +{ > > use gsi_after_labels (middle_bb) > > + popcount = last_stmt (middle_bb); > + if (popcount == NULL) > +return false; > > after the counting this test is always false, remove it. > > + /* We have a cast stmt feeding popcount builtin. */ > + cast = first_stmt (middle_bb); > > looking at the implementation of first_stmt this will > give you a label in case the BB has one. 
I think it's better > to merge this and the above with the "counting" like > > gsi = gsi_start_nondebug_after_labels_bb (middle_bb); > if (gsi_end_p (gsi)) > return false; > cast = gsi_stmt (gsi); > gsi_next_nondebug (&gsi); > if (!gsi_end_p (gsi)) > { > popcount = gsi_stmt (gsi); > gsi_next_nondebug (&gsi); > if (!gsi_end_p (gsi)) >return false; > } > else > { > popcount = cast; > cast = NULL; > } Done. > > + /* Check that we have a cast prior to that. */ > + if (gimple_code (cast) != GIMPLE_ASSIGN > + || gimple_assign_rhs_code (cast) != NOP_EXPR) > + return false; > > CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (cast)) > Done. > + /* Check PHI arguments. */ > + if (lhs != arg0 || !integer_zerop (arg1)) > +return false; > > that is not sufficient, you do not know whether arg0 is the true > value or the false value. The edge flags will tell you. Done. Please find the modified patch. Is this OK. Bootstrapped and regression tested with no new regressions. Thanks, Kugan > > Otherwise looks OK. > > Richard. > >> Thanks, >> Kugan >> > >> > Thanks, >> > Richard. >> > >> > >> >> 2018-06-22 Kugan Vivekanandarajah >> >> >> >> * tree-ssa-phiopt.c (cond_removal_in_popcount_pattern): New. >> >> (tree_ssa_phiopt_
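For reference, the source shape the new phiopt transform targets is
roughly the following (my example, not from the patch): since
__builtin_popcountl (0) is 0, the guard is redundant, so the call and any
widening cast of its result can be moved above the condition and the
branch removed.

/* Illustrative only: because popcount of zero is zero, the test below
   is redundant; cond_removal_in_popcount_pattern hoists the call (and
   a cast of its result, if present) above the condition so the whole
   `if' collapses.  */
int
count_bits (unsigned long b)
{
  int res = 0;
  if (b != 0)
    res = __builtin_popcountl (b);
  return res;
}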