Re: [PATCH] Tree-level fix for PR 69526
ping.
Re: [PATCH, v3] Fix PR51513, switch statement with default case containing __builtin_unreachable leads to wild branch
On Mon, 8 May 2017, Peter Bergner wrote:

> On 05/08/2017 01:20 PM, Peter Bergner wrote:
> > That is what the previous patch did, but as I mention above,
> > we generate slightly better code for some test cases (other
> > tests seemed to generate the same code) if we don't attempt
> > to handle the decision tree case.  I'll note that the current
> > unpatched compiler already knows how to remove unreachable
> > case statement blocks when we expand to a decision tree.
>
> I should be more careful with my description here.  The patch does
> affect both unreachable case statements for both decision trees as
> well as jump tables, and that leads to improved code for both
> decision trees as well as jump tables.
>
> What I meant to say above, is that the current handling of unreachable
> default case statements in the unpatched compiler seems to lead to
> slightly better code for some test cases than attempting to handle
> default_label == NULL in the decision tree code.  It was for that
> reason, I placed the code in expand_case() only in the jump table
> path.  The fact it made the patch smaller was a bonus, since I didn't
> need to protect emit_case_nodes() from a NULL default_label.

Ah, ok.

> As I said, if you think it will help some test case I haven't tried yet,
> I can add that support back.

Well, let's do that incrementally then, if at all.

Thanks,
Richard.
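Archive note: for readers without the PR at hand, the following is a minimal sketch of the kind of source the subject line describes, a switch whose default case contains only __builtin_unreachable. The function name classify and the case values are made up for illustration and are not taken from the PR's testcase. As I read the thread, the patch lets the jump-table path of expand_case() cope with a missing default label instead of emitting a branch to it.

```c
/* Illustrative only: `classify` and its case values are hypothetical,
   not the testcase attached to PR51513.  */

/* Map a small selector to a weight.  The default case promises the
   compiler that no other selector value ever reaches the switch.  */
int
classify (int sel)
{
  switch (sel)
    {
    case 0: return 10;
    case 1: return 20;
    case 2: return 30;
    case 3: return 40;
    case 4: return 50;
    default:
      /* Reaching this point is undefined behaviour, so the compiler
         does not need a reachable default label here.  */
      __builtin_unreachable ();
    }
}
```

Calling classify with a value outside 0..4 is undefined, so the sketch only makes sense for in-range selectors.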
[PATCH] Simplify VRP vrp_int_const_binop
The following avoids using trees with TREE_OVERFLOW in vrp_int_const_binop which is GC memory heavy and simplifies vrp_int_const_binop by using wide_ints (instead of implementing unsigned overflow detection manually). Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2017-05-09 Richard Biener * tree-vrp.c (vrp_int_const_binop): Use wide-ints and simplify. (extract_range_from_multiplicative_op_1): Adjust. (extract_range_from_binary_expr_1): Use int_const_binop. Index: gcc/tree-vrp.c === --- gcc/tree-vrp.c (revision 247733) +++ gcc/tree-vrp.c (working copy) @@ -1617,66 +1613,91 @@ extract_range_from_ssa_name (value_range /* Wrapper around int_const_binop. If the operation overflows and we are not using wrapping arithmetic, then adjust the result to be -INF or +INF depending on CODE, VAL1 and VAL2. This can return - NULL_TREE if we need to use an overflow infinity representation but - the type does not support it. */ + NULL_TREE for division by zero. */ -static tree -vrp_int_const_binop (enum tree_code code, tree val1, tree val2) +static wide_int +vrp_int_const_binop (enum tree_code code, tree val1, tree val2, +bool *overflow_p) { - tree res; - - res = int_const_binop (code, val1, val2); + bool overflow = false; + signop sign = TYPE_SIGN (TREE_TYPE (val1)); + wide_int res; - /* If we are using unsigned arithmetic, operate symbolically - on -INF and +INF as int_const_binop only handles signed overflow. */ - if (TYPE_UNSIGNED (TREE_TYPE (val1))) + switch (code) { - int checkz = compare_values (res, val1); - bool overflow = false; +case RSHIFT_EXPR: +case LSHIFT_EXPR: + { + wide_int wval2 = wi::to_wide (val2, TYPE_PRECISION (TREE_TYPE (val1))); + if (wi::neg_p (wval2)) + { + wval2 = -wval2; + if (code == RSHIFT_EXPR) + code = LSHIFT_EXPR; + else + code = RSHIFT_EXPR; + } - /* Ensure that res = val1 [+*] val2 >= val1 - or that res = val1 - val2 <= val1. 
*/ - if ((code == PLUS_EXPR - && !(checkz == 1 || checkz == 0)) - || (code == MINUS_EXPR - && !(checkz == 0 || checkz == -1))) + if (code == RSHIFT_EXPR) + /* It's unclear from the C standard whether shifts can overflow. +The following code ignores overflow; perhaps a C standard +interpretation ruling is needed. */ + res = wi::rshift (val1, wval2, sign); + else + res = wi::lshift (val1, wval2); + break; + } + +case MULT_EXPR: + res = wi::mul (val1, val2, sign, &overflow); + break; + +case TRUNC_DIV_EXPR: +case EXACT_DIV_EXPR: + if (val2 == 0) { - overflow = true; + *overflow_p = true; + return res; } - /* Checking for multiplication overflow is done by dividing the -output of the multiplication by the first input of the -multiplication. If the result of that division operation is -not equal to the second input of the multiplication, then the -multiplication overflowed. */ - else if (code == MULT_EXPR && !integer_zerop (val1)) + else + res = wi::div_trunc (val1, val2, sign, &overflow); + break; + +case FLOOR_DIV_EXPR: + if (val2 == 0) { - tree tmp = int_const_binop (TRUNC_DIV_EXPR, - res, - val1); - int check = compare_values (tmp, val2); + *overflow_p = true; + return res; + } + res = wi::div_floor (val1, val2, sign, &overflow); + break; - if (check != 0) - overflow = true; +case CEIL_DIV_EXPR: + if (val2 == 0) + { + *overflow_p = true; + return res; } + res = wi::div_ceil (val1, val2, sign, &overflow); + break; - if (overflow) +case ROUND_DIV_EXPR: + if (val2 == 0) { - res = copy_node (res); - TREE_OVERFLOW (res) = 1; + *overflow_p = 0; + return res; } + res = wi::div_round (val1, val2, sign, &overflow); + break; +default: + gcc_unreachable (); } - else if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (val1))) -/* If the singed operation wraps then int_const_binop has done - everything we want. */ -; - /* Signed division of -1/0 overflows and by the time it gets here - returns NULL_TREE. */ - else if (!res) -return NULL_TREE; - else if (TREE_OVERFLOW (res) - && ! 
TREE_OVERFLOW (val1) - && ! TREE_OVERFLOW (val2)) + + *overflow_p = overflow; + + if (overflow + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (val1))) { /* If the operation overflowed but neither VAL1 nor VAL2 are over
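Archive note: the removed lines above detected unsigned overflow by hand (compare the sum with the first operand, divide the product by the first operand), and the patch replaces that with the overflow flag reported by the wide-int routines. The sketch below shows the same contrast in plain C. It uses the GCC/Clang builtins __builtin_add_overflow and __builtin_mul_overflow as a stand-in for the wide-int API; that substitution, and all function names, are my own and not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* The hand-written check for unsigned addition: the sum wrapped
   if and only if it is smaller than the first operand.  */
bool
add_overflows_manual (uint32_t a, uint32_t b)
{
  uint32_t res = a + b;   /* Wraps modulo 2^32, which is well defined.  */
  return res < a;
}

/* The hand-written check for unsigned multiplication: divide the
   product by the first operand and compare with the second.  */
bool
mul_overflows_manual (uint32_t a, uint32_t b)
{
  if (a == 0)
    return false;         /* 0 * b never overflows.  */
  uint32_t res = a * b;
  return res / a != b;
}

/* The same questions answered by an overflow-reporting primitive.  */
bool
add_overflows_builtin (uint32_t a, uint32_t b)
{
  uint32_t res;
  return __builtin_add_overflow (a, b, &res);
}

bool
mul_overflows_builtin (uint32_t a, uint32_t b)
{
  uint32_t res;
  return __builtin_mul_overflow (a, b, &res);
}
```

The manual and builtin variants should agree for every input; the second form simply moves the bookkeeping into the primitive.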
[PATCH] Fix bootstrap on arm target
Hi,

for a few days now the bootstrap of Ada has been failing on a native
arm target.

It is due to a -Werror warning when passing GNAT_EXCEPTION_CLASS,
which is a string constant, to exception_class_eq: C++ forbids
converting a string constant to "char*".

Not sure what is the smartest solution, I tried the following and it
seems to work for x86_64-pc-linux-gnu and arm-linux-gnueabihf.

Is it OK for trunk?

Thanks
Bernd.

2017-05-09  Bernd Edlinger

	* raise-gcc.c (exception_class_eq): Make ec parameter const.

--- gcc/ada/raise-gcc.c.jj	2017-04-27 12:00:42.0 +0200
+++ gcc/ada/raise-gcc.c	2017-05-09 09:45:59.557507045 +0200
@@ -909,7 +909,8 @@
 /* Return true iff the exception class of EXCEPT is EC.  */
 
 static int
-exception_class_eq (const _GNAT_Exception *except, _Unwind_Exception_Class ec)
+exception_class_eq (const _GNAT_Exception *except,
+		    const _Unwind_Exception_Class ec)
 {
 #ifdef __ARM_EABI_UNWINDER__
   return memcmp (except->common.exception_class, ec, 8) == 0;
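Archive note: a small sketch of why the const qualifier helps, under the assumption that the exception class is an 8-byte character array on the ARM EABI unwinder (which is what the memcmp in the hunk suggests). The type names exception_class_t and struct exception are hypothetical stand-ins; the real definitions live in the unwinder headers.

```c
#include <string.h>

/* Hypothetical stand-ins for the unwinder types.  */
typedef char exception_class_t[8];

struct exception
{
  exception_class_t exception_class;
};

/* As a parameter, `const exception_class_t ec` is adjusted to
   `const char *ec`, so a string literal can be passed without a
   cast that drops constness.  */
int
exception_class_eq (const struct exception *except,
                    const exception_class_t ec)
{
  return memcmp (except->exception_class, ec, 8) == 0;
}
```

With a non-const parameter the same call would need a conversion from a string literal to char *, which is the conversion the warning in the message above is about.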
Re: [Testsuite, committed] Fix vector peeling test failures
On Mon, May 8, 2017 at 3:49 PM, Richard Biener wrote: > On Mon, May 8, 2017 at 2:41 PM, Wilco Dijkstra wrote: >> This fixes a few failures on ARM and AArch64 due to a recent change in >> alignment peeling by switching the vector cost model off >> (https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00407.html). >> >> Tested on AArch64, ARM and x64 - committed as obvious. > > Thanks. Note that I'm not sure what -fno-vect-cost-model actually means when > peeling for alignment, so this fix might not "prevail". This broke the testcases -- you have to use dg-additional-options, otherwise I now get gcc.dg/vect/vect-44.c -flto -ffat-lto-objects : dump file does not exist UNRESOLVED: gcc.dg/vect/vect-44.c -flto -ffat-lto-objects scan-tree-dump-times vect "Alignment of access forced using peeling" 1 ... fixing that now. Richard. > Richard. > >> ChangeLog: >> 2017-05-08 Wilco Dijkstra >> >> * testsuite/gcc.dg/vect/vect-44.c: Add -fno-vect-cost-model. >> * gcc/testsuite/gcc.dg/vect/vect-50.c: Likewise. >> -- >> >> diff --git a/gcc/testsuite/gcc.dg/vect/vect-44.c >> b/gcc/testsuite/gcc.dg/vect/vect-44.c >> index >> 186f9cfc9e26d6eb53514dec0fac176d696ec578..fbc593572429422c8e527c5e5559c515efd38aa6 >> 100644 >> --- a/gcc/testsuite/gcc.dg/vect/vect-44.c >> +++ b/gcc/testsuite/gcc.dg/vect/vect-44.c >> @@ -1,4 +1,5 @@ >> /* { dg-require-effective-target vect_float } */ >> +/* { dg-options "-fno-vect-cost-model" } */ >> >> #include >> #include "tree-vect.h" >> diff --git a/gcc/testsuite/gcc.dg/vect/vect-50.c >> b/gcc/testsuite/gcc.dg/vect/vect-50.c >> index >> 78bfd8d3920445fe51c7393a82870ea85f62bb55..0d5febc165ee3bf3b7b595237168d9d4b9604d4b >> 100644 >> --- a/gcc/testsuite/gcc.dg/vect/vect-50.c >> +++ b/gcc/testsuite/gcc.dg/vect/vect-50.c >> @@ -1,4 +1,5 @@ >> /* { dg-require-effective-target vect_float } */ >> +/* { dg-options "-fno-vect-cost-model" } */ >> >> #include >> #include "tree-vect.h"
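Archive note: the difference Richard points out can be sketched as a testcase header. As far as I understand the harness, dg-options replaces the option set that the vect.exp driver supplies (including the flag that requests the vect dump, hence "dump file does not exist"), whereas dg-additional-options appends to it. The loop below is a placeholder of my own, not the body of vect-44.c.

```c
/* Sketch of a vect.exp-style testcase header; the function below is
   a placeholder, not the contents of vect-44.c.  */
/* { dg-require-effective-target vect_float } */
/* { dg-additional-options "-fno-vect-cost-model" } */

#define N 16

/* Simple element-wise addition that a vectorizer could handle.  */
void
add_arrays (float *restrict out, const float *a, const float *b)
{
  for (int i = 0; i < N; i++)
    out[i] = a[i] + b[i];
}
```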
Re: [Testsuite, committed] Fix vector peeling test failures
On Tue, May 9, 2017 at 10:05 AM, Richard Biener wrote: > On Mon, May 8, 2017 at 3:49 PM, Richard Biener > wrote: >> On Mon, May 8, 2017 at 2:41 PM, Wilco Dijkstra >> wrote: >>> This fixes a few failures on ARM and AArch64 due to a recent change in >>> alignment peeling by switching the vector cost model off >>> (https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00407.html). >>> >>> Tested on AArch64, ARM and x64 - committed as obvious. >> >> Thanks. Note that I'm not sure what -fno-vect-cost-model actually means when >> peeling for alignment, so this fix might not "prevail". > > This broke the testcases -- you have to use dg-additional-options, > otherwise I now get > > gcc.dg/vect/vect-44.c -flto -ffat-lto-objects : dump file does not exist > UNRESOLVED: gcc.dg/vect/vect-44.c -flto -ffat-lto-objects > scan-tree-dump-times vect "Alignment of access forced using peeling" 1 > ... > > fixing that now. Oh, and -fno-vect-cost-model is the default anyway so your patch couldn't have fixed anything. Thus reverting instead. Richard. > Richard. > >> Richard. >> >>> ChangeLog: >>> 2017-05-08 Wilco Dijkstra >>> >>> * testsuite/gcc.dg/vect/vect-44.c: Add -fno-vect-cost-model. >>> * gcc/testsuite/gcc.dg/vect/vect-50.c: Likewise. 
>>> -- >>> >>> diff --git a/gcc/testsuite/gcc.dg/vect/vect-44.c >>> b/gcc/testsuite/gcc.dg/vect/vect-44.c >>> index >>> 186f9cfc9e26d6eb53514dec0fac176d696ec578..fbc593572429422c8e527c5e5559c515efd38aa6 >>> 100644 >>> --- a/gcc/testsuite/gcc.dg/vect/vect-44.c >>> +++ b/gcc/testsuite/gcc.dg/vect/vect-44.c >>> @@ -1,4 +1,5 @@ >>> /* { dg-require-effective-target vect_float } */ >>> +/* { dg-options "-fno-vect-cost-model" } */ >>> >>> #include >>> #include "tree-vect.h" >>> diff --git a/gcc/testsuite/gcc.dg/vect/vect-50.c >>> b/gcc/testsuite/gcc.dg/vect/vect-50.c >>> index >>> 78bfd8d3920445fe51c7393a82870ea85f62bb55..0d5febc165ee3bf3b7b595237168d9d4b9604d4b >>> 100644 >>> --- a/gcc/testsuite/gcc.dg/vect/vect-50.c >>> +++ b/gcc/testsuite/gcc.dg/vect/vect-50.c >>> @@ -1,4 +1,5 @@ >>> /* { dg-require-effective-target vect_float } */ >>> +/* { dg-options "-fno-vect-cost-model" } */ >>> >>> #include >>> #include "tree-vect.h"
[PATCH] More VRP strict-overflow stuff
This removes the remaining case (hopefully) where we disabled optimization during propagation when we encountered cases that require undefined overflow knowledge. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. I don't plan to immediately remove the warnings emitted at final folding time (though given test coverage for the propagation stuff I don't expect any fallout if I'd do that). Richard. 2017-05-09 Richard Biener * tree-vrp.c (get_single_symbol): Add assert that we don't get overflowed constants as invariant part. (compare_values_warnv): Add comment before the TREE_NO_WARNING checks. Use wi::cmp instead of recursing for integer constants. (compare_values): Just ignore whether we assumed undefined overflow instead of failing the compare. (extract_range_for_var_from_comparison_expr): Add comment before the TREE_NO_WARNING sets. (test_for_singularity): Likewise. (extract_range_from_comparison): Do not disable optimization when we assumed undefined overflow. (extract_range_basic): Remove init of unused var. Index: gcc/tree-vrp.c === --- gcc/tree-vrp.c (revision 247738) +++ gcc/tree-vrp.c (working copy) @@ -803,6 +803,8 @@ get_single_symbol (tree t, bool *neg, tr if (TREE_CODE (t) != SSA_NAME) return NULL_TREE; + gcc_assert (! inv_ || ! TREE_OVERFLOW_P (inv_)); + *neg = neg_; *inv = inv_; return t; @@ -1069,6 +1071,8 @@ compare_values_warnv (tree val1, tree va return -2; if (strict_overflow_p != NULL + /* Symbolic range building sets TREE_NO_WARNING to declare +that overflow doesn't happen. 
*/ && (!inv1 || !TREE_NO_WARNING (val1)) && (!inv2 || !TREE_NO_WARNING (val2))) *strict_overflow_p = true; @@ -1078,7 +1082,7 @@ compare_values_warnv (tree val1, tree va if (!inv2) inv2 = build_int_cst (TREE_TYPE (val2), 0); - return compare_values_warnv (inv1, inv2, strict_overflow_p); + return wi::cmp (inv1, inv2, TYPE_SIGN (TREE_TYPE (val1))); } const bool cst1 = is_gimple_min_invariant (val1); @@ -1092,6 +1096,8 @@ compare_values_warnv (tree val1, tree va return -2; if (strict_overflow_p != NULL + /* Symbolic range building sets TREE_NO_WARNING to declare +that overflow doesn't happen. */ && (!sym1 || !TREE_NO_WARNING (val1)) && (!sym2 || !TREE_NO_WARNING (val2))) *strict_overflow_p = true; @@ -1119,14 +1125,9 @@ compare_values_warnv (tree val1, tree va if (!POINTER_TYPE_P (TREE_TYPE (val1))) { - /* We cannot compare overflowed values, except for overflow -infinities. */ + /* We cannot compare overflowed values. */ if (TREE_OVERFLOW (val1) || TREE_OVERFLOW (val2)) - { - if (strict_overflow_p != NULL) - *strict_overflow_p = true; - return -2; - } + return -2; return tree_int_cst_compare (val1, val2); } @@ -1162,21 +1163,13 @@ compare_values_warnv (tree val1, tree va } } -/* Compare values like compare_values_warnv, but treat comparisons of - nonconstants which rely on undefined overflow as incomparable. */ +/* Compare values like compare_values_warnv. */ static int compare_values (tree val1, tree val2) { bool sop; - int ret; - - sop = false; - ret = compare_values_warnv (val1, val2, &sop); - if (sop - && (!is_gimple_min_invariant (val1) || !is_gimple_min_invariant (val2))) -ret = -2; - return ret; + return compare_values_warnv (val1, val2, &sop); } @@ -1499,6 +1492,7 @@ extract_range_for_var_from_comparison_ex else max = fold_build2 (MINUS_EXPR, TREE_TYPE (max), max, build_int_cst (TREE_TYPE (max), 1)); + /* Signal to compare_values_warnv this expr doesn't overflow. 
*/ if (EXPR_P (max)) TREE_NO_WARNING (max) = 1; } @@ -1538,6 +1532,7 @@ extract_range_for_var_from_comparison_ex else min = fold_build2 (PLUS_EXPR, TREE_TYPE (min), min, build_int_cst (TREE_TYPE (min), 1)); + /* Signal to compare_values_warnv this expr doesn't overflow. */ if (EXPR_P (min)) TREE_NO_WARNING (min) = 1; } @@ -3448,18 +3441,12 @@ static void extract_range_from_comparison (value_range *vr, enum tree_code code, tree type, tree op0, tree op1) { - bool sop = false; + bool sop; tree val; val = vrp_evaluate_conditional_warnv_with_ops (code, op0, op1, false, &sop, NULL); - - /* A disadvantage of using a special infinity as an overflow - representation is that we lose the ability to record overflow -
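Archive note: a toy model of the comparison this patch touches, assuming values of the form "symbol + invariant". Two values with the same symbol are ordered by their invariants, which is only valid if symbol + invariant does not overflow; the flag records that assumption, and after the patch compare_values no longer turns such a result into "unknown". The struct and function names are hypothetical and far simpler than the real tree-based code.

```c
/* Hypothetical miniature of a "symbol + invariant" value.
   sym == 0 stands for a pure constant.  */
struct symval
{
  int sym;
  long inv;
};

/* Return -1, 0 or 1 when A and B can be ordered, -2 when they
   cannot.  Set *ASSUMED_NO_OVERFLOW when the answer relies on
   sym + inv not overflowing; otherwise leave it untouched.  */
int
compare_symvals (struct symval a, struct symval b,
                 int *assumed_no_overflow)
{
  if (a.sym != b.sym)
    return -2;                    /* Different symbols: unknown.  */
  if (a.sym != 0)
    *assumed_no_overflow = 1;     /* x + 1 < x + 2 needs no overflow.  */
  return (a.inv > b.inv) - (a.inv < b.inv);
}
```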
Re: [PATCH, GCC/ARM, Stage 1] PR71607: Fix ICE when loading constant
Hi, On 5 May 2017 at 15:19, Richard Earnshaw (lists) wrote: > On 04/05/17 11:40, Prakhar Bahuguna wrote: >> On 03/05/2017 11:30:13, Richard Earnshaw (lists) wrote: >>> On 20/04/17 10:54, Prakhar Bahuguna wrote: [ARM] PR71607: Fix ICE when loading constant gcc/ChangeLog: 2017-04-18 Andre Vieira Prakhar Bahuguna PR target/71607 * config/arm/arm.md (use_literal_pool): Removes. (64-bit immediate split): No longer takes cost into consideration if 'arm_disable_literal_pool' is enabled. * config/arm/arm.c (arm_tls_referenced_p): Add diagnostic if TLS is used when arm_disable_literal_pool is enabled. (arm_max_const_double_inline_cost): Remove use of arm_disable_literal_pool. (arm_reorg): Add return if arm_disable_literal_pool is enabled. * config/arm/vfp.md (no_literal_pool_df_immediate): New. (no_literal_pool_sf_immediate): New. testsuite/ChangeLog: 2017-04-18 Andre Vieira Thomas Preud'homme Prakhar Bahuguna PR target/71607 * gcc.target/arm/thumb2-slow-flash-data.c: Renamed to ... * gcc.target/arm/thumb2-slow-flash-data-1.c: ... this. * gcc.target/arm/thumb2-slow-flash-data-2.c: New. * gcc.target/arm/thumb2-slow-flash-data-3.c: New. * gcc.target/arm/thumb2-slow-flash-data-4.c: New. * gcc.target/arm/thumb2-slow-flash-data-5.c: New. * gcc.target/arm/tls-disable-literal-pool.c: New. I've noticed that the last new test (tls-disable-literal-pool.c) fails on arm-eabi --with-mode=thumb --with-cpu=cortex-m3: no error message is generated. Thanks, Christophe Okay for stage1? >>> >>> This patch lacks a description of what's going on and why the change is >>> necessary (it should stand alone from the PR data). It's clearly a >>> non-trivial change, so why have you adopted this approach? >>> >>> R. >>> >> >> Hi, >> >> This patch is based off an earlier patch that was applied to the >> embedded-6-branch, and I had neglected to include the full description, which >> is presented below: >> >> This patch tackles the issue reported in PR71607. 
This patch takes a >> different >> approach for disabling the creation of literal pools. Instead of disabling >> the >> patterns that would normally transform the rtl into actual literal pools, it >> disables the creation of this literal pool rtl by making the target hook >> TARGET_CANNOT_FORCE_CONST_MEM return true if arm_disable_literal_pool is >> true. >> I added patterns to split floating point constants for both SF and DFmode. A >> pattern to handle the addressing of label_refs had to be included as well >> since >> all "memory_operand" patterns are disabled when TARGET_CANNOT_FORCE_CONST_MEM >> returns true. Also the pattern for splitting 32-bit immediates had to be >> changed, it was not accepting unsigned 32-bit unsigned integers with the MSB >> set. I believe const_int_operand expects the mode of the operand to be set to >> VOIDmode and not SImode. I have only changed it in the patterns that were >> affecting this code, though I suggest looking into changing it in the rest of >> the ARM backend. >> >> Additionally, the use of thread-local storage is disabled if literal pools >> are >> disabled, as there are no relocations for TLS variables and incorrect code is >> generated as a result. The patch now emits a diagnostic in TLS-enabled >> toolchains if a TLS symbol is found when -mpure-code or -mslow-flash-data are >> enabled. >> > > Thanks, that helps a lot. > > + { > + /* ARM currently does not provide relocations to encode TLS > variables > > ARM ELF does not define relocations ... > > + /* Make sure we do not attempt to create a literal pool even though > it should > + no longer be necessary to create any. */ > + if (arm_disable_literal_pool) > +return ; > + > > It would be safer to run through the code and then assert that fixups > aren't needed; though that would cost a little computation time. I > think you could put such an assert at the start of push_minipool_fix. > > OK with those changes. > > R.
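Archive note: the description above mentions splitting 32-bit immediates, including unsigned values with the MSB set, instead of loading them from a literal pool. The sketch below shows the arithmetic of such a split into two 16-bit halves. That the halves end up in a MOVW/MOVT-style instruction pair is my assumption, not something the thread states, and the names are hypothetical.

```c
#include <stdint.h>

/* Low and high 16-bit halves of a 32-bit immediate.  */
struct imm_halves
{
  uint16_t lo;
  uint16_t hi;
};

/* Split VALUE without going through a signed type, so values with
   the most significant bit set are handled like any other.  */
struct imm_halves
split_imm32 (uint32_t value)
{
  struct imm_halves h;
  h.lo = (uint16_t) (value & 0xffffu);
  h.hi = (uint16_t) (value >> 16);
  return h;
}

/* Recombine the halves; the inverse of split_imm32.  */
uint32_t
join_imm32 (struct imm_halves h)
{
  return ((uint32_t) h.hi << 16) | h.lo;
}
```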
Re: [PATCH][x86] Fix ADD[SD,SS] and SUB[SD,SS] runtime tests
On Mon, May 8, 2017 at 9:53 AM, Peryt, Sebastian wrote: > Hi, > > This patch fixes errors in runtime tests for ADDSD, ADDSS, SUBSD and SUBSS > instructions. > > gcc/testsuite/ > * gcc.target/i386/avx512f-vaddsd-2.c: Test fixed. > * gcc.target/i386/avx512f-vaddss-2.c: Ditto. > * gcc.target/i386/avx512f-vsubsd-2.c: Ditto. > * gcc.target/i386/avx512f-vsubss-2.c: Ditto. > > Is it ok for trunk? OK. Thanks, Uros.
[PATCH] VRP comments cleanup
This removes traces of "overflow infinity" from VRP comments and one unreachable case. Bootstrap / regtest running on x86_64-unknown-linux-gnu. Richard. 2017-05-09 Richard Biener * tree-vrp.c (vrp_val_is_max): Adjust comment. (vrp_val_is_min): Likewise. (set_value_range_to_value): Likewise. (set_value_range_to_nonnegative): Likewise. (gimple_assign_nonzero_p): Likewise. (gimple_stmt_nonzero_p): Likewise. (vrp_int_const_binop): Likewise. Remove unreachable case. (adjust_range_with_scev): Adjust comments. (compare_range_with_value): Likewise. (extract_range_from_phi_node): Likewise. (test_for_singularity): Likewise. Index: gcc/tree-vrp.c === --- gcc/tree-vrp.c (revision 247781) +++ gcc/tree-vrp.c (working copy) @@ -185,11 +185,10 @@ vrp_val_min (const_tree type) return TYPE_MIN_VALUE (type); } -/* Return whether VAL is equal to the maximum value of its type. This - will be true for a positive overflow infinity. We can't do a - simple equality comparison with TYPE_MAX_VALUE because C typedefs - and Ada subtypes can produce types whose TYPE_MAX_VALUE is not == - to the integer constant with the same value in the type. */ +/* Return whether VAL is equal to the maximum value of its type. + We can't do a simple equality comparison with TYPE_MAX_VALUE because + C typedefs and Ada subtypes can produce types whose TYPE_MAX_VALUE + is not == to the integer constant with the same value in the type. */ static inline bool vrp_val_is_max (const_tree val) @@ -200,8 +199,7 @@ vrp_val_is_max (const_tree val) && operand_equal_p (val, type_max, 0))); } -/* Return whether VAL is equal to the minimum value of its type. This - will be true for a negative overflow infinity. */ +/* Return whether VAL is equal to the minimum value of its type. */ static inline bool vrp_val_is_min (const_tree val) @@ -412,8 +410,7 @@ copy_value_range (value_range *to, value /* Set value range VR to a single value. 
This function is only called with values we get from statements, and exists to clear the - TREE_OVERFLOW flag so that we don't think we have an overflow - infinity when we shouldn't. */ + TREE_OVERFLOW flag. */ static inline void set_value_range_to_value (value_range *vr, tree val, bitmap equiv) @@ -424,11 +421,7 @@ set_value_range_to_value (value_range *v set_value_range (vr, VR_RANGE, val, val, equiv); } -/* Set value range VR to a non-negative range of type TYPE. - OVERFLOW_INFINITY indicates whether to use an overflow infinity - rather than TYPE_MAX_VALUE; this should be true if we determine - that the range is nonnegative based on the assumption that signed - overflow does not occur. */ +/* Set value range VR to a non-negative range of type TYPE. */ static inline void set_value_range_to_nonnegative (value_range *vr, tree type) @@ -853,10 +846,7 @@ symbolic_range_based_on_p (value_range * return (min_has_symbol || max_has_symbol); } -/* Return true if the result of assignment STMT is know to be non-zero. - If the return value is based on the assumption that signed overflow is - undefined, set *STRICT_OVERFLOW_P to true; otherwise, don't change - *STRICT_OVERFLOW_P.*/ +/* Return true if the result of assignment STMT is know to be non-zero. */ static bool gimple_assign_nonzero_p (gimple *stmt) @@ -888,10 +878,7 @@ gimple_assign_nonzero_p (gimple *stmt) } } -/* Return true if STMT is known to compute a non-zero value. - If the return value is based on the assumption that signed overflow is - undefined, set *STRICT_OVERFLOW_P to true; otherwise, don't change - *STRICT_OVERFLOW_P.*/ +/* Return true if STMT is known to compute a non-zero value. */ static bool gimple_stmt_nonzero_p (gimple *stmt) @@ -1610,10 +1597,11 @@ extract_range_from_ssa_name (value_range } -/* Wrapper around int_const_binop. If the operation overflows and we - are not using wrapping arithmetic, then adjust the result to be - -INF or +INF depending on CODE, VAL1 and VAL2. 
This can return - NULL_TREE for division by zero. */ +/* Wrapper around int_const_binop. If the operation overflows and + overflow is undefined, then adjust the result to be + -INF or +INF depending on CODE, VAL1 and VAL2. Sets *OVERFLOW_P + to whether the operation overflowed. For division by zero + the result is indeterminate but *OVERFLOW_P is set. */ static wide_int vrp_int_const_binop (enum tree_code code, tree val1, tree val2, @@ -1699,9 +1687,8 @@ vrp_int_const_binop (enum tree_code code if (overflow && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (val1))) { - /* If the operation overflowed but neither VAL1 nor VAL2 are -overflown, return -INF or +INF depending on the operation -and the combination of signs of the operands. */ + /* If the
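Archive note: the adjusted comment says that on overflow with undefined-overflow semantics the result becomes -INF or +INF and *OVERFLOW_P is set. A rough C analogue for multiplication on int follows, with INT_MIN and INT_MAX playing the role of -INF and +INF. It relies on the GCC/Clang builtin __builtin_mul_overflow rather than wide-ints, and the function name is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Multiply A by B.  On overflow, return INT_MAX or INT_MIN according
   to the sign of the mathematically exact product, and report the
   overflow through *OVERFLOW_P.  */
int
mul_saturating (int a, int b, bool *overflow_p)
{
  int res;
  *overflow_p = __builtin_mul_overflow (a, b, &res);
  if (*overflow_p)
    /* Overflow implies both operands are non-zero, so the sign of
       the exact product follows from the operand signs.  */
    res = ((a < 0) == (b < 0)) ? INT_MAX : INT_MIN;
  return res;
}
```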
Re: [PATCH] disable -Walloc-size-larger-than and -Wstringop-overflow for non-C front ends (PR 80545)
On Mon, May 8, 2017 at 4:31 PM, Martin Sebor wrote: > On 05/04/2017 10:13 PM, Jeff Law wrote: >> >> On 04/28/2017 04:02 PM, Martin Sebor wrote: >>> >>> The two options were included in -Wall and enabled for all front >>> ends but only made to be recognized by the driver for the C family >>> of compilers. That made it impossible to suppress those warnings >>> when compiling code for those other front ends (like Fortran). >>> >>> The attached patch adjusts the warnings so that they are only >>> enabled for the C family of front ends and not for any others, >>> as per Richard's suggestion. (The other solution would have >>> been to make the warnings available to all front ends. Since >>> non-C languages don't have a way of calling the affected >>> functions -- or do they? -- this is probably not necessary.) >>> >>> Martin >>> >>> gcc-80545.diff >>> >>> >>> PR driver/80545 - option -Wstringop-overflow not recognized by Fortran >>> >>> gcc/c-family/ChangeLog: >>> >>> PR driver/80545 >>> * c.opt (-Walloc-size-larger-than, -Wstringop-overflow): Enable >>> and make available for the C family only. >> >> OK. >> jeff > > > It turns out that this is not the right fix. I overlooked that > -Wstringop-overflow is meant to be enabled by default and while > removing the Init(2) bit and replacing it with LangEnabledBy (C > ObjC C++ ObjC++, Wall, 2, 0) suppresses the warning in Fortran > it also disables it by default in C/C++ unless -Wall is used. > > By my reading of the Option properties part of the GCC Internals > manual there is no way to initialize a warning to on by default > while making it available only in a subset of languages. The > only way I can think of is to initialize it in the .opt file to > something like -1 and then change it at some point to 2 somewhere > in the C/C++ front ends. That seems pretty cumbersome. Am I > missing some trick? Maybe just enhance the machinery to allow LangEnabledBy (C ObjC C++ ObjC++, , 2, 0) (note empty "by") ? > Martin
[PR80582][X86] Add missing __mm256_set[r] intrinsics
Hi,

This patch implements missing intrinsics:
_mm256_set_m128
_mm256_set_m128d
_mm256_set_m128i
_mm256_setr_m128
_mm256_setr_m128d
_mm256_setr_m128i

gcc/
	* config/i386/avxintrin.h (_mm256_set_m128, _mm256_set_m128d,
	_mm256_set_m128i, _mm256_setr_m128, _mm256_setr_m128d,
	_mm256_setr_m128i): New intrinsics.

gcc/testsuite/
	* gcc.target/i386/avx-vinsertf128-256-1: Test new intrinsics.
	* gcc.target/i386/avx-vinsertf128-256-2: Ditto.
	* gcc.target/i386/avx-vinsertf128-256-3: Ditto.

Ok for trunk?

Thanks,
Julia

Attachment: 0001-set_.patch
Re: [PR80582][X86] Add missing __mm256_set[r] intrinsics
On Tue, May 09, 2017 at 09:28:40AM +, Koval, Julia wrote: > Hi, > > This patch implements missing intrinsics: > _mm256_set_m128 > _mm256_set_m128d > _mm256_set_m128i > _mm256_setr_m128 > _mm256_setr_m128d > _mm256_setr_m128i > > gcc/ > * config/i386/avxintrin.h (_mm256_set_m128, _mm256_set_m128d, > _mm256_set_m128i, _mm256_setr_m128, _mm256_setr_m128d, > _mm256_setr_m128i): New intrinsics. > > gcc/testsuite/ > * gcc.target/i386/avx-vinsertf128-256-1: Test new intrinsics. > * gcc.target/i386/avx-vinsertf128-256-2: Ditto. > * gcc.target/i386/avx-vinsertf128-256-3: Ditto. > > Ok for trunk? --- a/gcc/config/i386/avxintrin.h +++ b/gcc/config/i386/avxintrin.h @@ -746,6 +746,7 @@ _mm256_broadcast_ps (__m128 const *__X) return (__m256) __builtin_ia32_vbroadcastf128_ps256 (__X); } + #ifdef __OPTIMIZE__ extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm256_insertf128_pd (__m256d __X, __m128d __Y, const int __O) @@ -770,7 +771,6 @@ _mm256_insertf128_si256 (__m256i __X, __m128i __Y, const int __O) (__v4si)__Y, __O); } - extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm256_insert_epi32 (__m256i __X, int __D, int const __N) { Why the above extra whitespace changes? Especially the latter looks undesirable, there should be one empty line in between different inline functions. Jakub
RE: [PR80582][X86] Add missing __mm256_set[r] intrinsics
Sorry, fixed that. Thanks, Julia -Original Message- From: Jakub Jelinek [mailto:ja...@redhat.com] Sent: Tuesday, May 09, 2017 11:36 AM To: Koval, Julia Cc: GCC Patches ; Uros Bizjak ; Kirill Yukhin Subject: Re: [PR80582][X86] Add missing __mm256_set[r] intrinsics On Tue, May 09, 2017 at 09:28:40AM +, Koval, Julia wrote: > Hi, > > This patch implements missing intrinsics: > _mm256_set_m128 > _mm256_set_m128d > _mm256_set_m128i > _mm256_setr_m128 > _mm256_setr_m128d > _mm256_setr_m128i > > gcc/ > * config/i386/avxintrin.h (_mm256_set_m128, _mm256_set_m128d, > _mm256_set_m128i, _mm256_setr_m128, _mm256_setr_m128d, > _mm256_setr_m128i): New intrinsics. > > gcc/testsuite/ > * gcc.target/i386/avx-vinsertf128-256-1: Test new intrinsics. > * gcc.target/i386/avx-vinsertf128-256-2: Ditto. > * gcc.target/i386/avx-vinsertf128-256-3: Ditto. > > Ok for trunk? --- a/gcc/config/i386/avxintrin.h +++ b/gcc/config/i386/avxintrin.h @@ -746,6 +746,7 @@ _mm256_broadcast_ps (__m128 const *__X) return (__m256) __builtin_ia32_vbroadcastf128_ps256 (__X); } + #ifdef __OPTIMIZE__ extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm256_insertf128_pd (__m256d __X, __m128d __Y, const int __O) @@ -770,7 +771,6 @@ _mm256_insertf128_si256 (__m256i __X, __m128i __Y, const int __O) (__v4si)__Y, __O); } - extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm256_insert_epi32 (__m256i __X, int __D, int const __N) { Why the above extra whitespace changes? Especially the latter looks undesirable, there should be one empty line in between different inline functions. Jakub 0001-set_.patch Description: 0001-set_.patch
[Patch, testsuite, committed] Fix cunroll-13.c failure for avr
Hi,

The test reports bogus failures because the loop variable i is
declared as int, and the constant expected in the dump doesn't fit in
an int for avr.

Fixed by explicitly using __INT32_TYPE__ for targets with
__SIZEOF_INT__ < 4.

Committed to trunk as obvious.

Regards
Senthil

gcc/testsuite/

2017-05-09  Senthil Kumar Selvaraj

	* gcc.dg/tree-ssa/cunroll-13.c: Use __INT32_TYPE__ for targets
	with __SIZEOF_INT__ < 4.

diff --git gcc/testsuite/gcc.dg/tree-ssa/cunroll-13.c gcc/testsuite/gcc.dg/tree-ssa/cunroll-13.c
index f3fe8b51468..904e6dc075b 100644
--- gcc/testsuite/gcc.dg/tree-ssa/cunroll-13.c
+++ gcc/testsuite/gcc.dg/tree-ssa/cunroll-13.c
@@ -1,10 +1,17 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -fdisable-tree-evrp -fdisable-tree-cunrolli -fdisable-tree-vrp1 -fdump-tree-cunroll-blocks-details" } */
+
+#if __SIZEOF_INT__ < 4
+__extension__ typedef __INT32_TYPE__ i32;
+#else
+typedef int i32;
+#endif
+
 struct a {int a[8];int b;};
 void
 t(struct a *a)
 {
-  for (int i=0;i<123456 && a->a[i];i++)
+  for (i32 i=0;i<123456 && a->a[i];i++)
     a->a[i]++;
 }
 /* This pass relies on the fact that we do not eliminate the redundant test for i early.
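Archive note: the arithmetic behind the fix, as a small sketch. The loop bound 123456 exceeds the largest value of a 16-bit signed int (32767), which is the int width on avr, but fits comfortably in 32 bits. The helper name fits_in_signed is hypothetical and only valid for widths of 1 to 7 bytes.

```c
/* Return 1 if VALUE is representable in a two's-complement signed
   integer of SIZE_BYTES bytes (1..7), 0 otherwise.  */
int
fits_in_signed (long long value, int size_bytes)
{
  int bits = size_bytes * 8;
  long long max = (1LL << (bits - 1)) - 1;  /* e.g. 32767 for 2 bytes.  */
  long long min = -max - 1;
  return value >= min && value <= max;
}
```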
Re: Handle data dependence relations with different bases
On Thu, May 4, 2017 at 7:21 PM, Richard Sandiford wrote: > Richard Biener writes: >> On Thu, May 4, 2017 at 2:12 PM, Richard Biener >> wrote: >>> On Wed, May 3, 2017 at 10:00 AM, Richard Sandiford >>> wrote: This patch tries to calculate conservatively-correct distance vectors for two references whose base addresses are not the same. It sets a new flag DDR_COULD_BE_INDEPENDENT_P if the dependence isn't guaranteed to occur. The motivating example is: struct s { int x[8]; }; void f (struct s *a, struct s *b) { for (int i = 0; i < 8; ++i) a->x[i] += b->x[i]; } in which the "a" and "b" accesses are either independent or have a dependence distance of 0 (assuming -fstrict-aliasing). Neither case prevents vectorisation, so we can vectorise without an alias check. I'd originally wanted to do the same thing for arrays as well, e.g.: void f (int a[][8], struct b[][8]) { for (int i = 0; i < 8; ++i) a[0][i] += b[0][i]; } I think this is valid because C11 6.7.6.2/6 says: For two array types to be compatible, both shall have compatible element types, and if both size specifiers are present, and are integer constant expressions, then both size specifiers shall have the same constant value. So if we access an array through an int (*)[8], it must have type X[8] or X[], where X is compatible with int. It doesn't seem possible in either case for "a[0]" and "b[0]" to overlap when "a != b". However, Richard B said that (at least in gimple) we support arbitrary overlap of arrays and allow arrays to be accessed with different dimensionality. There are examples of this in PR50067. I've therefore only handled references that end in a structure field access. There are two ways of handling these dependences in the vectoriser: use them to limit VF, or check at runtime as before. I've gone for the approach of checking at runtime if we can, to avoid limiting VF unnecessarily. We still fall back to a VF cap when runtime checks aren't allowed. 
The patch tests whether we queued an alias check with a dependence distance of X and then picked a VF <= X, in which case it's safe to drop the alias check. Since vect_prune_runtime_alias_check_list can be called twice with different VF for the same loop, it's no longer safe to clear may_alias_ddrs on exit. Instead we should use comp_alias_ddrs to check whether versioning is necessary. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? >>> >>> You seem to do your "fancy" thing but also later compute the old >>> base equality anyway (for same_base_p). It looks to me for this >>> case the new fancy code can be simply skipped, keeping num_dimensions >>> as before? >>> >>> + /* Try to approach equal type sizes. */ >>> + if (!COMPLETE_TYPE_P (type_a) >>> + || !COMPLETE_TYPE_P (type_b) >>> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a)) >>> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b))) >>> + break; >>> >>> ah, interesting idea to avoid a quadratic search. Note that you should >>> conservatively handle both BIT_FIELD_REF and VIEW_CONVERT_EXPR >>> as they are used for type-punning. > > All the component refs here should be REALPART_EXPRs, IMAGPART_EXPRs, > ARRAY_REFs or COMPONENT_REFs of structures, since that's all that > dr_analyze_indices allows, so I think we safe in terms of the tree codes. Yeah. I think we need to document that we should have a 1:1 match here. >>> I see nonoverlapping_component_refs_of_decl_p should simply skip >>> ARRAY_REFs - but I also see there: >>> >>> /* ??? We cannot simply use the type of operand #0 of the refs here >>> as the Fortran compiler smuggles type punning into COMPONENT_REFs >>> for common blocks instead of using unions like everyone else. */ >>> tree type1 = DECL_CONTEXT (field1); >>> tree type2 = DECL_CONTEXT (field2); >>> >>> so you probably can't simply use TREE_TYPE (outer_ref) for type >>> compatibility. >>> You also may not use types_compatible_p here as for LTO that is _way_ too >>> lax for aggregates. 
The above uses >>> >>> /* We cannot disambiguate fields in a union or qualified union. */ >>> if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE) >>> return false; >>> >>> so you should also bail out on unions here, rather than the check you do >>> later. > > The loop stops before we get to a union, so I think "only" the RECORD_TYPE > COMPONENT_REF handling is a potential problem. Does this mean that > I should use the nonoverlapping_component_refs_of_decl_p code: > > tree field1 = TREE_OPERAND (ref1, 1); > tree field2 = TREE_OPERAND (ref2, 1); > > /* ??? We cannot simpl
Re: Handle data dependence relations with different bases
On Fri, May 5, 2017 at 11:27 PM, Bernhard Reutner-Fischer wrote:
> On 4 May 2017 14:12:04 CEST, Richard Biener wrote:
>
>>nonoverlapping_component_refs_of_decl_p
>>should simply skip ARRAY_REFs - but I also see there:
>>
>>  /* ??? We cannot simply use the type of operand #0 of the refs here
>>     as the Fortran compiler smuggles type punning into COMPONENT_REFs
>>     for common blocks instead of using unions like everyone else.  */
>>  tree type1 = DECL_CONTEXT (field1);
>>  tree type2 = DECL_CONTEXT (field2);
>>
>>so you probably can't simply use TREE_TYPE (outer_ref) for type
>>compatibility.
>>You also may not use types_compatible_p here as for LTO that is _way_ too
>>lax for aggregates.  The above uses
>>
>>  /* We cannot disambiguate fields in a union or qualified union.  */
>>  if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
>>    return false;
>>
>>so you should also bail out on unions here, rather than the check you
>>do later.
>>
>>You seem to rely on getting an access_fn entry for each
>>handled_component_p.
>>It looks like this is the case -- we even seem to stop at unions (with
>>the same fortran "issue").  I'm not sure that's the best thing to do but
>>you rely on that.
>
> Is there a PR for the (IIUC) common as union?
> Maybe around
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41227
> COMMON block, BIND(C) and LTO interoperability issues

I'm not sure, this is Eric's code so maybe he remembers.

Richard.

> Thanks
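For readers following the thread, the motivating example from the original posting (struct s with an int x[8] member) can be tried out with a rough sketch like the one below. This is not code from the patch. f_chunked is a hypothetical stand-in for what a vectoriser with VF 4 might emit (load a whole block, then store it), and fill and chunked_matches_scalar are invented helpers. Under those assumptions it suggests why both possible cases, "a" and "b" being distinct objects or a == b (dependence distance 0), give the same result as the scalar loop.

```c
#include <stddef.h>

struct s { int x[8]; };

/* The scalar loop from the motivating example.  */
void f_scalar (struct s *a, const struct s *b)
{
  for (int i = 0; i < 8; ++i)
    a->x[i] += b->x[i];
}

/* Hypothetical "vectorised" form with VF 4: all loads of a block
   happen before any store of that block.  */
void f_chunked (struct s *a, const struct s *b)
{
  for (int i = 0; i < 8; i += 4)
    {
      int tmp[4];
      for (int j = 0; j < 4; ++j)
        tmp[j] = a->x[i + j] + b->x[i + j];
      for (int j = 0; j < 4; ++j)
        a->x[i + j] = tmp[j];
    }
}

/* Invented helper: fill P with BASE, BASE + 1, ...  */
static void fill (struct s *p, int base)
{
  for (int i = 0; i < 8; ++i)
    p->x[i] = base + i;
}

/* Invented helper: return 1 if the chunked loop matches the scalar one.
   ALIAS != 0 tests the a == b case (distance 0), ALIAS == 0 tests the
   fully independent case.  */
int chunked_matches_scalar (int alias)
{
  struct s a1, a2, b;
  fill (&a1, 1);
  fill (&a2, 1);
  fill (&b, 100);
  f_scalar (&a1, alias ? &a1 : &b);
  f_chunked (&a2, alias ? &a2 : &b);
  for (int i = 0; i < 8; ++i)
    if (a1.x[i] != a2.x[i])
      return 0;
  return 1;
}
```

A partial overlap of the two x arrays is the case that would break the chunked form, and that is exactly what the structure-field restriction is meant to rule out.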
Re: [RFC] S/390: Alignment peeling prolog generation
On Mon, May 8, 2017 at 6:11 PM, Robin Dapp wrote:
>> So the new part is the last point?  There's a lot of refactoring in 3/3 that
>> makes it hard to see what is actually changed ... you need to resist
>> in doing this, it makes review very hard.
>
> The new part is actually spread across the three last "-"s.  Attached is
> a new version of [3/3] split up into two patches with hopefully less
> blending of refactoring and new functionality.
>
> [3/4] Computes full costs when peeling for unknown alignment, uses
> either read or write and compares the better one with the peeling costs
> for known alignment.  If the peeling for unknown alignment "aligns" more
> than twice the number of datarefs, it is preferred over the peeling for
> known alignment.
>
> [4/4] Computes the costs for no peeling and compares them with the costs
> of the best peeling so far.  If it is not more expensive, no peeling
> will be performed.
>
>> I think it's always best to align a ref with known alignment as that simplifies
>> conditions and allows followup optimizations (unrolling of the
>> prologue / epilogue).
>> I think for this it's better to also compute full costs rather than relying on
>> sth as simple as "number of same aligned refs".
>>
>> Does the code ever end up misaligning a previously known aligned ref?
>
> The following case used to get aligned via the known alignment of dd but
> would not anymore since peeling for unknown alignment aligns two
> accesses.  I guess the determining factor is still up for scrutiny and
> should probably be > 2.  Still, on e.g. s390x no peeling is performed due
> to costs.

Ok, in principle this makes sense if we manage to correctly compute
the costs.  What exactly is profitable or not is of course subject to
the target costs.

Richard.

> void foo(int *restrict a, int *restrict b, int *restrict c, int
> *restrict d, unsigned int n)
> {
>   int *restrict dd = __builtin_assume_aligned (d, 8);
>   for (unsigned int i = 0; i < n; i++)
>     {
>       b[i] = b[i] + a[i];
>       c[i] = c[i] + b[i];
>       dd[i] = a[i];
>     }
> }
>
> Regards
>  Robin
>
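As background for the known versus unknown alignment discussion, here is a minimal sketch of how a peeling prologue's iteration count could be derived from an address. It is not taken from the vectoriser; peel_iterations is a hypothetical helper, and it assumes the misalignment is a multiple of the element size. When the misalignment is known at compile time the result is a constant (which is what makes the prologue easy to unroll); otherwise it has to be computed at run time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of scalar iterations to peel so that an
   access starting at ADDR becomes aligned to ALIGN bytes, for elements
   of ELEM_SIZE bytes.  Assumes ADDR's misalignment is a multiple of
   ELEM_SIZE; otherwise peeling can never align the access.  */
size_t peel_iterations (uintptr_t addr, size_t elem_size, size_t align)
{
  size_t misalign = (size_t) (addr % align);
  if (misalign == 0)
    return 0;
  return (align - misalign) / elem_size;
}
```

For example, a 4-byte element at an address ending in 0x4 with a 16-byte target alignment would need three peeled iterations under this sketch.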
Re: [PATCH 3/4] Vect peeling cost model
On Mon, May 8, 2017 at 6:12 PM, Robin Dapp wrote:
> gcc/ChangeLog:

+  /* Compare costs of peeling for known and unknown alignment. */
+  if (unknown_align_inside_cost > peel_for_known_alignment.inside_cost
+      || (unknown_align_inside_cost == peel_for_known_alignment.inside_cost
+          && unknown_align_outside_cost > peel_for_known_alignment.outside_cost))
+    {

no braces around single stmts.

+      dr0 = dr0_known_align;
+    }
+

I think when equal we should prefer dr0_known_align peeling.  That is,
I'd simply use

  if (unknown_align_inside_cost >= peel_for_known_alignment.inside_cost)
    dr0 = dr0_known_align;

this is because followup optimizations are easier with the
prologue/epilogue having niters known.

+  /* We might still want to try to align the datarefs with unknown
+     misalignment if peeling for known alignment aligns significantly
+     less datarefs.  */
+  if (peel_for_known_alignment.peel_info.count * 2 > unknown_align_count)
+    {
+      dr0 = dr0_known_align;

the comment doesn't match the code.  I also think this heuristic is bogus
and instead the cost computation should have figured out the correct
DR to peel in the first place.

Otherwise this patch looks ok.

Thanks,
Richard.

> 2017-05-08  Robin Dapp
>
>	* tree-vect-data-refs.c (vect_peeling_hash_choose_best_peeling):
>	Return peel info.
>	(vect_enhance_data_refs_alignment):
>	Compute full costs when peeling for unknown alignment, compare
>	to costs for peeling for known alignment and choose the cheaper
>	one.
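The difference between the posted condition and the suggested one only shows up on ties, which a small stand-alone sketch can make concrete. The struct and function names below are invented for illustration and do not exist in tree-vect-data-refs.c; only the two comparisons mirror the thread.

```c
/* Hypothetical cost pair, standing in for the inside/outside costs
   discussed in the review.  */
struct peel_cost
{
  unsigned inside_cost;
  unsigned outside_cost;
};

/* The condition as posted: lexicographic comparison, known alignment
   is chosen only if the unknown-alignment peeling is strictly worse.  */
int prefer_known_posted (struct peel_cost unknown, struct peel_cost known)
{
  return unknown.inside_cost > known.inside_cost
         || (unknown.inside_cost == known.inside_cost
             && unknown.outside_cost > known.outside_cost);
}

/* The condition as suggested in the review: compare inside costs only
   and let ties go to known alignment.  */
int prefer_known_suggested (struct peel_cost unknown, struct peel_cost known)
{
  return unknown.inside_cost >= known.inside_cost;
}
```

With equal inside costs and a cheaper outside cost for the unknown-alignment variant, the posted form would keep the unknown-alignment peeling while the suggested form would switch to the known-alignment one.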
Re: [PATCH, GCC/LTO, ping3] Fix PR69866: LTO with def for weak alias in regular object file
Ping? Best regards, Thomas On 02/05/17 17:52, Thomas Preudhomme wrote: Now that GCC 7 is released, ping? Original message below: Hi, This patch fixes an assert failure when linking one LTOed object file having a weak alias with a regular object file containing a strong definition for that same symbol. The patch is twofold: + do not add an alias to a partition if it is external + do not declare (.globl) an alias if it is external ChangeLog entries are as follow: *** gcc/lto/ChangeLog *** 2017-03-01 Thomas Preud'homme PR lto/69866 * lto/lto-partition.c (add_symbol_to_partition_1): Do not add external aliases to partition. *** gcc/ChangeLog *** 2017-03-01 Thomas Preud'homme PR lto/69866 * cgraphunit.c (cgraph_node::assemble_thunks_and_aliases): Do not declare external aliases. *** gcc/testsuite/ChangeLog *** 2017-02-28 Thomas Preud'homme PR lto/69866 * gcc.dg/lto/pr69866_0.c: New test. * gcc.dg/lto/pr69866_1.c: Likewise. Testing: Testsuite shows no regression when targeting Cortex-M3 with an arm-none-eabi GCC cross-compiler, neither does it show any regression with native LTO-bootstrapped x86-64_linux-gnu and aarch64-linux-gnu compilers. Is this ok for stage4? Best regards, Thomas On 31/03/17 18:07, Richard Biener wrote: On March 31, 2017 5:23:03 PM GMT+02:00, Jeff Law wrote: On 03/16/2017 08:05 AM, Thomas Preudhomme wrote: Ping? Is this ok for stage4? Given the lack of response from Richi, I'd suggest deferring to stage1. Honza needs to review this, i habe too little knowledge here. Richard. 
jeff diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index c82a88a599ca61b068dd9783d2a6158163809b37..580500ff922b8546d33119261a2455235edbf16d 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -1972,7 +1972,7 @@ cgraph_node::assemble_thunks_and_aliases (void) FOR_EACH_ALIAS (this, ref) { cgraph_node *alias = dyn_cast (ref->referring); - if (!alias->transparent_alias) + if (!alias->transparent_alias && !DECL_EXTERNAL (alias->decl)) { bool saved_written = TREE_ASM_WRITTEN (decl); diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c index e27d0d1690c1fcfb39e2fac03ce0f4154031fc7c..f44fd435ed075a27e373bdfdf0464eb06e1731ef 100644 --- a/gcc/lto/lto-partition.c +++ b/gcc/lto/lto-partition.c @@ -178,7 +178,8 @@ add_symbol_to_partition_1 (ltrans_partition part, symtab_node *node) /* Add all aliases associated with the symbol. */ FOR_EACH_ALIAS (node, ref) -if (!ref->referring->transparent_alias) +if (!ref->referring->transparent_alias + && ref->referring->get_partitioning_class () != SYMBOL_EXTERNAL) add_symbol_to_partition_1 (part, ref->referring); else { @@ -189,7 +190,8 @@ add_symbol_to_partition_1 (ltrans_partition part, symtab_node *node) { /* Nested transparent aliases are not permitted. 
*/ gcc_checking_assert (!ref2->referring->transparent_alias); - add_symbol_to_partition_1 (part, ref2->referring); + if (ref2->referring->get_partitioning_class () != SYMBOL_EXTERNAL) + add_symbol_to_partition_1 (part, ref2->referring); } } diff --git a/gcc/testsuite/gcc.dg/lto/pr69866_0.c b/gcc/testsuite/gcc.dg/lto/pr69866_0.c new file mode 100644 index ..f49ef8d4c1da7a21d1bfb5409d647bd18141595b --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr69866_0.c @@ -0,0 +1,13 @@ +/* { dg-lto-do link } */ + +int _umh(int i) +{ + return i+1; +} + +int weaks(int i) __attribute__((weak, alias("_umh"))); + +int main() +{ + return weaks(10); +} diff --git a/gcc/testsuite/gcc.dg/lto/pr69866_1.c b/gcc/testsuite/gcc.dg/lto/pr69866_1.c new file mode 100644 index ..3a14f850eefaffbf659ce4642adef7900330f4ed --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr69866_1.c @@ -0,0 +1,6 @@ +/* { dg-options { -fno-lto } } */ + +int weaks(int i) +{ + return i+1; +}
Re: [patch, fortran] Reduce stack use in blocked matmul
On May 05 2017, Thomas Koenig wrote:

> @@ -227,6 +226,17 @@ sinclude(`matmul_asm_'rtype_code`.m4')dnl
>    if (m == 0 || n == 0 || k == 0)
>      return;
>
> +  /* Adjust size of t1 to what is needed.  */
> +  index_type t1_dim;
> +  t1_dim = (a_dim1-1) * 256 + b_dim1;
> +  if (t1_dim > 65536)
> +    t1_dim = 65536;

What happens if (a_dim1-1) * 256 + b_dim1 > 65536?

Andreas.

--
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Re: [PATCH 4/4] Vect peeling cost model
On Mon, May 8, 2017 at 6:13 PM, Robin Dapp wrote:
> gcc/ChangeLog:
>
> 2017-05-08  Robin Dapp
>
>	* tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost):
>	Remove unused variable.
>	(vect_enhance_data_refs_alignment):
>	Compare best peelings costs to doing no peeling and choose no
>	peeling if equal.

no braces around single stmt ifs please.

+  /* Add epilogue costs.  As we do no peeling for alignment here, no prologue
+     costs will be recorded.  */
+  stmt_vector_for_cost prologue_cost_vec, epilogue_cost_vec;
+  prologue_cost_vec.create (2);
+  epilogue_cost_vec.create (2);
+
+  int dummy2;
+  nopeel_outside_cost += vect_get_known_peeling_cost
+    (loop_vinfo, vf / 2, &dummy2,
                  ^^

pass 0 instead of vf / 2?

+     &LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo),
+     &prologue_cost_vec, &epilogue_cost_vec);

+  /* Check if doing no peeling is not more expensive than the best peeling we
+     have so far.  */
+  if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo))
+      && vect_supportable_dr_alignment (dr0, false)
+      && ((nopeel_inside_cost < best_peel.inside_cost)
+          || (nopeel_inside_cost == best_peel.inside_cost
+              && nopeel_outside_cost <= best_peel.outside_cost)))
+    {
+      do_peeling = false;
+      npeel = 0;
+    }

please on tie do no peeling, thus change to

  if (...
      && nopeel_inside_cost <= best_peel.inside_cost)

I'm not sure why you test for unlimited_cost_model here as I said
elsewhere I'm not sure what not cost modeling means for static decisions.
The purpose of unlimited_cost_model is to always vectorize when possible
and omit the runtime profitability check.  So for peeling I'd just always
use the cost model.  Thus please drop this check.

Otherwise ok.

Thanks,
Richard.
[PATCH] Fix typo in -Wimplicit-fallthrough documentation
It was pointed out to me that we have "the these" in docs.  Thus fixed.

Applying to trunk.

2017-05-09  Marek Polacek

	* doc/invoke.texi: Fix typo.

diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index 3308b63..2a075b2 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -4279,10 +4279,10 @@ switch (cond)
 C++17 provides a standard way to suppress the @option{-Wimplicit-fallthrough}
 warning using @code{[[fallthrough]];} instead of the GNU attribute.  In C++11
 or C++14 users can use @code{[[gnu::fallthrough]];}, which is a GNU extension.
-Instead of the these attributes, it is also possible to add a fallthrough
-comment to silence the warning.  The whole body of the C or C++ style comment
-should match the given regular expressions listed below.  The option argument
-@var{n} specifies what kind of comments are accepted:
+Instead of these attributes, it is also possible to add a fallthrough comment
+to silence the warning.  The whole body of the C or C++ style comment should
+match the given regular expressions listed below.  The option argument @var{n}
+specifies what kind of comments are accepted:
 
 @itemize @bullet

	Marek
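For context, the two suppression styles that paragraph of invoke.texi describes look roughly like this in C. This is only an illustrative sketch: classify and the FALLTHROUGH macro are invented names, and the macro guard is an assumption so that compilers without the GNU statement attribute still accept the code.

```c
/* Use the GNU statement attribute where GCC 7 or later is known to be
   in use; otherwise fall back to a no-op expression.  */
#if defined(__GNUC__) && __GNUC__ >= 7
# define FALLTHROUGH __attribute__ ((fallthrough))
#else
# define FALLTHROUGH ((void) 0)
#endif

/* Invented example function: each case deliberately falls into the next.  */
int classify (int n)
{
  int score = 0;
  switch (n)
    {
    case 0:
      score += 1;
      /* FALLTHRU */
    case 1:
      score += 10;
      FALLTHROUGH;
    case 2:
      score += 100;
      break;
    default:
      score = -1;
      break;
    }
  return score;
}
```

The first fall-through is marked with a comment, the second with the attribute; whether a particular comment spelling is accepted depends on the level given to -Wimplicit-fallthrough=n.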
Re: [PATCH] decl lang hooks
On Mon, May 08, 2017 at 03:16:13PM -0400, Nathan Sidwell wrote:
> +/* Return the list of decls in the global namespace.  */
> +
> +static
> +tree get_global_decls ()
> +{
> +  return NAMESPACE_LEVEL (global_namespace)->names;
> +}
> +
> +/* Push DECL into the current scope.  */
> +
> +static
> +tree cxx_pushdecl (tree decl)
> +{
> +  return pushdecl (decl);
> +}

These look weird - I'd expect to see "static tree" on the same line.

	Marek
Re: [PATCH] decl lang hooks
On 05/09/2017 07:01 AM, Marek Polacek wrote:
> On Mon, May 08, 2017 at 03:16:13PM -0400, Nathan Sidwell wrote:
>> +/* Return the list of decls in the global namespace.  */
>> +
>> +static
>> +tree get_global_decls ()
>
> These look weird - I'd expect to see "static tree" on the same line.

D'oh! thanks for noticing.

nathan

--
Nathan Sidwell
Re: [PATCH] decl lang hooks
On 05/08/2017 05:34 PM, Joseph Myers wrote:
> On Mon, 8 May 2017, Nathan Sidwell wrote:
>
>> This patch changes the C++ FE to override the pushdecl and getdecl lang hooks.
>>
>> In addition to simply overriding them there, I had to fix up a couple of places
>> in c-family/c-common.c and objc/objc-gnu-runtime-abi-01.c to use the pushdecl
>> hook.
>
> The c/ and c-family/ changes are OK.

Thanks.  I've taken the objc change as therefore obvious and committed
(with the formatting fix Marek pointed out).

nathan

--
Nathan Sidwell
RE: [PATCH][x86] Fix ADD[SD,SS] and SUB[SD,SS] runtime tests
Hi,

Can you please commit it for me?

Thanks,
Sebastian

-----Original Message-----
From: Uros Bizjak [mailto:ubiz...@gmail.com]
Sent: Tuesday, May 9, 2017 10:40 AM
To: Peryt, Sebastian
Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com
Subject: Re: [PATCH][x86] Fix ADD[SD,SS] and SUB[SD,SS] runtime tests

On Mon, May 8, 2017 at 9:53 AM, Peryt, Sebastian wrote:
> Hi,
>
> This patch fixes errors in runtime tests for ADDSD, ADDSS, SUBSD and SUBSS
> instructions.
>
> gcc/testsuite/
>	* gcc.target/i386/avx512f-vaddsd-2.c: Test fixed.
>	* gcc.target/i386/avx512f-vaddss-2.c: Ditto.
>	* gcc.target/i386/avx512f-vsubsd-2.c: Ditto.
>	* gcc.target/i386/avx512f-vsubss-2.c: Ditto.
>
> Is it ok for trunk?

OK.

Thanks,
Uros.
[PATCH][x86] Add missing intrinsics for DIV[SD,SS] and MUL[SD,SS]
Hi, This patch adds missing intrinsics for DIVSD, DIVSS, MULSD and MULSS instructions. 2017-05-09 Sebastian Peryt gcc/ * config/i386/avx512fintrin.h (_mm_mask_mul_round_sd, _mm_maskz_mul_round_sd, _mm_mask_mul_round_ss, _mm_maskz_mul_round_ss, _mm_mask_div_round_sd, _mm_maskz_div_round_sd, _mm_mask_div_round_ss, _mm_maskz_div_round_ss, _mm_mask_mul_sd, _mm_maskz_mul_sd, _mm_mask_mul_ss, _mm_maskz_mul_ss, _mm_mask_div_sd, _mm_maskz_div_sd, _mm_mask_div_ss, _mm_maskz_div_ss): New intrinsics. * config/i386/i386-builtin-types.def (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT, V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): New function type aliases. * config/i386/i386-builtin.def (__builtin_ia32_divsd_mask_round, __builtin_ia32_divss_mask_round, __builtin_ia32_mulsd_mask_round, __builtin_ia32_mulss_mask_round): New builtins. * config/i386/i386.c (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT, V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): Handle new types. * config/i386/sse.md (_vm3): Renamed to ... (_vm3): ... this. (v\t{%2, %1, %0|%0, %1, %2}): Changed to ... (v\t{%2, %1, %0|%0, %1, %2}): ... this. gcc/testsuite/ * gcc.target/i386/avx512f-vdivsd-1.c (_mm_mask_div_sd, _mm_maskz_div_sd, _mm_mask_div_round_sd, _mm_maskz_div_round_sd): Test new intrinsics. * gcc.target/i386/avx512f-vdivsd-2.c: New. * gcc.target/i386/avx512f-vdivss-1.c (_mm_mask_div_ss, _mm_maskz_div_ss, _mm_mask_div_round_ss, _mm_maskz_div_round_ss): Test new intrinsics. * gcc.target/i386/avx512f-vdivss-2.c: New. * gcc.target/i386/avx512f-vmulsd-1.c (_mm_mask_mul_sd, _mm_maskz_mul_sd, _mm_mask_mul_round_sd, _mm_maskz_mul_round_sd): Test new intrinsics. * gcc.target/i386/avx512f-vmulsd-2.c: New. * gcc.target/i386/avx512f-vmulss-1.c (_mm_mask_mul_ss, _mm_maskz_mul_ss, _mm_mask_mul_round_ss, _mm_maskz_mul_round_ss): Test new intrinsics. * gcc.target/i386/avx512f-vmulss-2.c: New. 
* gcc.target/i386/avx-1.c (__builtin_ia32_divsd_mask_round, __builtin_ia32_divss_mask_round, __builtin_ia32_mulsd_mask_round, __builtin_ia32_mulss_mask_round): Test new builtins. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/sse-14.c (_mm_maskz_div_round_sd, _mm_maskz_div_round_ss, _mm_maskz_mul_round_sd, _mm_maskz_mul_round_ss): Test new intrinsics. * gcc.target/i386/testround-1.c: Ditto. Is it ok for trunk? Sebastian DIV[SD_SS]_MUL[SD_SS]_patch.patch Description: DIV[SD_SS]_MUL[SD_SS]_patch.patch
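For readers unfamiliar with the masked scalar forms, the following portable C sketch models what the _mm_mask_mul_sd and _mm_maskz_mul_sd style of intrinsic is generally understood to compute. It does not use the real intrinsics (those need AVX-512F hardware and avx512fintrin.h); v2df_sketch and both functions are invented names, and the modelled behaviour is an assumption based on the usual AVX-512 masked-scalar convention: bit 0 of the mask selects between the computed low element and either the pass-through operand (mask form) or zero (maskz form), while the upper element is copied from the first source.

```c
/* Invented two-element vector type standing in for __m128d.  */
typedef struct
{
  double lo;
  double hi;
} v2df_sketch;

/* Model of the merge-masked form: W supplies the low element when
   bit 0 of U is clear.  */
v2df_sketch mask_mul_sd_sketch (v2df_sketch w, unsigned char u,
                                v2df_sketch a, v2df_sketch b)
{
  v2df_sketch r;
  r.lo = (u & 1) ? a.lo * b.lo : w.lo;
  r.hi = a.hi;
  return r;
}

/* Model of the zero-masked form: the low element is zeroed when
   bit 0 of U is clear.  */
v2df_sketch maskz_mul_sd_sketch (unsigned char u, v2df_sketch a,
                                 v2df_sketch b)
{
  v2df_sketch r;
  r.lo = (u & 1) ? a.lo * b.lo : 0.0;
  r.hi = a.hi;
  return r;
}
```

The _round variants listed in the ChangeLog additionally take a rounding-mode argument, which this sketch leaves out.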
[PATCH][x86] Add missing intrinsics for MAX[SD,SS] and MIN[SD,SS]
Hi, This patch adds missing intrinsics for MAXSD, MAXSS, MINSD and MINSS instructions. 2017-05-09 Sebastian Peryt gcc/ * config/i386/avx512fintrin.h (_mm_mask_max_round_sd, _mm_maskz_max_round_sd, _mm_mask_max_round_ss, _mm_maskz_max_round_ss, _mm_mask_min_round_sd, _mm_maskz_min_round_sd, _mm_mask_min_round_ss, _mm_maskz_min_round_ss): New intrinsics. * config/i386/i386-builtin-types.def (V2DF, V2DF, V2DF, V2DF, UQI, INT, V4SF, V4SF, V4SF, V4SF, UQI, INT): New function type aliases. * config/i386/i386-builtin.def (__builtin_ia32_maxsd_mask_round, __builtin_ia32_maxss_mask_round, __builtin_ia32_minsd_mask_round, __builtin_ia32_minss_mask_round): New builtins. * config/i386/i386.c (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT, V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): Handle new types. * config/i386/sse.md (_vm3): Renamed to ... (_vm3): ... this. (v\t{%2, %1, %0|%0, %1, %2}): Changed to ... (v\t{%2, %1, %0|%0, %1, %2}): ... this. gcc/testsuite/ * gcc.target/i386/avx512f-vmaxsd-1.c (_mm_mask_max_round_sd, _mm_maskz_max_round_sd): Test new intrinsics. * gcc.target/i386/avx512f-vmaxsd-2.c: New. * gcc.target/i386/avx512f-vmaxss-1.c (_mm_mask_max_round_ss, _mm_maskz_max_round_ss): Test new intrinsics. * gcc.target/i386/avx512f-vmaxss-2.c: New. * gcc.target/i386/avx512f-vminsd-1.c (_mm_mask_min_round_sd, _mm_maskz_min_round_sd): Test new intrinsics. * gcc.target/i386/avx512f-vminsd-2.c: New. * gcc.target/i386/avx512f-vminss-1.c (_mm_mask_min_round_ss, _mm_maskz_min_round_ss): Test new intrinsics. * gcc.target/i386/avx512f-vminss-2.c: New. * gcc.target/i386/avx-1.c (__builtin_ia32_maxsd_mask_round, __builtin_ia32_maxss_mask_round, __builtin_ia32_minsd_mask_round, __builtin_ia32_minss_mask_round): Test new builtins. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. 
* gcc.target/i386/sse-14.c (_mm_maskz_max_round_sd, _mm_maskz_max_round_ss, _mm_maskz_min_round_sd, _mm_maskz_min_round_ss, _mm_mask_max_round_sd, _mm_mask_max_round_ss, _mm_mask_min_round_sd, _mm_mask_min_round_ss): Test new intrinsics. * gcc.target/i386/testround-1.c: Ditto. Is it ok for trunk? Thanks, Sebastian MAX[SD_SS]_MIN[SD_SS]_patch.patch Description: MAX[SD_SS]_MIN[SD_SS]_patch.patch
Re: [RFC][PATCH] Introduce -fdump*-folding
On 05/05/2017 01:50 PM, Richard Biener wrote: > On Thu, May 4, 2017 at 1:10 PM, Martin Liška wrote: >> On 05/04/2017 12:40 PM, Richard Biener wrote: >>> >>> On Thu, May 4, 2017 at 11:22 AM, Martin Liška wrote: On 05/03/2017 12:12 PM, Richard Biener wrote: > > > On Wed, May 3, 2017 at 10:10 AM, Martin Liška wrote: >> >> >> Hello >> >> Last release cycle I spent quite some time with reading of IVOPTS pass >> dump file. Using -fdump*-details causes to generate a lot of 'Applying >> pattern' >> lines, which can make reading of a dump file more complicated. >> >> There are stats for tramp3d with -O2 and -fdump-tree-all-details. >> Percentage number >> shows how many lines are of the aforementioned pattern: >> >> tramp3d-v4.cpp.164t.ivopts: 6.34% >> tramp3d-v4.cpp.091t.ccp2: 5.04% >> tramp3d-v4.cpp.093t.cunrolli: 4.41% >> tramp3d-v4.cpp.129t.laddress: 3.70% >> tramp3d-v4.cpp.032t.ccp1: 2.31% >> tramp3d-v4.cpp.038t.evrp: 1.90% >> tramp3d-v4.cpp.033t.forwprop1: 1.74% >> tramp3d-v4.cpp.103t.vrp1: 1.52% >> tramp3d-v4.cpp.124t.forwprop3: 1.31% >> tramp3d-v4.cpp.181t.vrp2: 1.30% >>tramp3d-v4.cpp.161t.cunroll: 1.22% >> tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% >>tramp3d-v4.cpp.153t.ivcanon: 1.07% >> tramp3d-v4.cpp.126t.ccp3: 0.96% >> tramp3d-v4.cpp.143t.sccp: 0.91% >> tramp3d-v4.cpp.185t.forwprop4: 0.82% >>tramp3d-v4.cpp.011t.cfg: 0.74% >> tramp3d-v4.cpp.096t.forwprop2: 0.50% >> tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% >> tramp3d-v4.cpp.120t.phicprop1: 0.33% >>tramp3d-v4.cpp.133t.pre: 0.32% >> tramp3d-v4.cpp.182t.phicprop2: 0.27% >> tramp3d-v4.cpp.170t.veclower21: 0.25% >>tramp3d-v4.cpp.029t.einline: 0.24% >> >> I'm suggesting to add new TDF that will be allocated for that. >> Patch can bootstrap on ppc64le-redhat-linux and survives regression >> tests. >> >> Thoughts? > > > > Ok. Soon we'll want to change dump_flags to uint64_t ... (we have 1 > bit > left > if you allow negative dump_flags). 
It'll tickle down on a lot of > interfaces > so introducing dump_flags_t at the same time might be a good idea. Hello. I've prepared patch that migrates all interfaces and introduces dump_flags_t. >>> >>> >>> Great. >>> I've been currently testing that. Apart from that Richi requested to come up with more generic approach of hierarchical structure of options. >>> >>> >>> Didn't really "request" it, it's just something we eventually need to do >>> when >>> we run out of bits again ;) >> >> >> I know, but it was me who came up with the idea of more fine suboptions :) >> >>> Can you please take a look at self-contained source file that shows way I've decided to go? Another question is whether we want to implement also "aliases", where for instance current 'all' is equal to union of couple of suboptions? >>> >>> >>> Yeah, I think we do want -all-all-all and -foo-all to work. Not sure >>> about -all-foo-all. >> >> >> Actually only having 'all' is quite easy to implement. >> >> Let's imagine following hierarchy: >> >> (root) >> - vops >> - folding >> - gimple >> - ctor >> - array_ref >> - arithmetic >> - generic >> - c >> - c++ >> - ctor >> - xyz >> >> Then '-fdump-passname-folding-all' will be equal to >> '-fdump-passname-folding'. > > Ok, so you envision that sub-options restrict stuff. I thought of > > -gimple >-vops > -generic >-folding > > so the other way around. We do not have many options that would be RTL > specific but gimple only are -vops -alias -scev -gimple -rhs-only > -verbose -memsyms > while RTL has -cselib. -eh sounds gimple specific. Then there's the optgroup > stuff you already saw. > > So it looks like a 8 bit "group id" plus 56 bits of flags would do. > > Yes, this implies reworking how & and | work. For example you can't > | dump-flags of different groups. Well, I'm not opposed to idea of converting that to way you described. So, you're willing to introduce something like: (root) - generic - eh - folding - ... 
- gimple - vops - folding - rhs-only - ... - vops - rtl - cselib - ... ? > >>> >>> The important thing is to make sure dump_flags_t stays POD and thus is
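The "8 bit group id plus 56 bits of flags" idea mentioned above can be sketched in a few lines of C. This is only one possible reading of the proposal, not the dump_flags_t that was eventually implemented; every name below (dump_flags_sketch_t, the GROUP_ constants, make_flags, flags_union and so on) is hypothetical. It keeps the value a plain 64-bit integer, so it stays POD, and it refuses to OR together flags from different groups.

```c
#include <stdint.h>

/* Hypothetical representation: top 8 bits are a group id, the low
   56 bits are per-group flag bits.  */
typedef uint64_t dump_flags_sketch_t;

#define GROUP_SHIFT 56
#define FLAG_MASK ((UINT64_C (1) << GROUP_SHIFT) - 1)

/* Invented group ids mirroring the generic / gimple / rtl split.  */
enum { GROUP_GENERIC = 0, GROUP_GIMPLE = 1, GROUP_RTL = 2 };

/* Build a flags value from a group id and its flag bits.  */
dump_flags_sketch_t make_flags (unsigned group, uint64_t bits)
{
  return ((uint64_t) group << GROUP_SHIFT) | (bits & FLAG_MASK);
}

/* Extract the group id.  */
unsigned flags_group (dump_flags_sketch_t f)
{
  return (unsigned) (f >> GROUP_SHIFT);
}

/* Extract the per-group flag bits.  */
uint64_t flags_bits (dump_flags_sketch_t f)
{
  return f & FLAG_MASK;
}

/* Union of two flag sets.  Returns 1 and stores the result in *OUT if
   both belong to the same group, 0 (leaving *OUT untouched) otherwise.  */
int flags_union (dump_flags_sketch_t a, dump_flags_sketch_t b,
                 dump_flags_sketch_t *out)
{
  if (flags_group (a) != flags_group (b))
    return 0;
  *out = make_flags (flags_group (a), flags_bits (a) | flags_bits (b));
  return 1;
}
```

A plain `|` on two such values from different groups would silently corrupt the group id, which is presumably why the thread notes that & and | would need reworking.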
Re: [PATCH 2/N] Add dump_flags_type for handling of suboptions.
On 05/05/2017 12:44 PM, Martin Liška wrote: > Hi. > > This one is more interesting as it implements hierarchical option parsing > and as a first step I implemented that for optgroup suboptions. > > Next candidates are dump_option_value_info and obviously my primary > motivation: > dump_option_value_info. > > I'm expecting feedback for implementation I've decided to come up with. > Patch has been tested. > > Thanks, > Martin > Update version of the patch. Actually it contains of 2 parts, where the second one is mechanical replacement of enum values. It's still work-in-progress as we're still tuning internal representation. Martin 0002-Port-MSG_-option-values-to-OPGROUP_.patch.bz2 Description: application/bzip >From 1a6f1dc6333bef2a643eacacbbc31f83a10e28ee Mon Sep 17 00:00:00 2001 From: marxin Date: Tue, 9 May 2017 13:22:23 +0200 Subject: [PATCH 1/2] Port -fopt-info to new option infrastructure. --- gcc/dumpfile.c | 332 ++-- gcc/dumpfile.h | 125 ++--- gcc/passes.c| 2 +- gcc/tree-pass.h | 2 +- 4 files changed, 240 insertions(+), 221 deletions(-) diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c index 82c4fc9d4ff..e3db0a5ff7e 100644 --- a/gcc/dumpfile.c +++ b/gcc/dumpfile.c @@ -33,10 +33,10 @@ along with GCC; see the file COPYING3. If not see #define skip_leading_substring(whole, part) \ (strncmp (whole, part, strlen (part)) ? NULL : whole + strlen (part)) -static dump_flags_t pflags; /* current dump_flags */ -static dump_flags_t alt_flags; /* current opt_info flags */ +static dump_flags_t pflags; /* current dump_flags */ +static optgroup_dump_flags_t opt_info_flags; /* current opt_info flags */ -static void dump_loc (dump_flags_t, FILE *, source_location); +static void dump_loc (optgroup_dump_flags_t, FILE *, source_location); static FILE *dump_open_alternate_stream (struct dump_file_info *); /* These are currently used for communicating between passes. @@ -51,31 +51,33 @@ dump_flags_t dump_flags; TREE_DUMP_INDEX enumeration in dumpfile.h. 
*/ static struct dump_file_info dump_files[TDI_end] = { - {NULL, NULL, NULL, NULL, NULL, NULL, NULL, 0, 0, 0, 0, 0, 0, false, false}, + {NULL, NULL, NULL, NULL, NULL, NULL, NULL, 0, optgroup_dump_flags_t (), + optgroup_dump_flags_t (), 0, 0, 0, false, false}, {".cgraph", "ipa-cgraph", NULL, NULL, NULL, NULL, NULL, TDF_IPA, - 0, 0, 0, 0, 0, false, false}, - {".type-inheritance", "ipa-type-inheritance", NULL, NULL, NULL, NULL, NULL, TDF_IPA, - 0, 0, 0, 0, 0, false, false}, + optgroup_dump_flags_t (), optgroup_dump_flags_t (), 0, 0, 0, false, false}, + {".type-inheritance", "ipa-type-inheritance", NULL, NULL, NULL, NULL, NULL, +TDF_IPA, optgroup_dump_flags_t (), optgroup_dump_flags_t (), 0, 0, 0, +false, false}, {".ipa-clones", "ipa-clones", NULL, NULL, NULL, NULL, NULL, TDF_IPA, - 0, 0, 0, 0, 0, false, false}, + optgroup_dump_flags_t (), optgroup_dump_flags_t (), 0, 0, 0, false, false}, {".tu", "translation-unit", NULL, NULL, NULL, NULL, NULL, TDF_TREE, - 0, 0, 0, 0, 1, false, false}, + optgroup_dump_flags_t (), optgroup_dump_flags_t (), 0, 0, 1, false, false}, {".class", "class-hierarchy", NULL, NULL, NULL, NULL, NULL, TDF_TREE, - 0, 0, 0, 0, 2, false, false}, + optgroup_dump_flags_t (), optgroup_dump_flags_t (), 0, 0, 2, false, false}, {".original", "tree-original", NULL, NULL, NULL, NULL, NULL, TDF_TREE, - 0, 0, 0, 0, 3, false, false}, + optgroup_dump_flags_t (), optgroup_dump_flags_t (), 0, 0, 3, false, false}, {".gimple", "tree-gimple", NULL, NULL, NULL, NULL, NULL, TDF_TREE, - 0, 0, 0, 0, 4, false, false}, + optgroup_dump_flags_t (), optgroup_dump_flags_t (), 0, 0, 4, false, false}, {".nested", "tree-nested", NULL, NULL, NULL, NULL, NULL, TDF_TREE, - 0, 0, 0, 0, 5, false, false}, + optgroup_dump_flags_t (), optgroup_dump_flags_t (), 0, 0, 5, false, false}, #define FIRST_AUTO_NUMBERED_DUMP 6 {NULL, "tree-all", NULL, NULL, NULL, NULL, NULL, TDF_TREE, - 0, 0, 0, 0, 0, false, false}, + optgroup_dump_flags_t (), optgroup_dump_flags_t (), 0, 0, 0, false, false}, 
{NULL, "rtl-all", NULL, NULL, NULL, NULL, NULL, TDF_RTL, - 0, 0, 0, 0, 0, false, false}, + optgroup_dump_flags_t (), optgroup_dump_flags_t (), 0, 0, 0, false, false}, {NULL, "ipa-all", NULL, NULL, NULL, NULL, NULL, TDF_IPA, - 0, 0, 0, 0, 0, false, false}, + optgroup_dump_flags_t (), optgroup_dump_flags_t (), 0, 0, 0, false, false}, }; /* Define a name->number mapping for a dump flag value. */ @@ -94,9 +96,7 @@ static const struct dump_option_value_info dump_options[] = {"slim", TDF_SLIM}, {"raw", TDF_RAW}, {"graph", TDF_GRAPH}, - {"details", (TDF_DETAILS | MSG_OPTIMIZED_LOCATIONS - | MSG_MISSED_OPTIMIZATION - | MSG_NOTE)}, + {"details", TDF_DETAILS}, {"cselib", TDF_CSELIB}, {"stats", TDF_STATS}, {"blocks", TDF_BLOCKS}, @@ -112,10 +112,6 @@ static cons
[PATCH] new -fdump flag
This patch adds a new '-fdump-front-end' flag and associated dump machinery. I use it on the modules branch, as that's sufficiently complex to need a dumper. The dump file is unnumbered with a '.fe' suffix. Perhaps it will be useful for other front ends too. I'm also prepared to remove the -fdump-translation-unit dumper, which is a completely inscrutable c++ only dump, that I think is well past its best-before date. (Jason?) If there's a preference to hold this off until merging modules, that's fine. Just thought I'd float it now. nathan -- Nathan Sidwell 2017-05-09 Nathan Sidwell Front end dump file. gcc/ * dumpfile.h (tree_dump_index): Add TDI_lang. (TDF_LANG): New. * dumpfile.c (dump_files): Add front-end. (dump_option_value_info): Add lang. Adjust all. gcc/doc/ * invoke.texi (-fdump-front-end): Document. Index: doc/invoke.texi === --- doc/invoke.texi (revision 247784) +++ doc/invoke.texi (working copy) @@ -542,6 +542,7 @@ Objective-C and Objective-C++ Dialects}. -fdump-noaddr -fdump-unnumbered -fdump-unnumbered-links @gol -fdump-class-hierarchy@r{[}-@var{n}@r{]} @gol -fdump-final-insns@r{[}=@var{file}@r{]} +-fdump-front-end @gol -fdump-ipa-all -fdump-ipa-cgraph -fdump-ipa-inline @gol -fdump-passes @gol -fdump-rtl-@var{pass} -fdump-rtl-@var{pass}=@var{filename} @gol @@ -12948,6 +12953,11 @@ same directory as the output file. If t is used, @var{options} controls the details of the dump as described for the @option{-fdump-tree} options. +@item -fdump-front-end +@opindex fdump-front-end +Dump front-end-specific information. The file name is made by appending +@file{.fe} to the source file name. 
+ @item -fdump-ipa-@var{switch} @opindex fdump-ipa Control the dumping at various stages of inter-procedural analysis Index: dumpfile.c === --- dumpfile.c (revision 247784) +++ dumpfile.c (working copy) @@ -51,6 +51,8 @@ int dump_flags; static struct dump_file_info dump_files[TDI_end] = { {NULL, NULL, NULL, NULL, NULL, NULL, NULL, 0, 0, 0, 0, 0, 0, false, false}, + {".fe", "front-end", NULL, NULL, NULL, NULL, NULL, TDF_LANG, + OPTGROUP_OTHER, 0, 0, 0, -1, false, false}, {".cgraph", "ipa-cgraph", NULL, NULL, NULL, NULL, NULL, TDF_IPA, 0, 0, 0, 0, 0, false, false}, {".type-inheritance", "ipa-type-inheritance", NULL, NULL, NULL, NULL, NULL, TDF_IPA, @@ -115,10 +117,11 @@ static const struct dump_option_value_in {"missed", MSG_MISSED_OPTIMIZATION}, {"note", MSG_NOTE}, {"optall", MSG_ALL}, + {"lang", TDF_LANG}, {"all", ~(TDF_RAW | TDF_SLIM | TDF_LINENO | TDF_TREE | TDF_RTL | TDF_IPA | TDF_STMTADDR | TDF_GRAPH | TDF_DIAGNOSTIC | TDF_VERBOSE | TDF_RHS_ONLY | TDF_NOUID | TDF_ENUMERATE_LOCALS | TDF_SCEV - | TDF_GIMPLE)}, + | TDF_GIMPLE | TDF_LANG)}, {NULL, 0} }; Index: dumpfile.h === --- dumpfile.h (revision 247784) +++ dumpfile.h (working copy) @@ -27,6 +27,7 @@ along with GCC; see the file COPYING3. enum tree_dump_index { TDI_none, /* No dump */ + TDI_lang, /* Lang-specific. */ TDI_cgraph, /* dump function call graph. */ TDI_inheritance, /* dump type inheritance graph. */ TDI_clones, /* dump IPA cloning decisions. */ @@ -89,7 +90,7 @@ enum tree_dump_index #define MSG_NOTE (1 << 29) /* general optimization info */ #define MSG_ALL (MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION \ | MSG_NOTE) - +#define TDF_LANG (1 << 30) /* Lang-specific dumper. */ /* Flags to control high-level -fopt-info dumps. Usually these flags define a group of passes. An optimization pass can be part of
[c++ PATCH] PR c++/80682
Tested on Linux-x64, not tested with the full suite yet.

2017-05-09  Ville Voutilainen

    gcc/

    PR c++/80682
    * cp/method.c (is_trivially_xible): Reject void types.

    testsuite/

    PR c++/80682
    * g++.dg/ext/is_trivially_constructible1.C: Add tests for void target.

diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index b4c1f60..911f6ee 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -1207,6 +1207,8 @@ constructible_expr (tree to, tree from)
 bool
 is_trivially_xible (enum tree_code code, tree to, tree from)
 {
+  if (to == void_type_node)
+    return false;
   tree expr;
   if (code == MODIFY_EXPR)
     expr = assignable_expr (to, from);
diff --git a/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C b/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C
index a5bac7b..bfe17dc 100644
--- a/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C
+++ b/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C
@@ -27,10 +27,12 @@ SA(!__is_trivially_constructible(C,C&));
 SA(__is_trivially_assignable(C,C&));
 SA(!__is_trivially_assignable(C,C));
 SA(!__is_trivially_assignable(C,C&&));
+SA(!__is_trivially_assignable(void,int));
 
 SA(__is_trivially_constructible(int,int));
 SA(__is_trivially_constructible(int,double));
 SA(!__is_trivially_constructible(int,B));
+SA(!__is_trivially_constructible(void,int));
 
 SA(!__is_trivially_constructible(D));
Re: [RFC][PATCH] Introduce -fdump*-folding
On Tue, May 9, 2017 at 2:01 PM, Martin Liška wrote: > On 05/05/2017 01:50 PM, Richard Biener wrote: >> On Thu, May 4, 2017 at 1:10 PM, Martin Liška wrote: >>> On 05/04/2017 12:40 PM, Richard Biener wrote: On Thu, May 4, 2017 at 11:22 AM, Martin Liška wrote: > > On 05/03/2017 12:12 PM, Richard Biener wrote: >> >> >> On Wed, May 3, 2017 at 10:10 AM, Martin Liška wrote: >>> >>> >>> Hello >>> >>> Last release cycle I spent quite some time with reading of IVOPTS pass >>> dump file. Using -fdump*-details causes to generate a lot of 'Applying >>> pattern' >>> lines, which can make reading of a dump file more complicated. >>> >>> There are stats for tramp3d with -O2 and -fdump-tree-all-details. >>> Percentage number >>> shows how many lines are of the aforementioned pattern: >>> >>> tramp3d-v4.cpp.164t.ivopts: 6.34% >>> tramp3d-v4.cpp.091t.ccp2: 5.04% >>> tramp3d-v4.cpp.093t.cunrolli: 4.41% >>> tramp3d-v4.cpp.129t.laddress: 3.70% >>> tramp3d-v4.cpp.032t.ccp1: 2.31% >>> tramp3d-v4.cpp.038t.evrp: 1.90% >>> tramp3d-v4.cpp.033t.forwprop1: 1.74% >>> tramp3d-v4.cpp.103t.vrp1: 1.52% >>> tramp3d-v4.cpp.124t.forwprop3: 1.31% >>> tramp3d-v4.cpp.181t.vrp2: 1.30% >>>tramp3d-v4.cpp.161t.cunroll: 1.22% >>> tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% >>>tramp3d-v4.cpp.153t.ivcanon: 1.07% >>> tramp3d-v4.cpp.126t.ccp3: 0.96% >>> tramp3d-v4.cpp.143t.sccp: 0.91% >>> tramp3d-v4.cpp.185t.forwprop4: 0.82% >>>tramp3d-v4.cpp.011t.cfg: 0.74% >>> tramp3d-v4.cpp.096t.forwprop2: 0.50% >>> tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% >>> tramp3d-v4.cpp.120t.phicprop1: 0.33% >>>tramp3d-v4.cpp.133t.pre: 0.32% >>> tramp3d-v4.cpp.182t.phicprop2: 0.27% >>> tramp3d-v4.cpp.170t.veclower21: 0.25% >>>tramp3d-v4.cpp.029t.einline: 0.24% >>> >>> I'm suggesting to add new TDF that will be allocated for that. >>> Patch can bootstrap on ppc64le-redhat-linux and survives regression >>> tests. >>> >>> Thoughts? >> >> >> >> Ok. Soon we'll want to change dump_flags to uint64_t ... 
(we have 1 >> bit >> left >> if you allow negative dump_flags). It'll tickle down on a lot of >> interfaces >> so introducing dump_flags_t at the same time might be a good idea. > > > > Hello. > > I've prepared patch that migrates all interfaces and introduces > dump_flags_t. Great. > I've been > currently testing that. Apart from that Richi requested to come up with > more > generic approach > of hierarchical structure of options. Didn't really "request" it, it's just something we eventually need to do when we run out of bits again ;) >>> >>> >>> I know, but it was me who came up with the idea of more fine suboptions :) >>> > > Can you please take a look at self-contained source file that shows way > I've > decided to go? > Another question is whether we want to implement also "aliases", where > for > instance > current 'all' is equal to union of couple of suboptions? Yeah, I think we do want -all-all-all and -foo-all to work. Not sure about -all-foo-all. >>> >>> >>> Actually only having 'all' is quite easy to implement. >>> >>> Let's imagine following hierarchy: >>> >>> (root) >>> - vops >>> - folding >>> - gimple >>> - ctor >>> - array_ref >>> - arithmetic >>> - generic >>> - c >>> - c++ >>> - ctor >>> - xyz >>> >>> Then '-fdump-passname-folding-all' will be equal to >>> '-fdump-passname-folding'. >> >> Ok, so you envision that sub-options restrict stuff. I thought of >> >> -gimple >>-vops >> -generic >>-folding >> >> so the other way around. We do not have many options that would be RTL >> specific but gimple only are -vops -alias -scev -gimple -rhs-only >> -verbose -memsyms >> while RTL has -cselib. -eh sounds gimple specific. Then there's the optgroup >> stuff you already saw. >> >> So it looks like a 8 bit "group id" plus 56 bits of flags would do. >> >> Yes, this implies reworking how & and | work. For example you can't >> | dump-flags of different groups. > > Well, I'm not opposed to idea of converting that to way you described. 
> So, you're willing to introduce something like: > > (root) > - generic >
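A minimal sketch of the "8 bit group id plus 56 bits of flags" layout discussed in this thread, under stated assumptions: the type `dump_flags_sketch`, the group ids, the flag names and the helper functions are all hypothetical and do not come from any posted patch. It only illustrates why flags from different groups cannot simply be OR-ed together once the same bit means different things per group.

```cpp
#include <cstdint>

// Hypothetical 64-bit layout: the top 8 bits hold a group id, the low
// 56 bits hold flags that are only meaningful within that group.
struct dump_flags_sketch
{
  std::uint64_t bits;
};

constexpr int GROUP_SHIFT = 56;
constexpr std::uint64_t FLAG_MASK = (std::uint64_t{1} << GROUP_SHIFT) - 1;

// Illustrative group ids (assumptions, not GCC's).
enum dump_group : std::uint64_t
{
  GROUP_GENERIC = 0,
  GROUP_GIMPLE = 1,
  GROUP_RTL = 2
};

// Illustrative flags: bit 0 is reused by two groups on purpose.
constexpr std::uint64_t GIMPLE_VOPS = std::uint64_t{1} << 0;
constexpr std::uint64_t GIMPLE_SCEV = std::uint64_t{1} << 1;
constexpr std::uint64_t RTL_CSELIB = std::uint64_t{1} << 0;

constexpr dump_flags_sketch make_flags (dump_group g, std::uint64_t flags)
{
  return { (static_cast<std::uint64_t> (g) << GROUP_SHIFT) | (flags & FLAG_MASK) };
}

constexpr std::uint64_t group_of (dump_flags_sketch f)
{
  return f.bits >> GROUP_SHIFT;
}

constexpr std::uint64_t flags_of (dump_flags_sketch f)
{
  return f.bits & FLAG_MASK;
}

constexpr bool same_group (dump_flags_sketch a, dump_flags_sketch b)
{
  return group_of (a) == group_of (b);
}

// OR the flag bits only when both operands belong to the same group;
// otherwise return A unchanged, since mixing groups has no meaning here.
constexpr dump_flags_sketch combine (dump_flags_sketch a, dump_flags_sketch b)
{
  return same_group (a, b) ? dump_flags_sketch{ a.bits | flags_of (b) } : a;
}
```

A real design would also have to decide what `&` means across groups and how option names map to (group, flag) pairs; the sketch leaves both open.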
Re: [PATCH] new -fdump flag
On Tue, May 9, 2017 at 2:05 PM, Nathan Sidwell wrote:
> This patch adds a new '-fdump-front-end' flag and associated dump machinery.
> I use it on the modules branch, as that's sufficiently complex to need a
> dumper.  The dump file is unnumbered with a '.fe' suffix.  Perhaps it will
> be useful for other front ends too.
>
> I'm also prepared to remove the -fdump-translation-unit dumper, which is a
> completely inscrutable c++ only dump, that I think is well past its
> best-before date. (Jason?)
>
> If there's a preference to hold this off until merging modules, that's fine.
> Just thought I'd float it now.

Can you please use something other than 'front-end', specifically something
without a dash?  Maybe simply 'lang'?

Why do you need a new sub-switch ('-lang')?  Is this for dumping into
other dump-files from, say, langhooks?

Richard.

> nathan
> --
> Nathan Sidwell
[committed] Another 6.x backport (PR testsuite/80678)
Hi! Apparently one of the testcases in my recent 6.x backports ICEs on powerpc64le-linux and doesn't on the trunk/7.x. The following backports fix that, bootstrapped/regtested on x86_64-linux and i686-linux and additionally tested with a cross-compiler on the constexpr-77* and pr71310.c testcases, committed to 6.x branch. 2017-05-09 Jakub Jelinek PR testsuite/80678 2016-06-14 Richard Biener PR middle-end/71310 PR bootstrap/71510 * expr.h (get_bit_range): Declare. * expr.c (get_bit_range): Export. * fold-const.c (optimize_bit_field_compare): Use get_bit_range and word_mode again to constrain the bitfield access. 2016-06-11 Segher Boessenkool PR middle-end/71310 * fold-const.c (optimize_bit_field_compare): Don't try to use word_mode unconditionally for reading the bit field, look at DECL_BIT_FIELD_REPRESENTATIVE instead. * gcc.target/powerpc/pr71310.c: New testcase. --- gcc/fold-const.c(revision 237318) +++ gcc/fold-const.c(revision 237426) @@ -3902,9 +3902,19 @@ return 0; } + /* Honor the C++ memory model and mimic what RTL expansion does. */ + unsigned HOST_WIDE_INT bitstart = 0; + unsigned HOST_WIDE_INT bitend = 0; + if (TREE_CODE (lhs) == COMPONENT_REF) +{ + get_bit_range (&bitstart, &bitend, lhs, &lbitpos, &offset); + if (offset != NULL_TREE) + return 0; +} + /* See if we can find a mode to refer to this field. We should be able to, but fail if we can't. */ - nmode = get_best_mode (lbitsize, lbitpos, 0, 0, + nmode = get_best_mode (lbitsize, lbitpos, bitstart, bitend, const_p ? TYPE_ALIGN (TREE_TYPE (linner)) : MIN (TYPE_ALIGN (TREE_TYPE (linner)), TYPE_ALIGN (TREE_TYPE (rinner))), --- gcc/expr.c (revision 237425) +++ gcc/expr.c (revision 237426) @@ -4795,7 +4795,7 @@ optimize_bitfield_assignment_op (unsigne If the access does not need to be restricted, 0 is returned in both *BITSTART and *BITEND. 
*/ -static void +void get_bit_range (unsigned HOST_WIDE_INT *bitstart, unsigned HOST_WIDE_INT *bitend, tree exp, --- gcc/expr.h (revision 237425) +++ gcc/expr.h (revision 237426) @@ -242,6 +242,10 @@ extern rtx push_block (rtx, int, int); extern bool emit_push_insn (rtx, machine_mode, tree, rtx, unsigned int, int, rtx, int, rtx, rtx, int, rtx, bool); +/* Extract the accessible bit-range from a COMPONENT_REF. */ +extern void get_bit_range (unsigned HOST_WIDE_INT *, unsigned HOST_WIDE_INT *, + tree, HOST_WIDE_INT *, tree *); + /* Expand an assignment that stores the value of FROM into TO. */ extern void expand_assignment (tree, tree, bool); --- gcc/testsuite/gcc.target/powerpc/pr71310.c (nonexistent) +++ gcc/testsuite/gcc.target/powerpc/pr71310.c (revision 237319) @@ -0,0 +1,23 @@ +/* { dg-do compile { target { powerpc*-*-* } } } */ +/* { dg-options "-O2" } */ + +/* { dg-final { scan-assembler-not {\mld} } } */ +/* { dg-final { scan-assembler-not {\mlwz} } } */ +/* { dg-final { scan-assembler-times {\mlbz} 2 } } */ + +struct mmu_gather { +long end; +int fullmm : 1; +}; + +void __tlb_reset_range(struct mmu_gather *p1) +{ +if (p1->fullmm) +p1->end = 0; +} + +void tlb_gather_mmu(struct mmu_gather *p1) +{ +p1->fullmm = 1; +__tlb_reset_range(p1); +} Jakub
Re: [patch, fortran] Reduce stack use in blocked matmul
Hi,

On 8 May 2017 at 18:58, Jerry DeLisle wrote:
> On 05/05/2017 01:31 PM, Thomas Koenig wrote:
>> Hello world,
>>
>> the attached patch reduces the stack usage by the blocked
>> version of matmul for cases where we don't need the full buffer.
>> This should improve stack usage.
>>
>> Regression-tested.  I also added a stress test (around 3 secs of
>> CPU time on my system), it will only run once due to the "dg-do run"
>> hack).
>>
>> OK for trunk?
>>
>
> OK, thanks.
>

Since this was committed (r247753), I've noticed the following failures
on arm* targets:

  - PASS now FAIL             [PASS => FAIL]:

  Executed from: gfortran.dg/dg.exp
    gfortran.dg/allocatable_function_8.f90   -O0  execution test
    gfortran.dg/allocatable_function_8.f90   -O1  execution test
    gfortran.dg/allocatable_function_8.f90   -O2  execution test
    gfortran.dg/allocatable_function_8.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
    gfortran.dg/allocatable_function_8.f90   -O3 -g  execution test
    gfortran.dg/allocatable_function_8.f90   -Os  execution test
    gfortran.dg/generic_20.f90   -O0  execution test
    gfortran.dg/generic_20.f90   -O1  execution test
    gfortran.dg/generic_20.f90   -O2  execution test
    gfortran.dg/generic_20.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
    gfortran.dg/generic_20.f90   -O3 -g  execution test
    gfortran.dg/generic_20.f90   -Os  execution test
    gfortran.dg/matmul_6.f90   -O0  execution test
    gfortran.dg/matmul_bounds_6.f90   -O0  execution test
    gfortran.dg/matmul_bounds_6.f90   -O1  execution test
    gfortran.dg/matmul_bounds_6.f90   -O2  execution test
    gfortran.dg/matmul_bounds_6.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
    gfortran.dg/matmul_bounds_6.f90   -O3 -g  execution test
    gfortran.dg/matmul_bounds_6.f90   -Os  execution test
    gfortran.dg/operator_1.f90   -O0  execution test

  Executed from: gfortran.fortran-torture/execute/execute.exp
    gfortran.fortran-torture/execute/intrinsic_matmul.f90 execution,  -O0

and the new tests fail too:

  - FAIL appears              [     => FAIL]:

  Executed from: gfortran.dg/dg.exp
    gfortran.dg/matmul_15.f90   -O  execution test
    gfortran.dg/matmul_bounds_5.f90   -O0  output pattern test, is qemu: uncaught target signal 11 (Segmentation fault) - core dumped

Christophe

> Jerry
Re: [RFC][PATCH] Introduce -fdump*-folding
On 05/09/2017 02:16 PM, Richard Biener wrote: > On Tue, May 9, 2017 at 2:01 PM, Martin Liška wrote: >> On 05/05/2017 01:50 PM, Richard Biener wrote: >>> On Thu, May 4, 2017 at 1:10 PM, Martin Liška wrote: On 05/04/2017 12:40 PM, Richard Biener wrote: > > On Thu, May 4, 2017 at 11:22 AM, Martin Liška wrote: >> >> On 05/03/2017 12:12 PM, Richard Biener wrote: >>> >>> >>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška wrote: Hello Last release cycle I spent quite some time with reading of IVOPTS pass dump file. Using -fdump*-details causes to generate a lot of 'Applying pattern' lines, which can make reading of a dump file more complicated. There are stats for tramp3d with -O2 and -fdump-tree-all-details. Percentage number shows how many lines are of the aforementioned pattern: tramp3d-v4.cpp.164t.ivopts: 6.34% tramp3d-v4.cpp.091t.ccp2: 5.04% tramp3d-v4.cpp.093t.cunrolli: 4.41% tramp3d-v4.cpp.129t.laddress: 3.70% tramp3d-v4.cpp.032t.ccp1: 2.31% tramp3d-v4.cpp.038t.evrp: 1.90% tramp3d-v4.cpp.033t.forwprop1: 1.74% tramp3d-v4.cpp.103t.vrp1: 1.52% tramp3d-v4.cpp.124t.forwprop3: 1.31% tramp3d-v4.cpp.181t.vrp2: 1.30% tramp3d-v4.cpp.161t.cunroll: 1.22% tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% tramp3d-v4.cpp.153t.ivcanon: 1.07% tramp3d-v4.cpp.126t.ccp3: 0.96% tramp3d-v4.cpp.143t.sccp: 0.91% tramp3d-v4.cpp.185t.forwprop4: 0.82% tramp3d-v4.cpp.011t.cfg: 0.74% tramp3d-v4.cpp.096t.forwprop2: 0.50% tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% tramp3d-v4.cpp.120t.phicprop1: 0.33% tramp3d-v4.cpp.133t.pre: 0.32% tramp3d-v4.cpp.182t.phicprop2: 0.27% tramp3d-v4.cpp.170t.veclower21: 0.25% tramp3d-v4.cpp.029t.einline: 0.24% I'm suggesting to add new TDF that will be allocated for that. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Thoughts? >>> >>> >>> >>> Ok. Soon we'll want to change dump_flags to uint64_t ... (we have 1 >>> bit >>> left >>> if you allow negative dump_flags). 
It'll tickle down on a lot of >>> interfaces >>> so introducing dump_flags_t at the same time might be a good idea. >> >> >> >> Hello. >> >> I've prepared patch that migrates all interfaces and introduces >> dump_flags_t. > > > Great. > >> I've been >> currently testing that. Apart from that Richi requested to come up with >> more >> generic approach >> of hierarchical structure of options. > > > Didn't really "request" it, it's just something we eventually need to do > when > we run out of bits again ;) I know, but it was me who came up with the idea of more fine suboptions :) > >> >> Can you please take a look at self-contained source file that shows way >> I've >> decided to go? >> Another question is whether we want to implement also "aliases", where >> for >> instance >> current 'all' is equal to union of couple of suboptions? > > > Yeah, I think we do want -all-all-all and -foo-all to work. Not sure > about -all-foo-all. Actually only having 'all' is quite easy to implement. Let's imagine following hierarchy: (root) - vops - folding - gimple - ctor - array_ref - arithmetic - generic - c - c++ - ctor - xyz Then '-fdump-passname-folding-all' will be equal to '-fdump-passname-folding'. >>> >>> Ok, so you envision that sub-options restrict stuff. I thought of >>> >>> -gimple >>>-vops >>> -generic >>>-folding >>> >>> so the other way around. We do not have many options that would be RTL >>> specific but gimple only are -vops -alias -scev -gimple -rhs-only >>> -verbose -memsyms >>> while RTL has -cselib. -eh sounds gimple specific. Then there's the >>> optgroup >>> stuff you already saw. >>> >>> So it looks like a 8 bit "group id" plus 56 bits of flags would do. >>> >>> Yes, this implies reworking how & and | work. For example you c
Re: [RFC][PATCH] Introduce -fdump*-folding
On Tue, May 9, 2017 at 2:46 PM, Martin Liška wrote: > On 05/09/2017 02:16 PM, Richard Biener wrote: >> On Tue, May 9, 2017 at 2:01 PM, Martin Liška wrote: >>> On 05/05/2017 01:50 PM, Richard Biener wrote: On Thu, May 4, 2017 at 1:10 PM, Martin Liška wrote: > On 05/04/2017 12:40 PM, Richard Biener wrote: >> >> On Thu, May 4, 2017 at 11:22 AM, Martin Liška wrote: >>> >>> On 05/03/2017 12:12 PM, Richard Biener wrote: On Wed, May 3, 2017 at 10:10 AM, Martin Liška wrote: > > > Hello > > Last release cycle I spent quite some time with reading of IVOPTS pass > dump file. Using -fdump*-details causes to generate a lot of 'Applying > pattern' > lines, which can make reading of a dump file more complicated. > > There are stats for tramp3d with -O2 and -fdump-tree-all-details. > Percentage number > shows how many lines are of the aforementioned pattern: > > tramp3d-v4.cpp.164t.ivopts: 6.34% > tramp3d-v4.cpp.091t.ccp2: 5.04% > tramp3d-v4.cpp.093t.cunrolli: 4.41% > tramp3d-v4.cpp.129t.laddress: 3.70% > tramp3d-v4.cpp.032t.ccp1: 2.31% > tramp3d-v4.cpp.038t.evrp: 1.90% > tramp3d-v4.cpp.033t.forwprop1: 1.74% > tramp3d-v4.cpp.103t.vrp1: 1.52% > tramp3d-v4.cpp.124t.forwprop3: 1.31% > tramp3d-v4.cpp.181t.vrp2: 1.30% >tramp3d-v4.cpp.161t.cunroll: 1.22% > tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% >tramp3d-v4.cpp.153t.ivcanon: 1.07% > tramp3d-v4.cpp.126t.ccp3: 0.96% > tramp3d-v4.cpp.143t.sccp: 0.91% > tramp3d-v4.cpp.185t.forwprop4: 0.82% >tramp3d-v4.cpp.011t.cfg: 0.74% > tramp3d-v4.cpp.096t.forwprop2: 0.50% > tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% > tramp3d-v4.cpp.120t.phicprop1: 0.33% >tramp3d-v4.cpp.133t.pre: 0.32% > tramp3d-v4.cpp.182t.phicprop2: 0.27% > tramp3d-v4.cpp.170t.veclower21: 0.25% >tramp3d-v4.cpp.029t.einline: 0.24% > > I'm suggesting to add new TDF that will be allocated for that. > Patch can bootstrap on ppc64le-redhat-linux and survives regression > tests. > > Thoughts? Ok. Soon we'll want to change dump_flags to uint64_t ... 
(we have 1 bit left if you allow negative dump_flags). It'll tickle down on a lot of interfaces so introducing dump_flags_t at the same time might be a good idea. >>> >>> >>> >>> Hello. >>> >>> I've prepared patch that migrates all interfaces and introduces >>> dump_flags_t. >> >> >> Great. >> >>> I've been >>> currently testing that. Apart from that Richi requested to come up with >>> more >>> generic approach >>> of hierarchical structure of options. >> >> >> Didn't really "request" it, it's just something we eventually need to do >> when >> we run out of bits again ;) > > > I know, but it was me who came up with the idea of more fine suboptions :) > >> >>> >>> Can you please take a look at self-contained source file that shows way >>> I've >>> decided to go? >>> Another question is whether we want to implement also "aliases", where >>> for >>> instance >>> current 'all' is equal to union of couple of suboptions? >> >> >> Yeah, I think we do want -all-all-all and -foo-all to work. Not sure >> about -all-foo-all. > > > Actually only having 'all' is quite easy to implement. > > Let's imagine following hierarchy: > > (root) > - vops > - folding > - gimple > - ctor > - array_ref > - arithmetic > - generic > - c > - c++ > - ctor > - xyz > > Then '-fdump-passname-folding-all' will be equal to > '-fdump-passname-folding'. Ok, so you envision that sub-options restrict stuff. I thought of -gimple -vops -generic -folding so the other way around. We do not have many options that would be RTL specific but gimple only are -vops -alias -scev -gimple -rhs-only -verbose -memsyms while RTL has -cselib. -eh sounds gimple specific. Then there's the
[PATCH] non-checking pure attribute
Hi,
For name-lookup cleanup I introduced some accessors that are pure functions
when we're not checking.  I noticed we already had a couple of them, so
introduced an ATTRIBUTE_NTC_PURE define.  It avoids #ifndefs and stray
semicolons.

ok?

nathan
--
Nathan Sidwell

2017-05-09  Nathan Sidwell

	* system.h (ATTRIBUTE_NTC_PURE): New define.
	* tree.h (tree_fits_shwi_p, tree_fits_uhwi_p): Use it.

Index: system.h
===
--- system.h	(revision 247784)
+++ system.h	(working copy)
@@ -744,6 +744,13 @@ extern void fancy_abort (const char *, i
 #define gcc_checking_assert(EXPR) ((void)(0 && (EXPR)))
 #endif
 
+/* Some functions are pure when not tree checking.  */
+#ifdef ENABLE_TREE_CHECKING
+#define ATTRIBUTE_NTC_PURE
+#else
+#define ATTRIBUTE_NTC_PURE ATTRIBUTE_PURE
+#endif
+
 /* Use gcc_unreachable() to mark unreachable locations (like an
    unreachable default case of a switch.  Do not use gcc_assert(0).  */
 #if (GCC_VERSION >= 4005) && !ENABLE_ASSERT_CHECKING
Index: tree.h
===
--- tree.h	(revision 247784)
+++ tree.h	(working copy)
@@ -4109,15 +4109,9 @@ extern int attribute_list_contained (con
 extern int tree_int_cst_equal (const_tree, const_tree);
 
 extern bool tree_fits_shwi_p (const_tree)
-#ifndef ENABLE_TREE_CHECKING
-  ATTRIBUTE_PURE /* tree_fits_shwi_p is pure only when checking is disabled.  */
-#endif
-  ;
+  ATTRIBUTE_NTC_PURE;
 extern bool tree_fits_uhwi_p (const_tree)
-#ifndef ENABLE_TREE_CHECKING
-  ATTRIBUTE_PURE /* tree_fits_uhwi_p is pure only when checking is disabled.  */
-#endif
-  ;
+  ATTRIBUTE_NTC_PURE;
 extern HOST_WIDE_INT tree_to_shwi (const_tree);
 extern unsigned HOST_WIDE_INT tree_to_uhwi (const_tree);
 #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003)
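For readers outside the GCC tree, here is a self-contained sketch of the same pattern: a macro that expands to the pure attribute only when checking is compiled out. The names `SKETCH_CHECKING`, `SKETCH_ATTRIBUTE_PURE`, `SKETCH_NTC_PURE`, `node` and `node_fits_small_p` are hypothetical stand-ins, not GCC identifiers, and the attribute spelling assumes a GCC-compatible compiler.

```cpp
#include <cassert>

// Assumption: GCC/Clang attribute syntax; other compilers get an empty macro.
#if defined(__GNUC__)
#define SKETCH_ATTRIBUTE_PURE __attribute__ ((pure))
#else
#define SKETCH_ATTRIBUTE_PURE
#endif

// Hypothetical stand-in for ENABLE_TREE_CHECKING: when SKETCH_CHECKING is
// defined the accessor may abort, so it must not be declared pure.
#ifdef SKETCH_CHECKING
#define SKETCH_NTC_PURE
#else
#define SKETCH_NTC_PURE SKETCH_ATTRIBUTE_PURE
#endif

struct node
{
  int code;
  long value;
};

// One declaration, no #ifndef block and no stray semicolon.
extern bool node_fits_small_p (const node *n) SKETCH_NTC_PURE;

bool
node_fits_small_p (const node *n)
{
#ifdef SKETCH_CHECKING
  // The check is a side effect (a possible abort), which is why the
  // attribute is dropped in checking builds.
  assert (n->code == 1);
#endif
  return n->value >= -128 && n->value <= 127;
}
```

With the attribute in place, a call whose result is unused can be dropped by the compiler, which is the trade-off raised in the follow-up to this patch.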
[PATCH] GTY skip
gengtype doesn't understand about struct-scope for typedefs, and
consequently gets confused if multiple classes define the same-named
typedef.  The regular default_hash_traits is invisible to gengtype, so
it doesn't barf on the first specialization, only on the second -- which
is what I'll be adding soon.

This patch tells gengtype to move along, nothing to see here.

Applied as obvious.

nathan
--
Nathan Sidwell

2017-05-09  Nathan Sidwell

	* ipa-devirt.c (default_hash_traits): Skip struct-scope typedefs.

Index: ipa-devirt.c
===
--- ipa-devirt.c	(revision 247784)
+++ ipa-devirt.c	(working copy)
@@ -138,10 +138,11 @@ struct type_pair
 };
 
 template <>
-struct default_hash_traits <type_pair> : typed_noop_remove <type_pair>
+struct default_hash_traits <type_pair>
+  : typed_noop_remove <type_pair>
 {
-  typedef type_pair value_type;
-  typedef type_pair compare_type;
+  GTY((skip)) typedef type_pair value_type;
+  GTY((skip)) typedef type_pair compare_type;
   static hashval_t
   hash (type_pair p)
   {
[gomp5] Depend clause changes
Hi! In OpenMP 5.0, depend clause operands are location list items, which are either lvalue expressions, or array sections. In C++ it is easy to handle it with tentative parsing, in C I have to use similar hacks to what is used for declare simd. Tested on x86_64-linux, committed to gomp-5_0-branch. 2017-05-09 Jakub Jelinek c/ * c-parser.c (c_parser_omp_variable_list): For OMP_CLAUSE_DEPEND, parse clause operands as either an array section, or lvalue assignment expression. * c-typeck.c (c_finish_omp_clauses): Allow any lvalue as OMP_CLAUSE_DEPEND operand (besides array section), adjust diagnostics. cp/ * parser.c (cp_parser_omp_var_list_no_open): For OMP_CLAUSE_DEPEND, parse clause operands as either an array section, or lvalue assignment expression. * semantics.c (finish_omp_clauses): Allow any lvalue as OMP_CLAUSE_DEPEND operand (besides array section), adjust diagnostics. testsuite/ * c-c++-common/gomp/depend-5.c: New test. * c-c++-common/gomp/depend-6.c: New test. --- gcc/c/c-parser.c.jj 2017-05-04 15:27:33.306131900 +0200 +++ gcc/c/c-parser.c2017-05-09 14:05:46.874354097 +0200 @@ -10737,13 +10737,87 @@ c_parser_omp_variable_list (c_parser *pa location_t clause_loc, enum omp_clause_code kind, tree list) { - if (c_parser_next_token_is_not (parser, CPP_NAME) - || c_parser_peek_token (parser)->id_kind != C_ID_ID) + auto_vec tokens; + unsigned int tokens_avail = 0; + + if (kind != OMP_CLAUSE_DEPEND + && (c_parser_next_token_is_not (parser, CPP_NAME) + || c_parser_peek_token (parser)->id_kind != C_ID_ID)) c_parser_error (parser, "expected identifier"); - while (c_parser_next_token_is (parser, CPP_NAME) -&& c_parser_peek_token (parser)->id_kind == C_ID_ID) + while (kind == OMP_CLAUSE_DEPEND +|| (c_parser_next_token_is (parser, CPP_NAME) +&& c_parser_peek_token (parser)->id_kind == C_ID_ID)) { + bool array_section_p = false; + if (kind == OMP_CLAUSE_DEPEND) + { + if (c_parser_next_token_is_not (parser, CPP_NAME) + || c_parser_peek_token (parser)->id_kind != C_ID_ID) + { 
+ struct c_expr expr = c_parser_expr_no_commas (parser, NULL); + if (expr.value != error_mark_node) + { + tree u = build_omp_clause (clause_loc, kind); + OMP_CLAUSE_DECL (u) = expr.value; + OMP_CLAUSE_CHAIN (u) = list; + list = u; + } + + if (c_parser_next_token_is_not (parser, CPP_COMMA)) + break; + + c_parser_consume_token (parser); + continue; + } + + tokens.truncate (0); + unsigned int nesting_depth = 0; + while (1) + { + c_token *token = c_parser_peek_token (parser); + switch (token->type) + { + case CPP_EOF: + case CPP_PRAGMA_EOL: + break; + case CPP_OPEN_BRACE: + case CPP_OPEN_PAREN: + case CPP_OPEN_SQUARE: + ++nesting_depth; + goto add; + case CPP_CLOSE_BRACE: + case CPP_CLOSE_PAREN: + case CPP_CLOSE_SQUARE: + if (nesting_depth-- == 0) + break; + goto add; + case CPP_COMMA: + if (nesting_depth == 0) + break; + goto add; + default: + add: + tokens.safe_push (*token); + c_parser_consume_token (parser); + continue; + } + break; + } + + /* Make sure nothing tries to read past the end of the tokens. */ + c_token eof_token; + memset (&eof_token, 0, sizeof (eof_token)); + eof_token.type = CPP_EOF; + tokens.safe_push (eof_token); + tokens.safe_push (eof_token); + + tokens_avail = parser->tokens_avail; + gcc_assert (parser->tokens == &parser->tokens_buf[0]); + parser->tokens = tokens.address (); + parser->tokens_avail = tokens.length (); + } + tree t = lookup_name (c_parser_peek_token (parser)->value); if (t == NULL_TREE) @@ -10819,6 +10893,7 @@ c_parser_omp_variable_list (c_parser *pa t = error_mark_node; break; } + array_section_p = true; if (!c_parser_next_token_is (parser, CPP_CLOSE_SQUARE)) { location_t expr_loc @@ -10839,6 +10914,30 @@ c_parser_omp_variable_list (c_parser *pa t = tree_cons (low_bound, length, t); } + if (kind == OMP_CL
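The token-collection loop in the C parser change stops at the first comma or closing bracket that is not nested inside an opener. A minimal sketch of that scan, working on characters rather than `c_token`s and assuming C++14 `constexpr`; the function name `item_length` is hypothetical and the sketch ignores string literals and comments, which the real lexer has already dealt with:

```cpp
#include <cstddef>

// Return the length of the first list item in S: everything up to, but not
// including, the first ',' at nesting depth 0 or the first closing bracket
// that has no matching opener.  Stops at the terminating NUL otherwise.
constexpr std::size_t
item_length (const char *s)
{
  std::size_t depth = 0;
  std::size_t i = 0;
  for (; s[i] != '\0'; ++i)
    {
      switch (s[i])
        {
        case '(': case '[': case '{':
          ++depth;
          break;
        case ')': case ']': case '}':
          if (depth == 0)
            return i;		// unmatched closer ends the item
          --depth;
          break;
        case ',':
          if (depth == 0)
            return i;		// top-level comma ends the item
          break;
        default:
          break;
        }
    }
  return i;
}
```

For a depend operand list such as `a[i][0:n], b`, the scan covers `a[i][0:n]` and leaves the comma for the caller, which is the point at which the patch re-parses the saved tokens.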
Re: [PATCH] Fix bootstrap on arm target
> since a few days the bootstrap of ada fails on a native arm target.
>
> It is due to a -Werror warning when passing GNAT_EXCEPTION_CLASS
> which is a string constant to exception_class_eq, but C++ forbids to cast
> that to "char*".
>
> Not sure what is the smartest solution, I tried the following and it
> seems to work for x86_64-pc-linux-gnu and arm-linux-gnueabihf.
>
> Is it OK for trunk?

Patch looks OK to me FWIW.  Tristan?

> 2017-05-09  Bernd Edlinger
>
> 	* raise-gcc.c (exception_class_eq): Make ec parameter const.
>
> --- gcc/ada/raise-gcc.c.jj	2017-04-27 12:00:42.0 +0200
> +++ gcc/ada/raise-gcc.c	2017-05-09 09:45:59.557507045 +0200
> @@ -909,7 +909,8 @@
>  /* Return true iff the exception class of EXCEPT is EC.  */
>
>  static int
> -exception_class_eq (const _GNAT_Exception *except, _Unwind_Exception_Class ec)
> +exception_class_eq (const _GNAT_Exception *except,
> +                    const _Unwind_Exception_Class ec)
>  {
>  #ifdef __ARM_EABI_UNWINDER__
>    return memcmp (except->common.exception_class, ec, 8) == 0;
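A small sketch of why adding `const` to the parameter is enough, assuming (as the `memcmp (..., 8)` call suggests) that the exception class is an 8-character array type on this target. The names `exception_class_t`, `exception_header`, `class_eq` and `make_header`, and the class strings used in the test, are illustrative only.

```cpp
#include <cassert>
#include <cstring>

// Assumption: the class is an array typedef, as in the ARM EABI unwinder.
typedef char exception_class_t[8];	// hypothetical name

struct exception_header
{
  exception_class_t exception_class;
};

// A parameter of type `const exception_class_t` is adjusted to
// `const char *`, so a string literal can be passed without a cast that
// drops const.  Without the const the parameter would be plain `char *`.
static bool
class_eq (const exception_header *h, const exception_class_t ec)
{
  return std::memcmp (h->exception_class, ec, 8) == 0;
}

// Helper for the example: fill a header from an 8-byte class string.
static exception_header
make_header (const char *cls)
{
  exception_header h;
  std::memcpy (h.exception_class, cls, 8);
  return h;
}
```

The const applies to the array elements, not to the pointer the parameter decays to, which is exactly what a string-literal argument needs.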
Re: [c++ PATCH] PR c++/80682
On Tue, 9 May 2017, Ville Voutilainen wrote:

> Tested on Linux-x64, not tested with the full suite yet.
>
> 2017-05-09  Ville Voutilainen
>
>    gcc/
>
>    PR c++/80682
>    * cp/method.c (is_trivially_xible): Reject void types.
>
>    testsuite/
>
>    PR c++/80682
>    * g++.dg/ext/is_trivially_constructible1.C: Add tests for void target.

What happens for "const void" and other variants?

--
Marc Glisse
Re: [c++ PATCH] PR c++/80682
On 9 May 2017 at 16:12, Marc Glisse wrote:
> On Tue, 9 May 2017, Ville Voutilainen wrote:
>
>> Tested on Linux-x64, not tested with the full suite yet.
>>
>> 2017-05-09  Ville Voutilainen
>>
>>    gcc/
>>
>>    PR c++/80682
>>    * cp/method.c (is_trivially_xible): Reject void types.
>>
>>    testsuite/
>>
>>    PR c++/80682
>>    * g++.dg/ext/is_trivially_constructible1.C: Add tests for void target.
>
>
> What happens for "const void" and other variants?

They don't work. New patch attached.

diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index b4c1f60..9b17ef1 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -1207,6 +1207,8 @@ constructible_expr (tree to, tree from)
 bool
 is_trivially_xible (enum tree_code code, tree to, tree from)
 {
+  if (cv_unqualified (to) == void_type_node)
+    return false;
   tree expr;
   if (code == MODIFY_EXPR)
     expr = assignable_expr (to, from);
diff --git a/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C b/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C
index a5bac7b..175eae9 100644
--- a/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C
+++ b/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C
@@ -27,10 +27,18 @@ SA(!__is_trivially_constructible(C,C&));
 SA(__is_trivially_assignable(C,C&));
 SA(!__is_trivially_assignable(C,C));
 SA(!__is_trivially_assignable(C,C&&));
+SA(!__is_trivially_assignable(void,int));
+SA(!__is_trivially_assignable(const void,int));
+SA(!__is_trivially_assignable(volatile void,int));
+SA(!__is_trivially_assignable(const volatile void,int));
 
 SA(__is_trivially_constructible(int,int));
 SA(__is_trivially_constructible(int,double));
 SA(!__is_trivially_constructible(int,B));
+SA(!__is_trivially_constructible(void,int));
+SA(!__is_trivially_constructible(const void,int));
+SA(!__is_trivially_constructible(volatile void,int));
+SA(!__is_trivially_constructible(const volatile void,int));
 
 SA(!__is_trivially_constructible(D));
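The difference between the two versions of the check can be shown by analogy in standard C++, outside the compiler. This is only a sketch of the idea: `is_plain_void` and `is_any_void` are hypothetical names, and nothing here uses GCC's internal `tree` API.

```cpp
#include <type_traits>

// Analogue of the first patch: an exact comparison against void, which
// misses the cv-qualified variants.
template <typename T>
struct is_plain_void : std::is_same<T, void> {};

// Analogue of the revised patch: strip cv-qualifiers first, then compare.
template <typename T>
struct is_any_void
  : std::is_same<typename std::remove_cv<T>::type, void> {};

static_assert (is_plain_void<void>::value, "plain void is caught");
static_assert (!is_plain_void<const void>::value,
	       "const void slips through the exact comparison");

static_assert (is_any_void<void>::value, "void");
static_assert (is_any_void<const void>::value, "const void");
static_assert (is_any_void<volatile void>::value, "volatile void");
static_assert (is_any_void<const volatile void>::value, "const volatile void");
static_assert (!is_any_void<int>::value, "non-void types are unaffected");
```

The four `is_any_void` cases correspond to the four `void` variants exercised by the new testsuite lines.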
Re: [PATCH] non-checking pure attribute
On Tue, May 9, 2017 at 3:01 PM, Nathan Sidwell wrote:
> Hi,
> For name-lookup cleanup I introduced some accessors that are pure functions
> when we're not checking.  I noticed we already had a couple of them, so
> introduced an ATTRIBUTE_NTC_PURE define.  It avoid #ifndefs and stray
> semicolons.
>
> ok?

Ok, but ... are they not "pure" enough?  That is, do we really care
to preserve the checking side-effect for example when doing

  tree_fits_uhwi (t);

(result unused)?

> nathan
> --
> Nathan Sidwell
Re: [1/2] PR 78736: New warning -Wenum-conversion
ping https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00161.html Thanks, Prathamesh On 3 May 2017 at 11:30, Prathamesh Kulkarni wrote: > On 3 May 2017 at 03:28, Martin Sebor wrote: >> On 05/02/2017 11:11 AM, Prathamesh Kulkarni wrote: >>> >>> Hi, >>> The attached patch attempts to add option -Wenum-conversion for C and >>> objective-C similar to clang, which warns when an enum value of a type >>> is implicitly converted to enum value of another type and is enabled >>> by Wall. >> >> >> It seems quite useful. My only high-level concern is with >> the growing number of specialized warnings and options for each >> and their interaction. >> >> I've been working on -Wenum-assign patch that complains about >> assigning to an enum variables an integer constants that doesn't >> match any of the enumerators of the type. Testing revealed that >> the -Wenum-assign duplicated a subset of warnings already issued >> by -Wconversion enabled with -Wpedantic. I'm debating whether >> to suppress that part of -Wenum-assign altogether or only when >> -Wconversion and -Wpedantic are enabled. >> >> My point is that these dependencies tend to be hard to discover >> and understand, and the interactions tricky to get right (e.g., >> avoid duplicate warnings for similar but distinct problems). >> >> This is not meant to be a negative comment on your patch, but >> rather a comment about a general problem that might be worth >> starting to think about. >> >> One comment on the patch itself: >> >> + warning_at_rich_loc (&loc, 0, "implicit conversion from" >> + " enum type of %qT to %qT", checktype, type); >> >> Unlike C++, the C front end formats an enumerated type E using >> %qT as 'enum E' so the warning prints 'enum type of 'enum E'), >> duplicating the "enum" part. >> >> I would suggest to simplify that to: >> >> warning_at_rich_loc (&loc, 0, "implicit conversion from " >>"%qT to %qT", checktype, ... >> > Thanks for the suggestions. I have updated the patch accordingly. 
> Hmm the issue you pointed out of warnings interaction is indeed of concern. > I was wondering then if we should merge this warning with -Wconversion > instead of having a separate option -Wenum-conversion ? Although that will not > really help with your example below. >> Martin >> >> PS As an example to illustrate my concern above, consider this: >> >> enum __attribute__ ((packed)) E { e1 = 1 }; >> enum F { f256 = 256 }; >> >> enum E e = f256; >> >> It triggers -Woverflow: >> >> warning: large integer implicitly truncated to unsigned type [-Woverflow] >>enum E e = f256; >> ^~~~ >> >> also my -Wenum-assign: >> >> warning: integer constant ‘256’ converted to ‘0’ due to limited range [0, >> 255] of type ‘‘enum E’’ [-Wassign-enum] >>enum E e = f256; >> ^~~~ >> >> and (IIUC) will trigger your new -Wenum-conversion. > Yep, on my branch it triggered -Woverflow and -Wenum-conversion. > Running the example on clang shows a single warning, which they call > as -Wconstant-conversion, which > I suppose is similar to your -Wassign-enum. > > test-eg.c:3:12: warning: implicit conversion from 'int' to 'enum E' > changes value from 256 to 0 [-Wconstant-conversion] > enum E e = f256; >~ ^~~~ > > Thanks, > Prathamesh >> >> Martin
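For readers following the thread, the truncation in Martin's example can be reproduced with a small C sketch. It assumes GCC or Clang, where the packed attribute shrinks enum E to a single byte. The helper names narrow_to_E and demo_enum_truncation are made up for illustration and are not part of either patch.

```c
#include <assert.h>

/* Same declarations as in the example quoted in the thread.  */
enum __attribute__ ((packed)) E { e1 = 1 };
enum F { f256 = 256 };

/* Hypothetical helper, not part of any patch: the return statement
   performs an implicit enum F -> enum E conversion, which is the
   pattern the proposed -Wenum-conversion is meant to flag.  */
static enum E
narrow_to_E (enum F f)
{
  return f;
}

/* Assumes GCC or Clang, where the packed enum E occupies one byte,
   so the value 256 does not fit and wraps to 0.  */
static void
demo_enum_truncation (void)
{
  assert (sizeof (enum E) == 1);
  assert ((int) narrow_to_E (f256) == 0);
}
```

The constant-initializer form from the thread (enum E e = f256;) hits the same truncation at compile time, which is why several warnings overlap on that one line.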
Re: [2/2] PR 78736: libgomp fallout
ping https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00135.html

Thanks,
Prathamesh

On 2 May 2017 at 22:43, Prathamesh Kulkarni wrote:
> Hi,
> During gcc bootstrap, there's a couple of places where the warning got
> triggered.
> I suppose this wasn't a false positive since enum gomp_schedule_type
> and enum omp_sched_t
> are different types (although they have same set of values) ?
>
> Bootstrap+tested on x86_64-unknown-linux-gnu.
> Is this patch OK to commit ?
>
> Thanks,
> Prathamesh
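The situation described here, two distinct enum types that happen to share values, can be sketched in C as follows. The types internal_sched and public_sched are hypothetical stand-ins, not libgomp's real declarations. An explicit cast is one plausible way to state the intent and avoid the warning, but the message does not show whether the actual patch does this.

```c
#include <assert.h>

/* Hypothetical stand-ins for two enum types with matching values.  */
enum internal_sched { INTERNAL_STATIC = 1, INTERNAL_DYNAMIC = 2, INTERNAL_GUIDED = 3 };
enum public_sched { PUBLIC_STATIC = 1, PUBLIC_DYNAMIC = 2, PUBLIC_GUIDED = 3 };

/* The explicit cast documents that the cross-type conversion is
   intentional; without it the conversion would be implicit.  */
static enum public_sched
to_public (enum internal_sched kind)
{
  return (enum public_sched) kind;
}

/* The numeric values line up, so the cast preserves the meaning.  */
static void
demo_sched_cast (void)
{
  assert (to_public (INTERNAL_STATIC) == PUBLIC_STATIC);
  assert (to_public (INTERNAL_GUIDED) == PUBLIC_GUIDED);
}
```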
Re: [c++ PATCH] PR c++/80682
On Tue, May 09, 2017 at 04:17:07PM +0300, Ville Voutilainen wrote:
> On 9 May 2017 at 16:12, Marc Glisse wrote:
> > On Tue, 9 May 2017, Ville Voutilainen wrote:
> >
> >> Tested on Linux-x64, not tested with the full suite yet.
> >>
> >> 2017-05-09 Ville Voutilainen
> >>
> >>    gcc/
> >>
> >>    PR c++/80682
> >>    * cp/method.c (is_trivially_xible): Reject void types.

No cp/ in cp/ChangeLog entries.

> >>    testsuite/
> >>
> >>    PR c++/80682
> >>    * g++.dg/ext/is_trivially_constructible1.C: Add tests for void target.
> >
> >
> > What happens for "const void" and other variants?
>
>
> They don't work. New patch attached.

> diff --git a/gcc/cp/method.c b/gcc/cp/method.c
> index b4c1f60..9b17ef1 100644
> --- a/gcc/cp/method.c
> +++ b/gcc/cp/method.c
> @@ -1207,6 +1207,8 @@ constructible_expr (tree to, tree from)
>  bool
>  is_trivially_xible (enum tree_code code, tree to, tree from)
>  {
> +  if (cv_unqualified (to) == void_type_node)
> +    return false;

Can't this be checked more cheaply as
if (TYPE_MAIN_VARIANT (to) == void_type_node) ?

	Jakub
Re: [2/2] PR 78736: libgomp fallout
On Tue, May 09, 2017 at 06:55:12PM +0530, Prathamesh Kulkarni wrote:
> ping https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00135.html

The libgomp patch is ok provided the warning is added.
Though, should it be in -Wall and not say just -Wextra?

	Jakub
Re: [CHKP] Fix for PR79990
Hi, Here is the latest version of the patch with all comments addressed: gcc/ChangeLog: 2017-05-09 Alexander Ivchenko * tree-chkp.c (chkp_get_hard_register_var_fake_base_address): New function. (chkp_get_hard_register_fake_addr_expr): Ditto. (chkp_build_addr_expr): Add check for hard reg case. (chkp_parse_array_and_component_ref): Ditto. (chkp_find_bounds_1): Ditto. (chkp_process_stmt): Don't generate bounds store for hard reg case. gcc/testsuite/ChangeLog: 2017-05-09 Alexander Ivchenko * gcc.target/i386/mpx/hard-reg-2-lbv.c: New test. * gcc.target/i386/mpx/hard-reg-2-nov.c: New test. * gcc.target/i386/mpx/hard-reg-2-ubv.c: New test. * gcc.target/i386/mpx/hard-reg-3-1-lbv.c: New test. * gcc.target/i386/mpx/hard-reg-3-1-nov.c: New test. * gcc.target/i386/mpx/hard-reg-3-1-ubv.c: New test. * gcc.target/i386/mpx/hard-reg-3-2-lbv.c: New test. * gcc.target/i386/mpx/hard-reg-3-2-nov.c: New test. * gcc.target/i386/mpx/hard-reg-3-2-ubv.c: New test. * gcc.target/i386/mpx/hard-reg-3-lbv.c: New test. * gcc.target/i386/mpx/hard-reg-3-nov.c: New test. * gcc.target/i386/mpx/hard-reg-3-ubv.c: New test. * gcc.target/i386/mpx/hard-reg-4-1-lbv.c: New test. * gcc.target/i386/mpx/hard-reg-4-1-nov.c: New test. * gcc.target/i386/mpx/hard-reg-4-1-ubv.c: New test. * gcc.target/i386/mpx/hard-reg-4-2-lbv.c: New test. * gcc.target/i386/mpx/hard-reg-4-2-nov.c: New test. * gcc.target/i386/mpx/hard-reg-4-2-ubv.c: New test. 
diff --git a/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-lbv.c b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-lbv.c new file mode 100644 index 000..319e1ec --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-lbv.c @@ -0,0 +1,21 @@ +/* { dg-do run } */ +/* { dg-shouldfail "bounds violation" } */ +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */ + + +#define SHOULDFAIL + +#include "mpx-check.h" + +typedef int v16 __attribute__((vector_size(16))); + +int foo(int i) { + register v16 u asm("xmm0"); + return u[i]; +} + +int mpx_test (int argc, const char **argv) +{ + printf ("%d\n", foo (-1)); + return 0; +} diff --git a/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-nov.c b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-nov.c new file mode 100644 index 000..3c6d39a --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-nov.c @@ -0,0 +1,18 @@ +/* { dg-do run } */ +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */ + +#include "mpx-check.h" + +typedef int v16 __attribute__((vector_size(16))); + +int foo (int i) { + register v16 u asm ("xmm0"); + return u[i]; +} + +int mpx_test (int argc, const char **argv) +{ + printf ("%d\n", foo (3)); + printf ("%d\n", foo (0)); + return 0; +} diff --git a/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-ubv.c b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-ubv.c new file mode 100644 index 000..7fe76c4 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-ubv.c @@ -0,0 +1,21 @@ +/* { dg-do run } */ +/* { dg-shouldfail "bounds violation" } */ +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */ + + +#define SHOULDFAIL + +#include "mpx-check.h" + +typedef int v16 __attribute__((vector_size(16))); + +int foo (int i) { + register v16 u asm ("xmm0"); + return u[i]; +} + +int mpx_test (int argc, const char **argv) +{ + printf ("%d\n", foo (5)); + return 0; +} diff --git a/gcc/testsuite/gcc.target/i386/mpx/hard-reg-3-1-lbv.c b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-3-1-lbv.c new file mode 100644 index 
000..7e4451f --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-3-1-lbv.c @@ -0,0 +1,33 @@ +/* { dg-do run } */ +/* { dg-shouldfail "bounds violation" } */ +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */ + + +#define SHOULDFAIL + +#include "mpx-check.h" + +typedef int v8 __attribute__ ((vector_size (8))); + +struct S1 +{ + v8 s1f; +}; + +struct S2 +{ + struct S1 s2f1; + v8 s2f2; +}; + +int foo_s2f1 (int i) +{ + register struct S2 b asm ("xmm0"); + return b.s2f1.s1f[i]; +} + +int mpx_test (int argc, const char **argv) +{ + printf ("%d\n", foo_s2f1 (-1)); + return 0; +} diff --git a/gcc/testsuite/gcc.target/i386/mpx/hard-reg-3-1-nov.c b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-3-1-nov.c new file mode 100644 index 000..73bd7fb --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-3-1-nov.c @@ -0,0 +1,30 @@ +/* { dg-do run } */ +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */ + + +#include "mpx-check.h" + +typedef int v8 __attribute__ ((vector_size (8))); + +struct S1 +{ + v8 s1f; +}; + +struct S2 +{ + struct S1 s2f1; + v8 s2f2; +}; + +int foo_s2f1 (int i) +{ + register struct S2 b asm ("xmm0"); + return b.s2f1.s1f[i]; +} + +int mpx_test (int argc, const char **argv) +{ + printf ("%d\n", foo_s2f1 (0)); + return 0; +} diff --git a/gcc/testsuite/gcc.target/i386/mpx/hard-reg-3-1-ubv.c b/gcc/
Re: [PATCH] non-checking pure attribute
On 05/09/2017 09:19 AM, Richard Biener wrote:
> On Tue, May 9, 2017 at 3:01 PM, Nathan Sidwell wrote:

> Ok, but ... are they not "pure" enough?  That is, do we really care to
> preserve the checking side-effect for example when doing
>
>    tree_fits_uhwi (t);
>
> (result unused)?

I wondered about that.  More specifically:
  if (tree_fits_uhwi (t)) { bool fits = tree_fits_uhwi (t) ...}

I wondered if we'd get sane backtraces and what not, if the optimizer
thought such functions never barfed.

If you're fine with unconditionally saying pure, that works for me.

nathan

--
Nathan Sidwell
Re: [c++ PATCH] PR c++/80682
On 9 May 2017 at 16:25, Jakub Jelinek wrote: >> >> 2017-05-09 Ville Voutilainen >> >> >> >>gcc/ >> >> >> >>PR c++/80682 >> >>* cp/method.c (is_trivially_xible): Reject void types. > > No cp/ in cp/ChangeLog entries. So perhaps 2017-05-09 Ville Voutilainen cp/ PR c++/80682 * method.c (is_trivially_xible): Reject void types. testsuite/ PR c++/80682 * g++.dg/ext/is_trivially_constructible1.C: Add tests for void target. >> + if (cv_unqualified (to) == void_type_node) >> +return false; > > Can't this be checked more cheaply as if (TYPE_MAIN_VARIANT (to) == > void_type_node) ? Yes. A new patch attached. diff --git a/gcc/cp/method.c b/gcc/cp/method.c index b4c1f60..31ed141 100644 --- a/gcc/cp/method.c +++ b/gcc/cp/method.c @@ -1207,6 +1207,8 @@ constructible_expr (tree to, tree from) bool is_trivially_xible (enum tree_code code, tree to, tree from) { + if (TYPE_MAIN_VARIANT (to) == void_type_node) +return false; tree expr; if (code == MODIFY_EXPR) expr = assignable_expr (to, from); diff --git a/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C b/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C index a5bac7b..175eae9 100644 --- a/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C +++ b/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C @@ -27,10 +27,18 @@ SA(!__is_trivially_constructible(C,C&)); SA(__is_trivially_assignable(C,C&)); SA(!__is_trivially_assignable(C,C)); SA(!__is_trivially_assignable(C,C&&)); +SA(!__is_trivially_assignable(void,int)); +SA(!__is_trivially_assignable(const void,int)); +SA(!__is_trivially_assignable(volatile void,int)); +SA(!__is_trivially_assignable(const volatile void,int)); SA(__is_trivially_constructible(int,int)); SA(__is_trivially_constructible(int,double)); SA(!__is_trivially_constructible(int,B)); +SA(!__is_trivially_constructible(void,int)); +SA(!__is_trivially_constructible(const void,int)); +SA(!__is_trivially_constructible(volatile void,int)); +SA(!__is_trivially_constructible(const volatile void,int)); 
SA(!__is_trivially_constructible(D));
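As background on why the TYPE_MAIN_VARIANT form is the cheaper check: every cv-qualified variant of a type records its unqualified "main variant", so the test reduces to one pointer comparison with no need to construct or look up an unqualified type first. The C sketch below is a toy model of that idea only. The struct toy_type and all toy_* names are hypothetical and are not GCC's actual tree representation.

```c
#include <assert.h>

/* Toy qualifier bits; hypothetical, for illustration only.  */
enum toy_quals { TOY_CONST = 1, TOY_VOLATILE = 2 };

/* Toy type node: each variant points at its unqualified main variant.  */
struct toy_type
{
  const char *name;
  unsigned quals;
  const struct toy_type *main_variant;
};

/* An unqualified type is its own main variant.  */
static const struct toy_type toy_void = { "void", 0, &toy_void };
static const struct toy_type toy_const_void = { "void", TOY_CONST, &toy_void };
static const struct toy_type toy_cv_void
  = { "void", TOY_CONST | TOY_VOLATILE, &toy_void };
static const struct toy_type toy_int = { "int", 0, &toy_int };

/* Analogue of the suggested check: a single pointer comparison that
   covers void, const void, volatile void and const volatile void.  */
static int
toy_is_void (const struct toy_type *t)
{
  return t->main_variant == &toy_void;
}

static void
demo_main_variant (void)
{
  assert (toy_is_void (&toy_void));
  assert (toy_is_void (&toy_const_void));
  assert (toy_is_void (&toy_cv_void));
  assert (!toy_is_void (&toy_int));
}
```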
[PATCH 2/2] (v2) Use %qH and %qI throughout C++ frontend
Changed in v2: use %qH and %qI rather than %H and %I. This is the second half of the kit, which uses %qH and %qI throughout the C++ frontend whenever describing type mismatches between a pair of %qT. gcc/cp/ChangeLog: * call.c (print_conversion_rejection): Replace pairs of %qT with %qH and %qI in various places. (build_user_type_conversion_1): Likewise. (build_integral_nontype_arg_conv): Likewise. (build_conditional_expr_1): Likewise. (convert_like_real): Likewise. (convert_arg_to_ellipsis): Likewise. (joust): Likewise. (initialize_reference): Likewise. * cvt.c (cp_convert_to_pointer): Likewise. (cp_convert_to_pointer): Likewise. (convert_to_reference): Likewise. (ocp_convert): Likewise. * typeck.c (cp_build_binary_op): Likewise. (convert_member_func_to_ptr): Likewise. (build_reinterpret_cast_1): Likewise. (convert_for_assignment): Likewise. * typeck2.c (check_narrowing): Likewise. --- gcc/cp/call.c| 38 +++--- gcc/cp/cvt.c | 18 +- gcc/cp/typeck.c | 22 +++--- gcc/cp/typeck2.c | 6 +++--- 4 files changed, 42 insertions(+), 42 deletions(-) diff --git a/gcc/cp/call.c b/gcc/cp/call.c index f1b6bf4..a7882e6 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -3402,7 +3402,7 @@ print_conversion_rejection (location_t loc, struct conversion_info *info) from); else inform (loc, " no known conversion for implicit " - "% parameter from %qT to %qT", + "% parameter from %qH to %qI", from, info->to_type); } else if (!TYPE_P (info->from)) @@ -3415,10 +3415,10 @@ print_conversion_rejection (location_t loc, struct conversion_info *info) } else if (info->n_arg == -2) /* Conversion of conversion function return value failed. 
*/ -inform (loc, " no known conversion from %qT to %qT", +inform (loc, " no known conversion from %qH to %qI", from, info->to_type); else -inform (loc, " no known conversion for argument %d from %qT to %qT", +inform (loc, " no known conversion for argument %d from %qH to %qI", info->n_arg + 1, from, info->to_type); } @@ -3925,7 +3925,7 @@ build_user_type_conversion_1 (tree totype, tree expr, int flags, { if (complain & tf_error) { - error ("conversion from %qT to %qT is ambiguous", + error ("conversion from %qH to %qI is ambiguous", fromtype, totype); print_z_candidates (location_of (expr), candidates); } @@ -4052,7 +4052,7 @@ build_integral_nontype_arg_conv (tree type, tree expr, tsubst_flags_t complain) break; if (complain & tf_error) - error_at (loc, "conversion from %qT to %qT not considered for " + error_at (loc, "conversion from %qH to %qI not considered for " "non-type template argument", t, type); /* fall through. */ @@ -4833,14 +4833,14 @@ build_conditional_expr_1 (location_t loc, tree arg1, tree arg2, tree arg3, if (unsafe_conversion_p (loc, stype, arg2, false)) { if (complain & tf_error) - error_at (loc, "conversion of scalar %qT to vector %qT " + error_at (loc, "conversion of scalar %qH to vector %qI " "involves truncation", arg2_type, vtype); return error_mark_node; } if (unsafe_conversion_p (loc, stype, arg3, false)) { if (complain & tf_error) - error_at (loc, "conversion of scalar %qT to vector %qT " + error_at (loc, "conversion of scalar %qH to vector %qI " "involves truncation", arg3_type, vtype); return error_mark_node; } @@ -5229,7 +5229,7 @@ build_conditional_expr_1 (location_t loc, tree arg1, tree arg2, tree arg3, arg3_type); if (complain & tf_warning) do_warn_double_promotion (result_type, arg2_type, arg3_type, - "implicit conversion from %qT to %qT to " + "implicit conversion from %qH to %qI to " "match other result of conditional", loc); @@ -6603,7 +6603,7 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum, from 
std::nullptr_t requires direct-initialization. */ if (NULLPTR_TYPE_P (TREE_TYPE (expr)) && TREE_CODE (totype) == BOOLEAN_TYPE) - complained = permerror (loc, "converting to %qT from %qT requires " + complained = permerror (loc, "converting to %qH from %qI requires " "direct-initialization", totype, TREE
[PATCH 1/2] (v2) C++ template type diff printing
Changes in v2:

- pass "quote" as an extra bool argument to the pp_format_decoder
  callback, so that we can optionally quote the delayed chunks
  (with %qH and %qI)

- use %qH and %qI rather than %H and %I.

- use CLASS_TYPE_P.

- use TYPE_P rather than EXPR_P in arg_to_string to handle non-type
  template arguments; added test cases for this

- in type_to_string_with_compare, drop the "foo_a" and "foo_b" naming
  convention in favor of "foo" vs "foo_peer", since here we could be
  dealing with either %H or %I, either way around.  Remove logic for
  returning NULL types, and clarify behavior for handling of mixtures
  of default and non-default args, adding test coverage for this.
  For example, given:

    template struct s {};
    void takes_s (s<> );

  then:

    takes_s (s<0, 2>());

  is reported as:

    can't convert from 's<[...],2>' to 's<[...], 1>'

  highlighting the "2" and "1", rather than

    can't convert from 's<0,2>' to 's<>'

  since these are the arguments of interest.  The template tree
  comparison for this case is printed as:

    s<
      [...]
      [2 != 1]>

- renamed "defer_half_of_type_diff" to "defer_phase_2_of_type_diff",
  and rewrite to avoid a switch statement

Would it make sense to rename "m_type_a"/"m_type_b" to
"m_type_H"/"m_type_I"?  (we normally don't go for uppercase in
fieldnames, but given the codes are case-sensitive, does it make
sense here?)

Successfully bootstrapped&regrtested the combination of the two
patches on x86_64-pc-linux-gnu.

gcc/c-family/ChangeLog:
	* c-format.c (gcc_cxxdiag_char_table): Add 'H' and 'I' to
	format_chars.
	* c.opt (fdiagnostics-show-template-tree): New option.
	(felide-type): New option.
	* c-format.c (static): Likewise.

gcc/c/ChangeLog:
	* c-objc-common.c (c_tree_printer): Gain bool and const char **
	parameters.

gcc/cp/ChangeLog:
	* call.c (perform_implicit_conversion_flags): Convert
	"from %qT to %qT" to "from %qH to %qI" in diagnostic.
	* error.c (struct deferred_printed_type): New struct.
	(class cxx_format_postprocessor): New class.
(cxx_initialize_diagnostics): Wire up a cxx_format_postprocessor to pp->m_format_postprocessor. (comparable_template_types_p): New function. (newline_and_indent): New function. (arg_to_string): New function. (print_nonequal_arg): New function. (type_to_string_with_compare): New function. (print_template_tree_comparison): New function. (append_formatted_chunk): New function. (add_quotes): New function. (cxx_format_postprocessor::handle): New function. (defer_phase_2_of_type_diff): New function. (cp_printer): Add "quoted" and "buffer_ptr" params. Implement %H and %I. gcc/ChangeLog: * diagnostic-color.c (color_dict): Add "type-diff". (parse_gcc_colors): Update comment. * doc/invoke.texi (Diagnostic Message Formatting Options): Add -fdiagnostics-show-template-tree and -fno-elide-type. (GCC_COLORS): Add type-diff to example. (type-diff=): New. (-fdiagnostics-show-template-tree): New. (-fno-elide-type): New. * pretty-print.c (pp_format): Pass formatters[argno] to the pp_format_decoder callback. Call any m_format_postprocessor's "handle" method. (pretty_printer::pretty_printer): Initialize m_format_postprocessor. (pretty_printer::~pretty_printer): Delete any m_format_postprocessor. * pretty-print.h (printer_fn): Add bool and const char ** parameters. (class format_postprocessor): New class. (struct pretty_printer::format_decoder): Document the new parameters. (struct pretty_printer::m_format_postprocessor): New field. * tree-diagnostic.c (default_tree_printer): Update for new bool and const char ** params. * tree-diagnostic.h (default_tree_printer): Likewise. gcc/fortran/ChangeLog: * error.c (gfc_format_decoder): Update for new bool and const char ** params. gcc/testsuite/ChangeLog: * g++.dg/plugin/plugin.exp (plugin_test_list): Add... * g++.dg/plugin/show-template-tree-color-no-elide-type.C: New test case. * g++.dg/plugin/show-template-tree-color.C: New test case. * g++.dg/plugin/show_template_tree_color_plugin.c: New plugin. 
* g++.dg/template/show-template-tree-2.C: New test case. * g++.dg/template/show-template-tree-3.C: New test case. * g++.dg/template/show-template-tree-4.C: New test case. * g++.dg/template/show-template-tree-no-elide-type.C: New test case. * g++.dg/template/show-template-tree.C: New test case. --- gcc/c-family/c-format.c| 2 +- gcc/c-family/c.opt | 8 + gcc/c/c-objc-common.c | 5 +- gcc/cp/call.c | 2 +- gcc/cp/error.c
Re: [PATCH] non-checking pure attribute
On Tue, May 9, 2017 at 3:33 PM, Nathan Sidwell wrote:
> On 05/09/2017 09:19 AM, Richard Biener wrote:
>>
>> On Tue, May 9, 2017 at 3:01 PM, Nathan Sidwell wrote:
>
>
>> Ok, but ... are they not "pure" enough?  That is, do we really care to
>> preserve
>> the checking side-effect for example when doing
>>
>>    tree_fits_uhwi (t);
>>
>> (result unused)?
>
>
> I wondered about that.  More specifically:
>   if (tree_fits_uhwi (t)) { bool fits = tree_fits_uhwi (t) ...}
>
> I wondered if we'd get sane backtraces and what not, if the optimizer
> thought such functions never barfed.

Well, I think you'd either ICE in the first check or can safely CSE
the second.

> If you're fine with unconditionally saying pure, that works for me.

I'd be ok with that.

> nathan
>
> --
> Nathan Sidwell
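The CSE point can be illustrated with a small C sketch of the pattern Nathan describes. The predicate fits_in_byte is a hypothetical stand-in for something like tree_fits_uhwi and does not reproduce GCC's checking machinery. The only real feature relied on is the GCC/Clang pure attribute, which promises that the function has no side effects beyond its return value.

```c
#include <assert.h>

/* Hypothetical stand-in predicate.  Declaring it pure tells the
   compiler that repeated calls with the same argument (and unchanged
   memory) yield the same result.  */
static int fits_in_byte (long value) __attribute__ ((pure));

static int
fits_in_byte (long value)
{
  return value >= 0 && value <= 255;
}

/* Mirrors the shape from the thread: test once, then call again.  */
static int
check_twice (long value)
{
  if (fits_in_byte (value))
    {
      /* Because the function is pure, the optimizer is allowed to
         reuse the result of the first call instead of calling again.  */
      int fits = fits_in_byte (value);
      return fits;
    }
  return 0;
}

static void
demo_pure (void)
{
  assert (check_twice (200) == 1);
  assert (check_twice (300) == 0);
}
```

If the first call would abort (the ICE case mentioned above), the second call is never reached, so merging the two calls does not change observable behavior.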
C++ PATCH for c++/70167, array prvalue treated as lvalue
The issue here was that we've been trying to treat C++ list-initialized temporaries (which are rvalues) the same as C99 compound literals (which are lvalues). This patch distinguishes between them so we can treat them each correctly. This leaves open the question of how compound literals ought to work in C++; currently we mostly treat them as rvalues, like other temporaries, except for constant arrays, which we treat as lvalues, as in C. Now that we distinguish between C and C++ syntax for list-initialization of a temporary object, we could go back to treating all C-style compound literals as lvalues for better C compatibility. Or we could decide that following the C++ temporary model consistently is more important. But this patch doesn't do either. Tested x86_64-pc-linux-gnu, applying to trunk. commit 929e7c870d235fea4a475928001f073b76829580 Author: Jason Merrill Date: Mon Apr 17 10:30:15 2017 -0400 PR c++/70167 - array prvalue treated as lvalue * cp-tree.h (CONSTRUCTOR_C99_COMPOUND_LITERAL): New. (enum fcl_t): New. * semantics.c (finish_compound_literal): Add fcl_context parameter. Only make a static variable for C99 syntax. * parser.c (cp_parser_postfix_expression): Pass it. * pt.c (tsubst_copy_and_build): Likewise. * call.c (extend_ref_init_temps): Set DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P. 
diff --git a/gcc/cp/call.c b/gcc/cp/call.c index d9accd1..dee236e 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -10516,6 +10516,9 @@ extend_ref_init_temps (tree decl, tree init, vec **cleanups) FOR_EACH_VEC_SAFE_ELT (elts, i, p) p->value = extend_ref_init_temps (decl, p->value, cleanups); } + recompute_constructor_flags (ctor); + if (decl_maybe_constant_var_p (decl) && TREE_CONSTANT (ctor)) + DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (decl) = true; } } diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index b64fa6d..100f85c 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -369,6 +369,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX]; DECL_NON_TRIVIALLY_INITIALIZED_P (in VAR_DECL) CALL_EXPR_ORDERED_ARGS (in CALL_EXPR, AGGR_INIT_EXPR) DECLTYPE_FOR_REF_CAPTURE (in DECLTYPE_TYPE) + CONSTUCTOR_C99_COMPOUND_LITERAL (in CONSTRUCTOR) 4: TREE_HAS_CONSTRUCTOR (in INDIRECT_REF, SAVE_EXPR, CONSTRUCTOR, CALL_EXPR, or FIELD_DECL). IDENTIFIER_TYPENAME_P (in IDENTIFIER_NODE) @@ -3898,6 +3899,11 @@ more_aggr_init_expr_args_p (const aggr_init_expr_arg_iterator *iter) #define CONSTRUCTOR_MUTABLE_POISON(NODE) \ (TREE_LANG_FLAG_2 (CONSTRUCTOR_CHECK (NODE))) +/* True if this typed CONSTRUCTOR represents C99 compound-literal syntax rather + than C++11 functional cast syntax. */ +#define CONSTRUCTOR_C99_COMPOUND_LITERAL(NODE) \ + (TREE_LANG_FLAG_3 (CONSTRUCTOR_CHECK (NODE))) + #define DIRECT_LIST_INIT_P(NODE) \ (BRACE_ENCLOSED_INITIALIZER_P (NODE) && CONSTRUCTOR_IS_DIRECT_INIT (NODE)) @@ -6483,7 +6489,10 @@ extern tree finish_this_expr (void); extern tree finish_pseudo_destructor_expr (tree, tree, tree, location_t); extern cp_expr finish_unary_op_expr(location_t, enum tree_code, cp_expr, tsubst_flags_t); -extern tree finish_compound_literal(tree, tree, tsubst_flags_t); +/* Whether this call to finish_compound_literal represents a C++11 functional + cast or a C99 compound literal. 
*/ +enum fcl_t { fcl_functional, fcl_c99 }; +extern tree finish_compound_literal(tree, tree, tsubst_flags_t, fcl_t = fcl_functional); extern tree finish_fname (tree); extern void finish_translation_unit(void); extern tree finish_template_type_parm (tree, tree); diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index ab56f12..1951452 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -6770,7 +6770,7 @@ cp_parser_postfix_expression (cp_parser *parser, bool address_p, bool cast_p, /* Form the representation of the compound-literal. */ postfix_expression = finish_compound_literal (type, initializer, -tf_warning_or_error); +tf_warning_or_error, fcl_c99); postfix_expression.set_location (initializer.get_location ()); break; } @@ -26834,7 +26834,7 @@ cp_parser_functional_cast (cp_parser* parser, tree type) type = TREE_TYPE (type); cast = finish_compound_literal (type, expression_list, - tf_warning_or_error); + tf_warning_or_error, fcl_functional); /* Create a location of the form: type_name{i, f} ^~~ diff --git a/gcc/
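For reference, the C side of the distinction drawn in this message can be shown in plain C99, where a compound literal is an lvalue with the lifetime of its enclosing block. The sketch below only demonstrates the C semantics, and the function names are illustrative. It says nothing about how the C++ front end treats the same syntax after the patch.

```c
#include <assert.h>

struct point { int x, y; };

/* An array compound literal is an lvalue: it can decay to a pointer
   and its elements can be modified while the block is live.  */
static int
sum_first_two (void)
{
  int *p = (int[]) { 1, 2, 3 };
  p[0] = 10;
  return p[0] + p[1];
}

/* Taking the address of a compound literal is valid C99 for the same
   reason; that would not be allowed for an rvalue temporary.  */
static int
address_of_literal (void)
{
  struct point *pt = &(struct point) { 3, 4 };
  pt->x += 1;
  return pt->x + pt->y;
}

static void
demo_compound_literals (void)
{
  assert (sum_first_two () == 12);
  assert (address_of_literal () == 8);
}
```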
Re: [PATCH] Improve vectorizer peeling for alignment costmodel
On Fri, 5 May 2017, Christophe Lyon wrote: > Hi Richard, > > > On 3 May 2017 at 10:19, Richard Biener wrote: > > > > The following extends the very simplistic cost modeling I added somewhen > > late in the release process to, for all unknown misaligned refs, also > > apply this model for loops containing stores. > > > > The model basically says it's useless to peel for alignment if there's > > only a single DR that is affected or if, in case we'll end up using > > hw-supported misaligned loads, the cost of misaligned loads is the same > > as of aligned ones. Previously we'd usually align one of the stores > > with the theory that this improves (precious) store-bandwith. > > > > Note this is only a so slightly conservative (aka less peeling). We'll > > still apply peeling for alignment if you make the testcase use += > > because then we'll align both the load and the store from v1. > > > > Bootstrap / regtest running on x86_64-unknown-linux-gnu. > > > > Richard. > > > > 2017-05-03 Richard Biener > > > > * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): > > When all DRs have unknown misaligned do not always peel > > when there is a store but apply the same costing model as if > > there were only loads. > > > > * gcc.dg/vect/costmodel/x86_64/costmodel-alignpeel.c: New testcase. 
> >
>
> This patch (r247544) caused regressions on aarch64 and arm:
> - PASS now FAIL [PASS => FAIL]:
>
> Executed from: gcc.dg/vect/vect.exp
> gcc.dg/vect/vect-44.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "Alignment of access forced using peeling" 1
> gcc.dg/vect/vect-44.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "Vectorizing an unaligned access" 2
> gcc.dg/vect/vect-44.c scan-tree-dump-times vect "Alignment of
> access forced using peeling" 1
> gcc.dg/vect/vect-44.c scan-tree-dump-times vect "Vectorizing an
> unaligned access" 2
> gcc.dg/vect/vect-50.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "Alignment of access forced using peeling" 1
> gcc.dg/vect/vect-50.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "Vectorizing an unaligned access" 2
> gcc.dg/vect/vect-50.c scan-tree-dump-times vect "Alignment of
> access forced using peeling" 1
> gcc.dg/vect/vect-50.c scan-tree-dump-times vect "Vectorizing an
> unaligned access" 2

Ok, so the reason is that we no longer peel for alignment for

  for (i = 0; i < N; i++)
    {
      pa[i] = pb[i] * pc[i];
    }

which is probably good.  This is because the generic aarch64 cost model
(and probably also arm) has

  1, /* vec_align_load_cost */
  1, /* vec_unalign_load_cost */
  1, /* vec_unalign_store_cost */
  1, /* vec_store_cost */

so there's no benefit in aligning.  x86 generic tuning has

  1,/* vec_align_load_cost.  */
  2,/* vec_unalign_load_cost.  */
  1,/* vec_store_cost.  */

and vec_unalign_store_cost sharing with vec_unalign_load_cost.
That makes us still apply peeling.

Fixing this with vect_ testsuite conditions is going to be tricky
so the easiest is to simply disable peeling here.

Tested on aarch64 and x86_64, committed.

Richard.

2017-05-09  Richard Biener

	* gcc.dg/vect/vect-44.c: Add --param vect-max-peeling-for-alignment=0
	and adjust.
	* gcc.dg/vect/vect-50.c: Likewise.
Index: gcc/testsuite/gcc.dg/vect/vect-44.c === --- gcc/testsuite/gcc.dg/vect/vect-44.c (revision 247782) +++ gcc/testsuite/gcc.dg/vect/vect-44.c (working copy) @@ -1,6 +1,7 @@ +/* { dg-do compile } */ /* { dg-require-effective-target vect_float } */ +/* { dg-additional-options "--param vect-max-peeling-for-alignment=0" } */ -#include #include "tree-vect.h" #define N 256 @@ -65,7 +66,7 @@ int main (void) two loads to be aligned). */ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */ -/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */ -/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align && { ! vect_hw_misalign } } || {! vector_alignment_reachable} } } } } */ +/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */ +/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail { { vect_no_align && { ! vect_hw_misalign } } || {! vector_alignment_reachable} } } } } */ /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target { vect_no_align && { ! vect_hw_misalign } } } } } */ /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable}
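For anyone wanting to experiment with the peeling decision discussed in this message, the loop in question can be isolated in a few lines of C. This is a sketch rather than the vect-44.c test itself: the function names, the restrict qualifiers and the driver are assumptions made for illustration. The pointer parameters keep the alignment of the three references unknown to the compiler, which is the case the cost model change addresses.

```c
#include <assert.h>

#define N 256

/* Same shape as the quoted loop: two loads and one store per
   iteration, all through pointers of unknown alignment.  */
static void
multiply_arrays (float *restrict pa, const float *restrict pb,
                 const float *restrict pc, int n)
{
  for (int i = 0; i < n; i++)
    pa[i] = pb[i] * pc[i];
}

/* Small driver (illustrative only) that fills the inputs, runs the
   loop and returns one element so the result can be checked.  */
static float
demo_multiply (void)
{
  static float a[N], b[N], c[N];

  for (int i = 0; i < N; i++)
    {
      b[i] = (float) i;
      c[i] = 2.0f;
    }
  multiply_arrays (a, b, c, N);
  return a[10];
}
```

Compiling with something like -O3 -fdump-tree-vect-details, with and without --param vect-max-peeling-for-alignment=0, should show in the vect dump whether peeling for alignment was applied on a given target; the exact dump wording may differ between GCC versions.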
Re: [PATCH] Improve vectorizer peeling for alignment costmodel
On Tue, 9 May 2017, Richard Biener wrote: > On Fri, 5 May 2017, Christophe Lyon wrote: > > > Hi Richard, > > > > > > On 3 May 2017 at 10:19, Richard Biener wrote: > > > > > > The following extends the very simplistic cost modeling I added somewhen > > > late in the release process to, for all unknown misaligned refs, also > > > apply this model for loops containing stores. > > > > > > The model basically says it's useless to peel for alignment if there's > > > only a single DR that is affected or if, in case we'll end up using > > > hw-supported misaligned loads, the cost of misaligned loads is the same > > > as of aligned ones. Previously we'd usually align one of the stores > > > with the theory that this improves (precious) store-bandwith. > > > > > > Note this is only a so slightly conservative (aka less peeling). We'll > > > still apply peeling for alignment if you make the testcase use += > > > because then we'll align both the load and the store from v1. > > > > > > Bootstrap / regtest running on x86_64-unknown-linux-gnu. > > > > > > Richard. > > > > > > 2017-05-03 Richard Biener > > > > > > * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): > > > When all DRs have unknown misaligned do not always peel > > > when there is a store but apply the same costing model as if > > > there were only loads. > > > > > > * gcc.dg/vect/costmodel/x86_64/costmodel-alignpeel.c: New > > > testcase. 
> > > > > > > This patch (r247544) caused regressions on aarch64 and arm: > > - PASS now FAIL [PASS => FAIL]: > > > > Executed from: gcc.dg/vect/vect.exp > > gcc.dg/vect/vect-44.c -flto -ffat-lto-objects > > scan-tree-dump-times vect "Alignment of access forced using peeling" 1 > > gcc.dg/vect/vect-44.c -flto -ffat-lto-objects > > scan-tree-dump-times vect "Vectorizing an unaligned access" 2 > > gcc.dg/vect/vect-44.c scan-tree-dump-times vect "Alignment of > > access forced using peeling" 1 > > gcc.dg/vect/vect-44.c scan-tree-dump-times vect "Vectorizing an > > unaligned access" 2 > > gcc.dg/vect/vect-50.c -flto -ffat-lto-objects > > scan-tree-dump-times vect "Alignment of access forced using peeling" 1 > > gcc.dg/vect/vect-50.c -flto -ffat-lto-objects > > scan-tree-dump-times vect "Vectorizing an unaligned access" 2 > > gcc.dg/vect/vect-50.c scan-tree-dump-times vect "Alignment of > > access forced using peeling" 1 > > gcc.dg/vect/vect-50.c scan-tree-dump-times vect "Vectorizing an > > unaligned access" 2 > > Ok, so the reason is that we no longer peel for alignment for > > for (i = 0; i < N; i++) > { > pa[i] = pb[i] * pc[i]; > } > > which is probably good. This is because the generic aarch64 cost model > (and probaby also arm) has > > 1, /* vec_align_load_cost */ > 1, /* vec_unalign_load_cost */ > 1, /* vec_unalign_store_cost */ > 1, /* vec_store_cost */ > > so there's no benefit in aligning. x86 generic tuning has > > 1,/* vec_align_load_cost. */ > 2,/* vec_unalign_load_cost. */ > 1,/* vec_store_cost. */ > > and vec_unalign_store_cost sharing with vec_unalign_load_cost. > That makes us still apply peeling. > > Fixing this with vect_ testsuite conditions is going to be tricky > so the easiest is to simply disable peeling here. > > Tested on aarch64 and x86_64, committed. > > Richard. > > 2017-05-09 Richard Biener > > * gcc.dg/vect/vect-44.c: Add --param vect-max-peeling-for-alignment=0 > and adjust. > * gcc.dg/vect/vect-50.c: Likewise. 
> > Index: gcc/testsuite/gcc.dg/vect/vect-44.c > === > --- gcc/testsuite/gcc.dg/vect/vect-44.c (revision 247782) > +++ gcc/testsuite/gcc.dg/vect/vect-44.c (working copy) > @@ -1,6 +1,7 @@ > +/* { dg-do compile } */ Without these changes. Those were for aarch64 cross testing. Richard. > /* { dg-require-effective-target vect_float } */ > +/* { dg-additional-options "--param vect-max-peeling-for-alignment=0" } */ > > -#include > #include "tree-vect.h" > > #define N 256 > @@ -65,7 +66,7 @@ int main (void) > two loads to be aligned). */ > > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */ > -/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 > "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */ > -/* { dg-final { scan-tree-dump-times "Alignment of access forced using > peeling" 1 "vect" { xfail { { vect_no_align && { ! vect_hw_misalign } } || {! > vector_alignment_reachable} } } } } */ > +/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 > "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */ > +/* { dg-final { scan-tree-dump-times "Alignment of access forced using > peeling" 0 "vect" { xfail { { vect_no_align &
Re: [PATCH] disable -Walloc-size-larger-than and -Wstringop-overflow for non-C front ends (PR 80545)
On 05/09/2017 02:57 AM, Richard Biener wrote: On Mon, May 8, 2017 at 4:31 PM, Martin Sebor wrote: On 05/04/2017 10:13 PM, Jeff Law wrote: On 04/28/2017 04:02 PM, Martin Sebor wrote: The two options were included in -Wall and enabled for all front ends but only made to be recognized by the driver for the C family of compilers. That made it impossible to suppress those warnings when compiling code for those other front ends (like Fortran). The attached patch adjusts the warnings so that they are only enabled for the C family of front ends and not for any others, as per Richard's suggestion. (The other solution would have been to make the warnings available to all front ends. Since non-C languages don't have a way of calling the affected functions -- or do they? -- this is probably not necessary.) Martin gcc-80545.diff PR driver/80545 - option -Wstringop-overflow not recognized by Fortran gcc/c-family/ChangeLog: PR driver/80545 * c.opt (-Walloc-size-larger-than, -Wstringop-overflow): Enable and make available for the C family only. OK. jeff It turns out that this is not the right fix. I overlooked that -Wstringop-overflow is meant to be enabled by default and while removing the Init(2) bit and replacing it with LangEnabledBy (C ObjC C++ ObjC++, Wall, 2, 0) suppresses the warning in Fortran it also disables it by default in C/C++ unless -Wall is used. By my reading of the Option properties part of the GCC Internals manual there is no way to initialize a warning to on by default while making it available only in a subset of languages. The only way I can think of is to initialize it in the .opt file to something like -1 and then change it at some point to 2 somewhere in the C/C++ front ends. That seems pretty cumbersome. Am I missing some trick? Maybe just enhance the machinery to allow LangEnabledBy (C ObjC C++ ObjC++, , 2, 0) (note empty "by") ? Yes, I was thinking of something along these lines as well. Generalizing it for all options sounds like the right approach. 
Thanks! Martin
Re: [c++ PATCH] PR c++/80682
On 05/09/2017 08:06 AM, Ville Voutilainen wrote: Tested on Linux-x64, not tested with the full suite yet. 2017-05-09 Ville Voutilainen gcc/ PR c++/80682 * cp/method.c (is_trivially_xible): Reject void types. testsuite/ PR c++/80682 * g++.dg/ext/is_trivially_constructible1.C: Add tests for void target. + if (to == void_type_node) +return false; VOID_TYPE_P. ok with that change -- Nathan Sidwell
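For anyone following along, the reason for preferring VOID_TYPE_P is that a pointer comparison against void_type_node only matches the unqualified void type, while VOID_TYPE_P checks the tree code and so also covers cv-qualified variants. A minimal sketch of the user-visible expectation, using the standard traits rather than anything in cp/method.c (the helper name trivially_xible is made up for illustration):

```cpp
#include <type_traits>

// Hypothetical helper, NOT GCC's is_trivially_xible: it only wraps the
// standard library trait so the expected answers can be written down.
template <typename To, typename... From>
constexpr bool trivially_xible() {
  return std::is_trivially_constructible<To, From...>::value;
}

// A void target must be rejected, whatever its cv-qualification.
static_assert(!trivially_xible<void, int>(), "void target");
static_assert(!trivially_xible<const void, int>(), "const void target");
static_assert(!trivially_xible<volatile void>(), "volatile void target");

// An ordinary scalar target is still accepted.
static_assert(trivially_xible<int, int>(), "int from int is trivial");
```

The cv-qualified cases are the ones a plain comparison against the unqualified node would presumably miss.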
RE: [PATCH 1/2] Automatic context save/restore for regular interrupts.
Committed r247795 including the indicated fixes. Thank you for your review, Claudiu
RE: [PATCH 2/2] Fast interrupts support.
Committed r247796. Thank you for your review, Claudiu
RE: [PATCH 3/3] [ARC] Add support for advanced mpy/mac instructions.
Added two new tests. Committed r247797. Thank you for your review, Claudiu
Re: [PATCH] Pretty-printing of some unsupported expressions (PR c/35441)
On Mon, 8 May 2017, Volker Reichelt wrote: > 2017-05-07 Volker Reichelt > > PR c/35441 > * c-pretty-print.c (c_pretty_printer::expression): Handle MAX_EXPR, > MIN_EXPR, EXACT_DIV_EXPR, RDIV_EXPR, LROTATE_EXPR, RROTATE_EXPR. > (c_pretty_printer::postfix_expression): Handle MAX_EXPR, MIN_EXPR. > (c_pretty_printer::multiplicative_expression): Handle EXACT_DIV_EXPR, > RDIV_EXPR. > (pp_c_shift_expression): Handle LROTATE_EXPR, RROTATE_EXPR. OK. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH 1/2] (v2) C++ template type diff printing
On 05/09/2017 10:07 AM, David Malcolm wrote: +static const char * +type_to_string_with_compare (tree type, tree peer, bool verbose, +bool show_color) +{ + for (int idx = 0; idx < len_max; idx++) +{ ... + if (idx + 1 < len_max) + pp_character (pp, ','); +} My favorite idiom for this is to place the test at the beginning of the loop and avoid an extra loop value check. (perhaps optimizers are smart enough to tell that 'idx + 1 < max' and 'idx++, idx < max' are the same these days) if (idx) pp_character (pp, ','); +static void +print_template_tree_comparison (pretty_printer *pp, tree type_a, tree type_b, + bool verbose, int indent) It looks to me that type_to_string_with_compare and print_template_tree_comparison are doing very nearly the same thing, with a little formatting difference. How hard would it be to have them forward to a worker function? + unsigned int chunk_idx; + for (chunk_idx = 0; args[chunk_idx]; chunk_idx++) +; I have a fondness for putting 'continue;' as the body of such empty loops. Dunno if that's style-compliant though. +void +cxx_format_postprocessor::handle (pretty_printer *pp) +{ + /* If we have one of %H and %I, the other should have + been present. */ + if (m_type_a.m_tree || m_type_b.m_tree) +{ + gcc_assert (m_type_a.m_tree); + gcc_assert (m_type_b.m_tree); +} + if (m_type_a.m_tree && m_type_b.m_tree) As you fall into this. Why not simply if (m_type_a.m_tree || m_type_b.m_tree) { do stuff that will seg fault if one's null } + gcc_assert (type_a.m_tree); And these asserts are confusing, because some, at least, seem to be checking the if condition. + gcc_assert (type_a.m_buffer_ptr); + gcc_assert (type_b.m_tree); + gcc_assert (type_b.m_buffer_ptr); Generally the C++ bits look good, and my style comments are FWIW not obligatory. 1) is it possible to commonize the two functions I mention 2) cleanup the unnecessary asserts in cxx_format_postprocessor::handle non-c++ bits not reviewed. nathan -- Nathan Sidwell
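To illustrate the separator idiom mentioned in the review (test at the top of the loop body instead of 'idx + 1 < len_max' at the bottom), here is a small standalone sketch. It builds a std::string rather than using pretty_printer, and join_args is a made-up name, not part of the patch:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the printing loop: emits a ',' before every
// element except the first, so no look-ahead at the loop bound is needed.
std::string join_args(const std::vector<std::string>& args) {
  std::string out;
  for (std::size_t idx = 0; idx < args.size(); idx++) {
    if (idx)
      out += ',';  // separator goes in front of elements 1..n-1
    out += args[idx];
  }
  return out;
}
```

An empty list yields an empty string and a single element yields no separator, the same as the bottom-of-loop form.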
Re: [PATCH, rs6000] Fix vec_xl and vec_xst intrinsics for P8
Hi! On Thu, May 04, 2017 at 04:35:10PM -0500, Bill Schmidt wrote: > In an earlier patch, I changed vec_xl and vec_xst to make use of new > POWER9 instructions when loading or storing vector short/char values. > In so doing, I failed to enable the existing instruction use for > -mcpu=power8, so these were no longer considered valid by the compiler. > Not good. > > This patch fixes the problem by using other existing built-in definitions > when the POWER9 instructions are not available. I've added a test case > to improve coverage and demonstrate that the problem is fixed. > > Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no > regressions. Is this ok for trunk? Yes, thanks! One nit: > --- gcc/config/rs6000/rs6000.c(revision 247560) > +++ gcc/config/rs6000/rs6000.c(working copy) > @@ -18183,6 +18183,17 @@ altivec_init_builtins (void) >def_builtin ("__builtin_vsx_st_elemrev_v16qi", > void_ftype_v16qi_long_pvoid, VSX_BUILTIN_ST_ELEMREV_V16QI); > } > + else > +{ > + rs6000_builtin_decls[(int)VSX_BUILTIN_LD_ELEMREV_V8HI] > + = rs6000_builtin_decls[(int)VSX_BUILTIN_LXVW4X_V8HI]; There should be a space after the cast operators. Segher
[gomp4] OpenACC update if_present runtime support
This patch adds runtime support for the OpenACC update if_present clause. It turned out to require significantly less work to implement if_present in the runtime. Instead of creating a new API for GOACC_updated, I exploited the fact that OpenACC no longer uses GOMP_MAP_FORCE_* data mappings. This allowed me to encode the if_present update data mappings as GOMP_MAP_{TO,FROM} for device, and host/self, respectively, during gimplification. The actual runtime changes themselves were minor; the runtime only needs to call acc_is_present prior to actually updating the device/host data. I've applied this patch to gomp-4_0-branch. Cesar 2017-05-09 Cesar Philippidis gcc/ * gimplify.c (gimplify_omp_target_update): Relax OpenACC update data mappings to GOMP_MAP_{TO,FROM} when the user specifies if_present. gcc/testsuite/ * c-c++-common/goacc/update-if_present-1.c: Update test case. libgomp/ * oacc-parallel.c (GOACC_update): Handle GOMP_MAP_{TO,FROM} for the if_present data clauses. * testsuite/libgomp.oacc-c-c++-common/update-2.c: New test. * testsuite/libgomp.oacc-fortran/update-3.f90: New test. diff --git a/gcc/gimplify.c b/gcc/gimplify.c index af908f4..47fe9ee 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -10034,6 +10034,25 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p) ort, TREE_CODE (expr)); gimplify_adjust_omp_clauses (pre_p, NULL, &OMP_STANDALONE_CLAUSES (expr), TREE_CODE (expr)); + if (TREE_CODE (expr) == OACC_UPDATE + && find_omp_clause (OMP_STANDALONE_CLAUSES (expr), OMP_CLAUSE_IF_PRESENT)) +{ + /* The runtime uses GOMP_MAP_{TO,FROM} to denote the if_present + clause. 
*/ + for (tree c = OMP_STANDALONE_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c)) + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP) + switch (OMP_CLAUSE_MAP_KIND (c)) + { + case GOMP_MAP_FORCE_TO: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_TO); + break; + case GOMP_MAP_FORCE_FROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FROM); + break; + default: + break; + } +} stmt = gimple_build_omp_target (NULL, kind, OMP_STANDALONE_CLAUSES (expr)); gimplify_seq_add_stmt (pre_p, stmt); diff --git a/gcc/testsuite/c-c++-common/goacc/update-if_present-1.c b/gcc/testsuite/c-c++-common/goacc/update-if_present-1.c index 519393cf..5a19ee1 100644 --- a/gcc/testsuite/c-c++-common/goacc/update-if_present-1.c +++ b/gcc/testsuite/c-c++-common/goacc/update-if_present-1.c @@ -12,6 +12,18 @@ t () #pragma acc update device(b) async if_present #pragma acc update host(c[1:3]) wait(4) if_present #pragma acc update self(c) device(b) host (a) async(10) if (a == 5) if_present + +#pragma acc update self(a) +#pragma acc update device(b) async +#pragma acc update host(c[1:3]) wait(4) +#pragma acc update self(c) device(b) host (a) async(10) if (a == 5) } -/* { dg-final { scan-tree-dump-times "pragma omp target oacc_update if_present" 4 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "omp target oacc_update if_present map.from:a .len: 4.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "omp target oacc_update if_present async.-1. map.to:b .len: 4.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "omp target oacc_update if_present wait.4. map.from:c.1. .len: 12.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "omp target oacc_update if_present if.D.. async.10. map.from:a .len: 4.. map.to:b .len: 4.. map.from:c .len: 40.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "omp target oacc_update map.force_from:a .len: 4.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "omp target oacc_update async.-1. map.force_to:b .len: 4.." 
1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "omp target oacc_update wait.4. map.force_from:c.1. .len: 12.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "omp target oacc_update if.D.. async.10. map.force_from:a .len: 4.. map.force_to:b .len: 4.. map.force_from:c .len: 40.." 1 "omplower" } } */ diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c index 66acdf6..de70ac0 100644 --- a/libgomp/oacc-parallel.c +++ b/libgomp/oacc-parallel.c @@ -683,14 +683,29 @@ GOACC_update (int device, size_t mapnum, /* Restore the host pointer. */ *(uintptr_t *) hostaddrs[i] = t; + update_device = false; } break; + case GOMP_MAP_TO: + if (!acc_is_present (hostaddrs[i], sizes[i])) + { + update_device = false; + break; + } + /* Fallthru */ case GOMP_MAP_FORCE_TO: update_device = true; acc_update_device (hostaddrs[i], sizes[i]); break; + case GOMP_MAP_FROM: + if (!acc_is_present (hostaddrs[i], sizes[i])) + { + update_device = false; + break; + } + /* Fallthru */ case GOMP_MAP_FORCE_FROM: update_device = false; acc_update_self (hostaddrs[i], sizes[i]); diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/update-2.c b/libgomp
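The control flow added to GOACC_update can be sketched in isolation as follows. This is only a model under stated assumptions: the enumerators, the Action type and plan_update are invented for illustration, and the 'present' flag stands in for the result of acc_is_present; none of this is libgomp's actual interface.

```cpp
// Hypothetical mapping kinds mirroring GOMP_MAP_{TO,FROM} and their
// FORCE variants as used in the patch description.
enum MapKind { MAP_TO, MAP_FORCE_TO, MAP_FROM, MAP_FORCE_FROM };

// What the runtime would do for one mapping (illustrative only).
enum Action { SKIP, UPDATE_DEVICE, UPDATE_SELF };

// 'present' models the answer acc_is_present would give for the data.
Action plan_update(MapKind kind, bool present) {
  switch (kind) {
    case MAP_TO:          // if_present form: skip data that is not mapped
      if (!present)
        return SKIP;
      // fall through
    case MAP_FORCE_TO:    // plain update device(...)
      return UPDATE_DEVICE;
    case MAP_FROM:        // if_present form: skip data that is not mapped
      if (!present)
        return SKIP;
      // fall through
    case MAP_FORCE_FROM:  // plain update host/self(...)
      return UPDATE_SELF;
  }
  return SKIP;
}
```

The non-FORCE kinds differ from the FORCE kinds only by the presence check, which is why a fallthrough keeps the runtime change small.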
Re: PR translation/80280
Hi Martin, I am currently unable to build gcc for the x86_64-pc-cygwin target because gcc/config/i386/msformat-c.c uses the format_flag_spec struct but it has not been updated with the new field. :-( For example: gcc/config/i386/msformat-c.c gcc/config/i386/msformat-c.c:52:1: error: cannot convert 'format_std_version' to 'const char*' in initialization gcc/config/i386/msformat-c.c:52:1: warning: missing initializer for member 'format_flag_spec::std' gcc/current/gcc/config/i386/msformat-c.c:52:1: error: cannot convert 'format_std_version' to 'const char*' in initial Do you have time to fix this ? If not, please could you tell me which of the fields in the struct is new, and how it ought to be initialised. Cheers Nick
Re: C++ PATCH for c++/70167, array prvalue treated as lvalue
On Tue, May 9, 2017 at 9:46 AM, Jason Merrill wrote: > The issue here was that we've been trying to treat C++ > list-initialized temporaries (which are rvalues) the same as C99 > compound literals (which are lvalues). This patch distinguishes > between them so we can treat them each correctly. This introduced a failure in the libstdc++ testsuite, which was incorrectly relying on the temporary array backing a std::initializer_list being static even if the std::initializer_list is not; in fact they should have the same storage duration. Fixing thus. commit 0e75edc1775f5afbb3113d840ebb500d091a7003 Author: Jason Merrill Date: Tue May 9 11:06:09 2017 -0400 * testsuite/24_iterators/container_access.cc (test03): Make il3 static. diff --git a/libstdc++-v3/testsuite/24_iterators/container_access.cc b/libstdc++-v3/testsuite/24_iterators/container_access.cc index 42ecb41..7f60d2b 100644 --- a/libstdc++-v3/testsuite/24_iterators/container_access.cc +++ b/libstdc++-v3/testsuite/24_iterators/container_access.cc @@ -55,7 +55,7 @@ test03() std::initializer_list il2{}; VERIFY(std::size(il2) == 0); VERIFY(std::empty(il2)); - constexpr std::initializer_list il3{1,2,3}; + static constexpr std::initializer_list il3{1,2,3}; constexpr auto d = std::data(il3); static_assert(d == il3.begin()); constexpr auto s = std::size(il3);
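A reduced sketch of what the testsuite change relies on, assuming C++14 or later. The function name static_list_size is made up; the point is only that a block-scope initializer_list declared static gets a backing array with static storage duration, so its begin() pointer is usable in a constant expression:

```cpp
#include <cstddef>
#include <initializer_list>

// Hypothetical reduction of the test: with 'static', the backing array
// shares the list's static storage duration, so taking its address in a
// constexpr context is fine. Without 'static' the array is automatic and
// the constexpr pointer below would be rejected.
std::size_t static_list_size() {
  static constexpr std::initializer_list<int> il{1, 2, 3};
  constexpr const int* d = il.begin();
  static_assert(il.size() == 3, "size is a constant expression");
  return static_cast<std::size_t>(il.end() - d);
}
```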
Re: [PATCH] non-checking pure attribute
On 05/09/2017 09:39 AM, Richard Biener wrote: On Tue, May 9, 2017 at 3:33 PM, Nathan Sidwell wrote: I wondered if we'd get sane backtraces and what not, if the optimizer thought such functions never barfed. Well, I think you'd either ICE in the first check or can safely CSE the second. Done -- Nathan Sidwell 2017-05-09 Nathan Sidwell * tree.h (tree_fits_shwi_p, tree_fits_uhwi_p): Unconditionally pure. Index: tree.h === --- tree.h (revision 247784) +++ tree.h (working copy) @@ -4109,15 +4109,9 @@ extern int attribute_list_contained (con extern int tree_int_cst_equal (const_tree, const_tree); extern bool tree_fits_shwi_p (const_tree) -#ifndef ENABLE_TREE_CHECKING - ATTRIBUTE_PURE /* tree_fits_shwi_p is pure only when checking is disabled. */ -#endif - ; + ATTRIBUTE_PURE; extern bool tree_fits_uhwi_p (const_tree) -#ifndef ENABLE_TREE_CHECKING - ATTRIBUTE_PURE /* tree_fits_uhwi_p is pure only when checking is disabled. */ -#endif - ; + ATTRIBUTE_PURE; extern HOST_WIDE_INT tree_to_shwi (const_tree); extern unsigned HOST_WIDE_INT tree_to_uhwi (const_tree); #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003)
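A toy model of the argument above, assuming a compiler that accepts the GNU pure attribute (GCC, Clang). Node, fits_small and count_fits are made-up names; the abort stands in for a tree-checking failure:

```cpp
#include <cstdlib>

struct Node {
  int code;    // hypothetical tree code; 1 means "integer constant"
  long value;
};

// Declared pure even though the checking path may abort.
__attribute__((pure)) bool fits_small(const Node* n);

bool fits_small(const Node* n) {
  if (n->code != 1)
    std::abort();  // models the ICE from a failed tree check
  return n->value >= -128 && n->value <= 127;
}

// The optimizer may fold the second call into the first. If the check
// were going to fail, it would already have failed on the first call.
int count_fits(const Node* n) {
  int hits = 0;
  if (fits_small(n))
    hits++;
  if (fits_small(n))
    hits++;
  return hits;
}
```

Whether or not the second call is eliminated, the observable result is the same, which is the basis for dropping the ENABLE_TREE_CHECKING guard.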
Re: [PATCH, rs6000] gcc mainline, add builtin support for vec_neg()
Hi Carl, On Fri, May 05, 2017 at 02:22:14PM -0700, Carl E. Love wrote: > This patch adds support for the various vec_neg() builtins. > > The patch has been tested on powerpc64le-unknown-linux-gnu (Power 8 LE) > with no regressions. > > Is the patch OK for gcc mainline? Yes please, thanks! Segher > 2017-04-05 Carl Love > >* config/rs6000/rs6000-c: Add support for built-in functions >vector signed char vec_neg (vector signed char) >vector signed short int vec_neg (vector short int) >vector signed int vec_neg (vector signed int) >vector signed long long vec_neg (vector signed long long) >vector float vec_neg (vector float) >vector double vec_neg (vector double) >* config/rs6000/rs6000-builtin.def: Add definitions for NEG function >overload. >* config/rs6000/altivec.h: Add define for vec_neg >* doc/extend.texi: Update the built-in documentation for the >new built-in functions. > > gcc/testsuite/ChangeLog: > > 2017-04-05 Carl Love >* gcc.target/powerpc/builtins-3.c: Add tests for the new built-ins to >the test suite file. >* gcc.target/powerpc/builtins-3-p8.c: Add tests for the new built-ins to >the test suite file.
Re: PR translation/80280
On 05/09/2017 09:08 AM, Nick Clifton wrote: Hi Martin, I am currently unable to build gcc for the x86_64-pc-cygwin target because gcc/config/i386/msformat-c.c uses the format_flag_spec struct but it has not been updated with the new field. :-( For example: gcc/config/i386/msformat-c.c gcc/config/i386/msformat-c.c:52:1: error: cannot convert 'format_std_version' to 'const char*' in initialization gcc/config/i386/msformat-c.c:52:1: warning: missing initializer for member 'format_flag_spec::std' gcc/current/gcc/config/i386/msformat-c.c:52:1: error: cannot convert 'format_std_version' to 'const char*' in initial Do you have time to fix this ? If not, please could you tell me which of the fields in the struct is new, and how it ought to be initialised. Rats! I ran into this when building for sparcv9-solaris2.11 but didn't look at the cause of the error carefully enough to recognize it was caused by my change so I just raised 80673 for it. It didn't occur to me that targets defined their own format extensions like this. What a tangled mess. The fix should be trivial. My change added just a single member: format_flag_spec::quoting. I can take care of it today, along with bug 80673. Sorry about that! Things are never as simple as they seem. Martin
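For readers wondering how one added member produces errors of this kind: positional aggregate initializers shift when a member is inserted in the middle of a struct. The sketch below uses a made-up struct and made-up member order (flag_spec, std_version, and the position of 'quoting' are assumptions for illustration, not the real format_flag_spec layout):

```cpp
// Hypothetical stand-ins; the real definitions live in GCC's c-format code.
enum std_version { STD_C89, STD_C99 };

struct flag_spec {
  int flag_char;
  int predicate;
  int skip_next_char;
  int quoting;            // newly added member (position assumed)
  const char* name;
  const char* long_name;
  std_version std;
};

static const flag_spec ms_flags[] = {
  // An entry written before the new member existed would look like
  //   { ' ', 0, 0, "' ' flag", "the ' ' printf flag", STD_C89 }
  // and would no longer compile: every value after skip_next_char lands
  // one member too early and the last member is left uninitialized.
  { ' ', 0, 0, 0, "' ' flag", "the ' ' printf flag", STD_C89 },
  { 0, 0, 0, 0, nullptr, nullptr, STD_C89 }  // terminator
};

// Look up a flag's short name; returns nullptr for unknown flags.
const char* flag_name(int c) {
  for (const flag_spec* p = ms_flags; p->flag_char; ++p)
    if (p->flag_char == c)
      return p->name;
  return nullptr;
}
```

Target-specific tables therefore need one extra initializer per entry wherever the new member sits.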
Re: PR translation/80280
Hi Martin, > Rats! I ran into this when building for sparcv9-solaris2.11 but > didn't look at the cause of the error carefully enough to recognize > it was caused by my change so I just raised 80673 for it. It didn't > occur to me that targets defined their own format extensions like > this. What a tangled mess. Oh go on - if it was easy you would be bored... > The fix should be trivial. My change added just a single member: > format_flag_spec::quoting. I can take care of it today, along > with bug 80673. Thanks very much! Cheers Nick
[PATCH, Fortran] PR 80668: wrong error message with -finit-derived
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80668 All, The following fixes a bug exposed in PR 80668 in which the compiler attempted to generate an initializer for components of derived types with the POINTER attribute when compiling with -finit-derived and -finit-*=*. It is nonsensical for components with the POINTER attribute to have initializers generated in this way. The resolution phase caught the resulting contradictions, resulting in an error for the testcase given in the PR. With the patch all regression tests pass except for the following, which already fail in the current trunk (r247800): Running /data/midas/foreese/src/gcc-dev/gcc/testsuite/gfortran.dg/dg.exp ... FAIL: gfortran.dg/coarray_lock_7.f90 -O scan-tree-dump-times original "_gfortran_caf_lock \\(caf_token.., \\(3 - \\(integer\\(kind=4\\)\\) parm...dim\\[0\\].lbound\\) \\+ \\(integer\\(kind=4\\)\\) MAX_EXPR <\\(parm...dim\\[0\\].ubound - parm...dim\\[0\\].lbound\\) \\+ 1, 0> \\* \\(3 - \\(integer\\(kind=4\\)\\) parm...dim\\[1\\].lbound\\), 0, 0B, &ii, 0B, 0\\);|_gfortran_caf_lock \\(caf_token.1, \\(3 - parm...dim\\[0\\].lbound\\) \\+ MAX_EXPR <\\(parm...dim\\[0\\].ubound - parm...dim\\[0\\].lbound\\) \\+ 1, 0> \\* \\(3 - parm...dim\\[1\\].lbound\\), 0, 0B, &ii, 0B, 0\\);" 1 FAIL: gfortran.dg/coarray_lock_7.f90 -O scan-tree-dump-times original "_gfortran_caf_unlock \\(caf_token.., \\(2 - \\(integer\\(kind=4\\)\\) parm...dim\\[0\\].lbound\\) \\+ \\(integer\\(kind=4\\)\\) MAX_EXPR <\\(parm...dim\\[0\\].ubound - parm...dim\\[0\\].lbound\\) \\+ 1, 0> \\* \\(3 - \\(integer\\(kind=4\\)\\) parm...dim\\[1\\].lbound\\), 0, &ii, 0B, 0\\);|_gfortran_caf_unlock \\(caf_token.., \\(2 - parm...dim\\[0\\].lbound\\) \\+ MAX_EXPR <\\(parm...dim\\[0\\].ubound - parm...dim\\[0\\].lbound\\) \\+ 1, 0> \\* \\(3 - parm...dim\\[1\\].lbound\\), 0, &ii, 0B, 0\\);" 1 FAIL: gfortran.dg/mvbits_7.f90 -O0 (test for warnings, line 28) OK for trunk? 
--- Fritz Reese 2017-05-09 Fritz Reese PR fortran/80668 gcc/fortran/ChangeLog: PR fortran/80668 * expr.c (component_initializer): Don't generate initializers for pointer components. * invoke.texi (-finit-derived): Document. gcc/testsuite/ChangeLog: PR fortran/80668 * gfortran.dg/pr80668.f90: New. From bd14dcb2026636052a50bb5a7f7075de0fc3b856 Mon Sep 17 00:00:00 2001 From: Fritz Reese Date: Tue, 9 May 2017 08:42:19 -0400 Subject: [PATCH] 2017-05-09 Fritz Reese PR fortran/80668 gcc/fortran/ * expr.c (component_initializer): Don't generate initializers for pointer components. * invoke.texi (-finit-derived): Document. * gcc/testsuite/gfortran.dg/pr80668.f90: New. --- gcc/fortran/expr.c| 8 ++-- gcc/fortran/invoke.texi | 2 ++ gcc/testsuite/gfortran.dg/pr80668.f90 | 32 3 files changed, 40 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/pr80668.f90 diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c index c8be9513af5..7ea9d8233a9 100644 --- a/gcc/fortran/expr.c +++ b/gcc/fortran/expr.c @@ -4280,9 +4280,13 @@ component_initializer (gfc_typespec *ts, gfc_component *c, bool generate) { gfc_expr *init = NULL; - /* See if we can find the initializer immediately. */ + /* See if we can find the initializer immediately. + Some components should never get initializers. */ if (c->initializer || !generate - || (ts->type == BT_CLASS && !c->attr.allocatable)) + || (ts->type == BT_CLASS && !c->attr.allocatable) + || c->attr.pointer + || c->attr.class_pointer + || c->attr.proc_pointer) return c->initializer; /* Recursively handle derived type components. */ diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi index 636432fead8..8a1d09dd5e5 100644 --- a/gcc/fortran/invoke.texi +++ b/gcc/fortran/invoke.texi @@ -1665,6 +1665,8 @@ according to these flags only with @option{-finit-derived}. 
These options do not initialize @itemize @bullet @item +objects with the POINTER attribute +@item allocatable arrays @item variables that appear in an @code{EQUIVALENCE} statement. diff --git a/gcc/testsuite/gfortran.dg/pr80668.f90 b/gcc/testsuite/gfortran.dg/pr80668.f90 new file mode 100644 index 000..f69327c3302 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr80668.f90 @@ -0,0 +1,32 @@ +! { dg-do compile } +! { dg-options "-finit-derived -finit-integer=12345678" } +! +! PR fortran/80668 +! +! Test a regression where structure constructor expressions were created for +! POINTER components with -finit-derived. +! + +MODULE pr80668 + IMPLICIT NONE + TYPE :: dist_t + INTEGER :: TYPE,nblks_loc,nblks + INTEGER,DIMENSION(:),POINTER :: dist ! regression: "element in structure + ! constructor... should be a POINTER" + END TYPE dist_t + +CONTAINS + + SUBROUTINE hfx_new() +TYPE(dist_t) :: dist +integer,pointer
Re: Fix for PR79987
If we use chkp_generate_extern_var_bounds for void variable just as for arrays with unknown size, we will create the following gimple seq: # VUSE <.MEM> __size_tmp.0 = __builtin_ia32_sizeof (foo); __size_tmp.1_3 = __size_tmp.0; However, this will fail in verify_gimple_call: tree arg = gimple_call_arg (stmt, i); if ((is_gimple_reg_type (TREE_TYPE (arg)) && !is_gimple_val (arg)) || (!is_gimple_reg_type (TREE_TYPE (arg)) && !is_gimple_lvalue (arg))) { error ("invalid argument to gimple call"); debug_generic_expr (arg); return true; } ..here the TREE_TYPE(arg)==void. Any ideas for a good workaround ? Alexander 2017-04-08 21:59 GMT+02:00 Ilya Enkovich : > 2017-04-04 18:34 GMT+03:00 Jeff Law : >> On 04/04/2017 09:07 AM, Alexander Ivchenko wrote: >>> >>> Hi, >>> >>> When creating static bounds for foo below we end up with: >>> >>> *((unsigned long *) &__chkp_bounds_of_foo + 8) = >>> ~(__builtin_ia32_sizeof (foo) + ((long unsigned int) &foo + >>> 18446744073709551615)); >>> >>> This fails in gimplify_function_tree with gcc_assert (!VOID_TYPE_P >>> (TREE_TYPE (*expr_p))); >>> >>> Is it OK? >>> >>> gcc/ChangeLog: >>> >>> 2017-04-04 Alexander Ivchenko >>> >>> * tree-chkp.c (chkp_get_bounds_for_decl_addr): >>> assigning zero bounds to void variables >>> >>> >>> gcc/testsuite/ChangeLog: >>> >>> 2017-04-04 Alexander Ivchenko >>> >>> * gcc.target/i386/mpx/PR79987.c: New test. >> >> I've put this (and other CHKP fixes) in the queue for gcc-8 as AFAICT it's >> not a regression. >> >> Jeff >> > > Hi, > > If we delay it for GCC8 anyway then I think we may fix it in a better way. If > we > cannot detect size of a variable then size relocations may be used. It is done > already for arrays with unknown size and also can be done for void vars. > > Thanks, > Ilya
[PATCH] Kill -fdump-translation-unit
-fdump-translation-unit is an inscrutably opaque dump. It turned out that most of the uses of the tree-dump header file was to indirectly get at dumpfile.h, and the dump_function entry point it had forwarded to a dumper in tree-cfg.c. The gimple dumper would use its node dumper when asked for a raw dump, but that was about it. We have prettier printers now. This patch nukes the tu dumper. ok? nathan -- Nathan Sidwell 2017-05-09 Nathan Sidwell Remove -fdump-translation-unit. gcc/ * Makefile.in (TREE_DUMP_H): Delete. (C_COMMON_OBJS): Remove c-family/c-dump.o. (OBJS): Remove tree-dump.o. (PLUGIN_HEADERS): Remove $(TREE_DUMP_H). * cgraphclones.c: Include dumpfile.h not tree-dump.h. * doc/invoke.texi: Remove -fdump-translation-unit. * dumpfile.h (TDI_tu): Delete. (dump_function): Declare. (dump_node): Delete. * dumpfile.c: Include tree-cfg.h. (dump_files): Remove ".tu" line. (FIRST_AUTO_NUMBERED_DUMP): Decrement. (dump_function): New, from tree-dump.c. * tree-dump.h: Delete. * tree-dump.c: Delete. * gimplify.h: Include splay-tree.h not tree-dump.h. * graphite-poly.c: Don't include tree-dump.h * langhooks-def.h (lhd_tree_dump_dump_tree, lhd_tree_dump_type_quals): Don't declare. (LANG_HOOKS_TREE_DUMP_DUMP_TREE_FN, LANG_HOOKS_TREE_DUMP_TYPE_QUALS_FN, LANG_HOOKS_TREE_DUMP_INITIALIZER): Delete. * langhooks.c (lhd_tree_dump_dump_tree, lhd_tree_dump_type_quals): Delete. * langhooks.h (lang_hooks_for_tree_dump): Delete. (lang_hooks): Remove tree_dump field. * print-tree.c: Include dumpfile.h not tree-dump.h. * stor-layout.c: Likewise. * tree-nested.c: Likewise. * tree-cfg.c (dump_function_to_file): Don't dump raw node. gcc/c/ * c-decl.c (c_parse_final_cleanups): Don't dump cleanup nodes. * gimple-parser.c: Don't #include tree-dump.h gcc/c-family/ * c-dump.c: Delete. * c-gimplify.c (c_genericize): Don't raw dump the saved tree. gcc/cp/ * Make-lang.in (CXX_AND_OBJCXX_OBJS): Remove cp/dump.o. * decl2.c (dump_tu): Delete. (c_parse_final_cleanups): Don't dump_tu. * dump.c: Delete. 
gcc/fortran/ * trans-decl.c: Include dumpfile.h, not tree-dump.h. Index: Makefile.in === --- Makefile.in (revision 247784) +++ Makefile.in (working copy) @@ -970,7 +970,6 @@ OPTS_H = $(INPUT_H) $(VEC_H) opts.h $(OB SYMTAB_H = $(srcdir)/../libcpp/include/symtab.h $(OBSTACK_H) CPP_ID_DATA_H = $(CPPLIB_H) $(srcdir)/../libcpp/include/cpp-id-data.h CPP_INTERNAL_H = $(srcdir)/../libcpp/internal.h $(CPP_ID_DATA_H) -TREE_DUMP_H = tree-dump.h $(SPLAY_TREE_H) $(DUMPFILE_H) TREE_PASS_H = tree-pass.h $(TIMEVAR_H) $(DUMPFILE_H) TREE_SSA_H = tree-ssa.h tree-ssa-operands.h \ $(BITMAP_H) sbitmap.h $(BASIC_BLOCK_H) $(GIMPLE_H) \ @@ -1181,7 +1180,7 @@ GCC_OBJS = gcc.o gcc-main.o ggc-none.o c-family-warn = $(STRICT_WARN) # Language-specific object files shared by all C-family front ends. -C_COMMON_OBJS = c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o \ +C_COMMON_OBJS = c-family/c-common.o c-family/c-cppbuiltin.o \ c-family/c-format.o c-family/c-gimplify.o c-family/c-indentation.o \ c-family/c-lex.o c-family/c-omp.o c-family/c-opts.o c-family/c-pch.o \ c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o \ @@ -1483,7 +1482,6 @@ OBJS = \ tree-data-ref.o \ tree-dfa.o \ tree-diagnostic.o \ - tree-dump.o \ tree-eh.o \ tree-emutls.o \ tree-if-conv.o \ @@ -3409,7 +3407,7 @@ PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $ toplev.h $(DIAGNOSTIC_CORE_H) $(BASIC_BLOCK_H) $(HASH_TABLE_H) \ tree-ssa-alias.h $(INTERNAL_FN_H) gimple-fold.h tree-eh.h gimple-expr.h \ gimple.h is-a.h memmodel.h $(TREE_PASS_H) $(GCC_PLUGIN_H) \ - $(GGC_H) $(TREE_DUMP_H) $(PRETTY_PRINT_H) $(OPTS_H) $(PARAMS_H) \ + $(GGC_H) $(PRETTY_PRINT_H) $(OPTS_H) $(PARAMS_H) \ $(tm_file_list) $(tm_include_list) $(tm_p_file_list) $(tm_p_include_list) \ $(host_xm_file_list) $(host_xm_include_list) $(xm_include_list) \ intl.h $(PLUGIN_VERSION_H) $(DIAGNOSTIC_H) ${C_TREE_H} \ Index: c/c-decl.c === --- c/c-decl.c (revision 247784) +++ c/c-decl.c (working copy) @@ -11240,18 +11240,6 @@ c_parse_final_cleanups (void) 
dump_ada_specs (collect_all_refs, NULL); } - if (ext_block) -{ - tree tmp = BLOCK_VARS (ext_block); - int flags; - FILE * stream = dump_begin (TDI_tu, &flags); - if (stream && tmp) - { - dump_node (tmp, flags & ~TDF_SLIM, stream); - dump_end (TDI_tu, stream); - } -} - /* Process all file scopes in this compilation, and the external_scope, through wrapup_global_declarations. */ FOR_EACH_VEC_ELT (*all_translation_units, i, t) Index: c/gimple-parser.c === --- c/gimple-parser.c (revision 247784) +++ c/gimple-parser.c (working copy) @@ -53,7 +53,6 @@ along with GCC; see the file COPYING3. #include "tree-ssanames.h" #include "gimpl
Re: PR translation/80280
On 05/09/2017 09:33 AM, Nick Clifton wrote: Hi Martin, Rats! I ran into this when building for sparcv9-solaris2.11 but didn't look at the cause of the error carefully enough to recognize it was caused by my change so I just raised 80673 for it. It didn't occur to me that targets defined their own format extensions like this. What a tangled mess. Oh go on - if it was easy you would be bored... The fix should be trivial. My change added just a single member: format_flag_spec::quoting. I can take care of it today, along with bug 80673. Thanks very much! I just committed the Cygwin fix. It bootstraps for me but I didn't/couldn't run any tests so please give it a try and let me know if there's anything else. Martin
Re: [PATCH 1/2] C++ template type diff printing
On Thu, May 04, 2017 at 07:44:58PM -0400, David Malcolm wrote: > This patch kit implements two new options to make it easier > to read diagnostics involving mismatched template types: > -fdiagnostics-show-template-tree and > -fno-elide-type. > > It adds two new formatting codes: %H and %I which are > equivalent to %qT, but are to be used together for type > comparisons e.g. > "can't convert from %H to %I". Just a note, I believe we need to teach gettext about all the extensions we add, so that xgettext doesn't give up on those. Dunno if it has been adjusted for the additions done in the last few years. $ grep gcc-internal-format /usr/share/doc/gettext/NEWS * Updated the meaning of 'gcc-internal-format' to match GCC 4.3. * Updated the meaning of 'gcc-internal-format' to match GCC 4.1. * Updated the meaning of 'gcc-internal-format' to match GCC 4.0. as 'gcc-internal-format'. doesn't look good. The source file is format-gcc-internal.c. Jakub
[committed] initialize target-specific format_flag_spec members for Cygwin and Solaris
As a result of adding a data member to the format_flag_spec struct my change to enhance -Wformat to warn about poorly quoted GCC diagnostics managed to break bootstrap for the Cygwin and Solaris targets, both of which make use of the struct to define their own printf format extensions. I committed r247801 to fix the Cygwin bootstrap, and r247804 to fix Solaris. Martin
Re: [PATCH, Fortran] PR 80668: wrong error message with -finit-derived
Fritz, Thanks for the quick fix, IMO on the obvious side. Two minor nits: (1) you don’t need the lines (as in "please don’t use them") + +! { dg-final { cleanup-modules "pr80668" } } (2) I don’t understand the comments after 'INTEGER,DIMENSION(:),POINTER :: dist’. TIA Dominique PS The failures for gfortran.dg/coarray_lock_7.f90 should be fixed now.
PR77644
Hi, The attached patch adds the following pattern to match.pd: sqrt(x) cmp sqrt(y) -> x cmp y. It is enabled with -funsafe-math-optimizations and -fno-math-errno. Bootstrapped+tested on x86_64-unknown-linux-gnu. Cross-tested on arm*-*-*, aarch64*-*-*. OK for trunk ? Thanks, Prathamesh 2017-05-09 Prathamesh Kulkarni PR tree-optimization/77644 * match.pd (sqrt(x) cmp sqrt(y) -> x cmp y): New pattern. testsuite/ * gcc.dg/tree-ssa/pr77644.c: New test-case. diff --git a/gcc/match.pd b/gcc/match.pd index e3d98baa12f..9929f5a1c16 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -2633,7 +2633,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (GENERIC) (truth_andif (ge @0 { build_real (TREE_TYPE (@0), dconst0); }) - (cmp @0 { build_real (TREE_TYPE (@0), c2); } + (cmp @0 { build_real (TREE_TYPE (@0), c2); }) + + /* PR77644: Transform sqrt(x) cmp sqrt(y) -> x cmp y. */ + (simplify +(cmp (sq @0) (sq @1)) + (if (! HONOR_NANS (type)) + (cmp @0 @1)) /* Fold A /[ex] B CMP C to A CMP B * C. */ (for cmp (eq ne) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr77644.c b/gcc/testsuite/gcc.dg/tree-ssa/pr77644.c new file mode 100644 index 000..30da66374e6 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr77644.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target c99_runtime } */ +/* { dg-options "-O2 -fdump-tree-optimized -funsafe-math-optimizations -fno-math-errno" } */ + +#define FOO(type, cmp, suffix, no) \ +int f_##no(type x, type y) \ +{ \ + type gen_##no(); \ + type xs = __builtin_sqrt##suffix((gen_##no())); \ + type xy = __builtin_sqrt##suffix((gen_##no())); \ + return (xs cmp xy); \ +} + +#define GEN_FOO(type, suffix) \ +FOO(type, <, suffix, suffix##1) \ +FOO(type, <=, suffix, suffix##2) \ +FOO(type, >, suffix, suffix##3) \ +FOO(type, >=, suffix, suffix##4) \ +FOO(type, ==, suffix, suffix##5) \ +FOO(type, !=, suffix, suffix##6) + +GEN_FOO(float, f) +GEN_FOO(double, ) +GEN_FOO(long double, l) + +/* { dg-final { scan-tree-dump-not "__builtin_sqrtf" "optimized" } } */ +/* { dg-final { scan-tree-dump-not "__builtin_sqrt" "optimized" } } */ +/* { dg-final { scan-tree-dump-not "__builtin_sqrtl" "optimized" } } */
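For readers following the PR77644 pattern, here is a short sketch of why the rewrite is only reasonable under the stated guards. It assumes real-valued, non-negative inputs and exact (unrounded) square roots; that idealisation is my assumption for illustration, not something stated in the patch.

```latex
% Assumption: x, y are non-negative reals and the square root is exact.
% The square root is strictly increasing on [0, \infty), so every
% comparison operator carries over unchanged:
\[
  x < y \iff \sqrt{x} < \sqrt{y},
  \qquad
  x = y \iff \sqrt{x} = \sqrt{y}
  \qquad (x, y \ge 0).
\]
% Counterexample once negative inputs, and hence NaNs, are allowed:
\[
  x = -1,\ y = 4:\quad
  x < y \ \text{is true, but}\ \sqrt{x} < \sqrt{y}\ \text{is false,
  because}\ \sqrt{-1}\ \text{is NaN.}
\]
```

This is consistent with the pattern being guarded by ! HONOR_NANS. Separately, a rounded floating-point square root can map two distinct inputs to the same result, so sqrt(x) == sqrt(y) may hold where x == y does not; that is presumably why the transform sits behind -funsafe-math-optimizations.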
[PATCH, committed] PR80611 [8 regression] test case gfortran.dg/coarray_lock_7.f90 fails starting with r247495
I just committed a patch at revision r247803 with updated dg-final regexps (approved by Rich in bugzilla). Dominique
Re: [PATCH], Fix PR 68163, PowerPC power8 sometimes generating move direct to GPR to store 32-bit float
Hi Mike, On Fri, May 05, 2017 at 07:43:25PM -0400, Michael Meissner wrote: > This code does stores in Altivec registers by moving the value to FPR and > using > the traditional STFS instruction. However, in looking at the code, I came to > the conclusion that we could do better (PR 80510) by using a peephole2 to > load the offset value into a GPR and doing an indexed store. I have code for > PR 80510 that I will submit after this patch. That patch needs this patch to > prevent using direct move to do a store. > > Is this patch ok for GCC 8? How about GCC 7.2? > +;; Originally, we tried to keep movsf and movsd common, but the differences > +;; addressing was making it rather difficult to hide with mode attributes. > In "difference in addressing", maybe? The patch is okay for trunk; please let it simmer there for a bit before putting it on 7 as well. Thanks, Segher
[RFC PATCH, i386]: Enable post-reload compare elimination pass
Hello! Attached patch enables post-reload compare elimination pass by providing expected patterns (duplicates of existing patterns with setters of reg and flags switched in the parallel) for flag setting arithmetic instructions. The merge triggers more than 3000 times during the gcc bootstrap, mostly in cases where intervening memory load or store prevents combine from merging the arithmetic insn and the following compare. Also, some recent linux x86_64 defconfig build results in ~200 merges, removing ~200 test/cmp insns. Not much, but I think the results still warrant the pass to be enabled. 2017-05-09 Uros Bizjak * config/i386/i386-protos.h (ix86_match_ccmode_last): New prototype. * config/i386/i386.c (ix86_match_ccmode_1): Rename from ix86_match_ccmode. Add "last" argument. Make function static inline. (ix86_match_ccmode): New function. (ix86_match_ccmode_last): Ditto. (TARGET_FLAGS_REGNUM): Define. * config/i386/i386.md (*add_2b): New insn pattern. (*sub_2b): Ditto. (*and_2b): Ditto. (*_2b): Ditto. Patch was bootstrapped and regression tested on x86_64-linux-gnu. Uros. 
Index: i386-protos.h === --- i386-protos.h (revision 247793) +++ i386-protos.h (working copy) @@ -125,6 +125,7 @@ extern void ix86_split_copysign_const (rtx []); extern void ix86_split_copysign_var (rtx []); extern bool ix86_unary_operator_ok (enum rtx_code, machine_mode, rtx[]); extern bool ix86_match_ccmode (rtx, machine_mode); +extern bool ix86_match_ccmode_last (rtx, machine_mode); extern void ix86_expand_branch (enum rtx_code, rtx, rtx, rtx); extern void ix86_expand_setcc (rtx, enum rtx_code, rtx, rtx); extern bool ix86_expand_int_movcc (rtx[]); Index: i386.c === --- i386.c (revision 247793) +++ i386.c (working copy) @@ -22337,12 +22337,8 @@ ix86_split_copysign_var (rtx operands[]) emit_insn (gen_rtx_SET (dest, x)); } -/* Return TRUE or FALSE depending on whether the first SET in INSN - has source and destination with matching CC modes, and that the - CC mode is at least as constrained as REQ_MODE. */ - -bool -ix86_match_ccmode (rtx insn, machine_mode req_mode) +static inline bool +ix86_match_ccmode_1 (rtx insn, machine_mode req_mode, bool last) { rtx set; machine_mode set_mode; @@ -22349,7 +22345,7 @@ ix86_split_copysign_var (rtx operands[]) set = PATTERN (insn); if (GET_CODE (set) == PARALLEL) -set = XVECEXP (set, 0, 0); +set = XVECEXP (set, 0, last ? XVECLEN (set, 0) - 1 : 0); gcc_assert (GET_CODE (set) == SET); gcc_assert (GET_CODE (SET_SRC (set)) == COMPARE); @@ -22393,6 +22389,26 @@ ix86_split_copysign_var (rtx operands[]) return GET_MODE (SET_SRC (set)) == set_mode; } +/* Return TRUE or FALSE depending on whether the first SET in INSN + has source and destination with matching CC modes, and that the + CC mode is at least as constrained as REQ_MODE. */ + +bool +ix86_match_ccmode (rtx insn, machine_mode req_mode) +{ + return ix86_match_ccmode_1 (insn, req_mode, false); +} + +/* Return TRUE or FALSE depending on whether the last SET in INSN + has source and destination with matching CC modes, and that the + CC mode is at least as constrained as REQ_MODE. 
*/ + +bool +ix86_match_ccmode_last (rtx insn, machine_mode req_mode) +{ + return ix86_match_ccmode_1 (insn, req_mode, true); +} + /* Generate insn patterns to do an integer compare of OPERANDS. */ static rtx @@ -52043,6 +52059,8 @@ ix86_run_selftests (void) #undef TARGET_ADDRESS_COST #define TARGET_ADDRESS_COST ix86_address_cost +#undef TARGET_FLAGS_REGNUM +#define TARGET_FLAGS_REGNUM FLAGS_REG #undef TARGET_FIXED_CONDITION_CODE_REGS #define TARGET_FIXED_CONDITION_CODE_REGS ix86_fixed_condition_code_regs #undef TARGET_CC_MODES_COMPATIBLE Index: i386.md === --- i386.md (revision 247793) +++ i386.md (working copy) @@ -5917,6 +5917,52 @@ (const_string "*"))) (set_attr "mode" "")]) +(define_insn "*add_2b" + [(set (match_operand:SWI 0 "nonimmediate_operand" "=,m,") + (plus:SWI + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,") + (match_operand:SWI 2 "" ",,0"))) + (set (reg FLAGS_REG) + (compare + (plus:SWI (match_dup 1) (match_dup 2)) + (const_int 0)))] + "ix86_match_ccmode_last (insn, CCGOCmode) + && ix86_binary_operator_ok (PLUS, mode, operands) + && reload_completed" +{ + switch (get_attr_type (insn)) +{ +case TYPE_INCDEC: + if (operands[2] == const1_rtx) + return "inc{}\t%0"; + else + { + gcc_assert (operands[2] == constm1_rtx); + return "dec{}\t%0"; + } + +default: + if (which_alternative == 2) + std::swap (operands[1], operands[2]); + + gcc_assert (rtx_equal_p (operands[0], operands[1
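The core of the i386.c change above is choosing which element of the PARALLEL to inspect: the first one for the existing patterns, the last one for the new "_2b" patterns where the flags setter comes second. A minimal sketch of that selection, using a toy struct and hypothetical names (toy_set, toy_pick_set and friends are illustrative only and are not GCC's rtx API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one element of a PARALLEL; this is not GCC's rtx.  */
struct toy_set
{
  bool is_compare;   /* true if this element sets the flags from a COMPARE */
};

/* Pick the element to inspect: index 0, or the final index when LAST is
   set.  This mirrors "last ? XVECLEN (set, 0) - 1 : 0" from the hunk.  */
static const struct toy_set *
toy_pick_set (const struct toy_set *vec, size_t len, bool last)
{
  assert (len > 0);
  return &vec[last ? len - 1 : 0];
}

/* Thin wrappers, in the same spirit as ix86_match_ccmode and
   ix86_match_ccmode_last sharing ix86_match_ccmode_1.  */
static bool
toy_compare_first (const struct toy_set *vec, size_t len)
{
  return toy_pick_set (vec, len, false)->is_compare;
}

static bool
toy_compare_last (const struct toy_set *vec, size_t len)
{
  return toy_pick_set (vec, len, true)->is_compare;
}
```

With a two-element vector, the "first" wrapper accepts the layout where the compare comes first and the "last" wrapper accepts the swapped layout, which is the distinction the new patterns rely on.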
Re: [PATCH, Fortran] PR 80668: wrong error message with -finit-derived
> (1) you don’t need the lines (as in "please don’t use them") > > + > +! { dg-final { cleanup-modules "pr80668" } } I will remove the "dg-final" lines. > > (2) I don’t understand the comments after 'INTEGER,DIMENSION(:),POINTER :: > dist’. (2) Sorry, the comments were meant to show the error that would appear if the testcase is regressed; i.e., the error that appeared in the original PR report. If it is too esoteric I can just remove the comments. Version 2 attached. OK? --- Fritz Reese 2017-05-09 Fritz Reese PR fortran/80668 gcc/fortran/ChangeLog: PR fortran/80668 * expr.c (component_initializer): Don't generate initializers for pointer components. * invoke.texi (-finit-derived): Document. gcc/testsuite/ChangeLog: PR fortran/80668 * gfortran.dg/pr80668.f90: New. From 10a0bdcf222db45208242d0530bc2c69530787c0 Mon Sep 17 00:00:00 2001 From: Fritz Reese Date: Tue, 9 May 2017 12:13:06 -0400 Subject: [PATCH] 2017-05-09 Fritz Reese PR fortran/80668 gcc/fortran/ * expr.c (component_initializer): Don't generate initializers for pointer components. * invoke.texi (-finit-derived): Document. PR fortran/80668 * gcc/testsuite/gfortran.dg/pr80668.f90: New. --- gcc/fortran/expr.c| 8 ++-- gcc/fortran/invoke.texi | 2 ++ gcc/testsuite/gfortran.dg/pr80668.f90 | 29 + 3 files changed, 37 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/pr80668.f90 diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c index c8be9513af5..7ea9d8233a9 100644 --- a/gcc/fortran/expr.c +++ b/gcc/fortran/expr.c @@ -4280,9 +4280,13 @@ component_initializer (gfc_typespec *ts, gfc_component *c, bool generate) { gfc_expr *init = NULL; - /* See if we can find the initializer immediately. */ + /* See if we can find the initializer immediately. + Some components should never get initializers. 
*/ if (c->initializer || !generate - || (ts->type == BT_CLASS && !c->attr.allocatable)) + || (ts->type == BT_CLASS && !c->attr.allocatable) + || c->attr.pointer + || c->attr.class_pointer + || c->attr.proc_pointer) return c->initializer; /* Recursively handle derived type components. */ diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi index 636432fead8..8a1d09dd5e5 100644 --- a/gcc/fortran/invoke.texi +++ b/gcc/fortran/invoke.texi @@ -1665,6 +1665,8 @@ according to these flags only with @option{-finit-derived}. These options do not initialize @itemize @bullet @item +objects with the POINTER attribute +@item allocatable arrays @item variables that appear in an @code{EQUIVALENCE} statement. diff --git a/gcc/testsuite/gfortran.dg/pr80668.f90 b/gcc/testsuite/gfortran.dg/pr80668.f90 new file mode 100644 index 000..63bd0d3d5cd --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr80668.f90 @@ -0,0 +1,29 @@ +! { dg-do compile } +! { dg-options "-finit-derived -finit-integer=12345678" } +! +! PR fortran/80668 +! +! Test a regression where structure constructor expressions were created for +! POINTER components with -finit-derived. +! + +MODULE pr80668 + IMPLICIT NONE + TYPE :: dist_t + INTEGER :: TYPE,nblks_loc,nblks + INTEGER,DIMENSION(:),POINTER :: dist + END TYPE dist_t + +CONTAINS + + SUBROUTINE hfx_new() +TYPE(dist_t) :: dist +integer,pointer :: bob +CALL release_dist(dist, bob) + END SUBROUTINE hfx_new + + SUBROUTINE release_dist(dist,p) +TYPE(dist_t) :: dist +integer, pointer, intent(in) :: p + END SUBROUTINE release_dist +END MODULE -- 2.12.2
Re: [Patch, Fortran, OOP] PR 79311: ICE in generate_finalization_wrapper, at fortran/class.c:1992
On 05/08/2017 02:16 PM, Janus Weil wrote: > Hi all, > > the attached patch fixes an ICE-on-valid problem with finalization by > making sure that the finalization procedures are properly resolved. > > In the test case, the finalizer of the component type was not being > resolved if the superordinate type had a finalizer itself. > > The patch also fixes a small error that had no actual impact on the > test case ('has_final2' could never become true). > > Regtesting went well, except for these three failures, which also seem > to occur on a clean trunk currently: > > FAIL: gfortran.dg/coarray_lock_7.f90 -O scan-tree-dump-times original > FAIL: gfortran.dg/coarray_lock_7.f90 -O scan-tree-dump-times original > FAIL: gfortran.dg/mvbits_7.f90 -O0 (test for warnings, line 28) > > Ok for trunk? > > Cheers, > Janus > Yes Ok and thanks for the patch. The lock_7 is fixed today and I did not see the mvbits fail yesterday. Regardless, unrelated. Cheers, Jerry
Re: [PATCH, Fortran] PR 80668: wrong error message with -finit-derived
On 05/09/2017 08:37 AM, Fritz Reese wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80668 > > All, > > The following fixes a bug exposed in PR 80668 in which the compiler > attempted to generate an initializer for components of derived types > with the POINTER attribute when compiling with -finit-derived and > -finit-*=*. It is nonsensical for components with the POINTER > attribute to have initializers generated in this way. The resolution > phase caught the resulting contradictions, resulting in an error for > the testcase given in the PR. > > With the patch all regression tests pass except for the following, > which already fail in the current trunk (r247800): > > Running /data/midas/foreese/src/gcc-dev/gcc/testsuite/gfortran.dg/dg.exp ... > FAIL: gfortran.dg/coarray_lock_7.f90 -O scan-tree-dump-times > original "_gfortran_caf_lock \\(caf_token.., \\(3 - > \\(integer\\(kind=4\\)\\) parm...dim\\[0\\].lbound\\) \\+ > \\(integer\\(kind=4\\)\\) MAX_EXPR <\\(parm...dim\\[0\\].ubound - > parm...dim\\[0\\].lbound\\) \\+ 1, 0> \\* \\(3 - > \\(integer\\(kind=4\\)\\) parm...dim\\[1\\].lbound\\), 0, 0B, &ii, 0B, > 0\\);|_gfortran_caf_lock \\(caf_token.1, \\(3 - > parm...dim\\[0\\].lbound\\) \\+ MAX_EXPR <\\(parm...dim\\[0\\].ubound > - parm...dim\\[0\\].lbound\\) \\+ 1, 0> \\* \\(3 - > parm...dim\\[1\\].lbound\\), 0, 0B, &ii, 0B, 0\\);" 1 > FAIL: gfortran.dg/coarray_lock_7.f90 -O scan-tree-dump-times > original "_gfortran_caf_unlock \\(caf_token.., \\(2 - > \\(integer\\(kind=4\\)\\) parm...dim\\[0\\].lbound\\) \\+ > \\(integer\\(kind=4\\)\\) MAX_EXPR <\\(parm...dim\\[0\\].ubound - > parm...dim\\[0\\].lbound\\) \\+ 1, 0> \\* \\(3 - > \\(integer\\(kind=4\\)\\) parm...dim\\[1\\].lbound\\), 0, &ii, 0B, > 0\\);|_gfortran_caf_unlock \\(caf_token.., \\(2 - > parm...dim\\[0\\].lbound\\) \\+ MAX_EXPR <\\(parm...dim\\[0\\].ubound > - parm...dim\\[0\\].lbound\\) \\+ 1, 0> \\* \\(3 - > parm...dim\\[1\\].lbound\\), 0, &ii, 0B, 0\\);" 1 > FAIL: gfortran.dg/mvbits_7.f90 -O0 (test 
for warnings, line 28) > > > OK for trunk? > Looks OK, thanks, Jerry
Re: [PATCH, GCC/ARM, Stage 1] Enable Purecode for ARMv8-M Baseline
Hi, On 4 May 2017 at 11:05, Ramana Radhakrishnan wrote: > On Thu, May 04, 2017 at 09:50:42AM +0100, Prakhar Bahuguna wrote: > > >> > >> > Otherwise ok. Please respin and test with an armhf thumb32 bootstrap >> > and regression test run. >> > >> > regards >> > Ramana >> >> I've respun this patch with the suggested changes, along with a new changelog >> for docs: > > And tested hopefully with a bootstrap and regression test run on armhf on > GNU/Linux ? > > Ok if no regressions. > I've noticed regressions :( - PASS now FAIL [PASS => FAIL]: Executed from: gcc.target/arm/arm.exp gcc.target/arm/scd42-2.c scan-assembler mov[ \t].*272 on arm* targets generating arm code as opposed to thumb and in g++: Executed from: g++.dg/torture/dg-torture.exp g++.dg/torture/vshuf-v4si.C -O2 (internal compiler error) g++.dg/torture/vshuf-v4si.C -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error) g++.dg/torture/vshuf-v4si.C -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (internal compiler error) when using -march=armv5t or using the default cpu/fpu/mode settings. For more details, see: http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/247585/report-build-info.html Thanks, Christophe > regards > Ramana > >> >> doc/ChangeLog: >> >> 2017-01-11 Prakhar Bahuguna >> Andre Simoes Dias Vieira >> >> * invoke.texi (-mpure-code): Change "ARMv7-M targets" for >> "M-profile targets with the MOVT instruction". >> >> -- >> >> Prakhar Bahuguna > >> From e0f62c9919ceb9cfc6b4cc49615fb7188ae50519 Mon Sep 17 00:00:00 2001 >> From: Prakhar Bahuguna >> Date: Wed, 15 Mar 2017 10:25:03 + >> Subject: [PATCH] Enable Purecode for ARMv8-M Baseline. 
>> >> --- >> gcc/config/arm/arm.c | 78 >> ++ >> gcc/config/arm/arm.md | 6 +- >> gcc/doc/invoke.texi| 3 +- >> .../gcc.target/arm/pure-code/pure-code.exp | 5 +- >> 4 files changed, 58 insertions(+), 34 deletions(-) >> >> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c >> index 83914913433..e0a7cabcb2e 100644 >> --- a/gcc/config/arm/arm.c >> +++ b/gcc/config/arm/arm.c >> @@ -2833,16 +2833,16 @@ arm_option_check_internal (struct gcc_options *opts) >>flag_pic = 0; >> } >> >> - /* We only support -mslow-flash-data on armv7-m targets. */ >> - if (target_slow_flash_data >> - && ((!(arm_arch7 && !arm_arch_notm) && !arm_arch7em) >> - || (TARGET_THUMB1_P (flags) || flag_pic || TARGET_NEON))) >> -error ("-mslow-flash-data only supports non-pic code on armv7-m >> targets"); >> - >> - /* We only support pure-code on Thumb-2 M-profile targets. */ >> - if (target_pure_code >> - && (!arm_arch_thumb2 || arm_arch_notm || flag_pic || TARGET_NEON)) >> -error ("-mpure-code only supports non-pic code on armv7-m targets"); >> + /* We only support -mpure-code and -mslow-flash-data on M-profile targets >> + with MOVT. */ >> + if ((target_pure_code || target_slow_flash_data) >> + && (!TARGET_HAVE_MOVT || arm_arch_notm || flag_pic || TARGET_NEON)) >> +{ >> + const char *flag = (target_pure_code ? "-mpure-code" : >> + "-mslow-flash-data"); >> + error ("%s only supports non-pic code on M-profile targets with the " >> + "MOVT instruction", flag); >> +} >> >> } >> >> @@ -4077,7 +4077,7 @@ const_ok_for_arm (HOST_WIDE_INT i) >> || (i & ~0xfc03) == 0)) >> return TRUE; >> } >> - else >> + else if (TARGET_THUMB2) >> { >>HOST_WIDE_INT v; >> >> @@ -4093,6 +4093,14 @@ const_ok_for_arm (HOST_WIDE_INT i) >>if (i == v) >> return TRUE; >> } >> + else if (TARGET_HAVE_MOVT) >> +{ >> + /* Thumb-1 Targets with MOVT. 
*/ >> + if (i > 0x) >> + return FALSE; >> + else >> + return TRUE; >> +} >> >>return FALSE; >> } >> @@ -7736,6 +7744,32 @@ arm_legitimate_address_outer_p (machine_mode mode, >> rtx x, RTX_CODE outer, >>return 0; >> } >> >> +/* Return true if we can avoid creating a constant pool entry for x. */ >> +static bool >> +can_avoid_literal_pool_for_label_p (rtx x) >> +{ >> + /* Normally we can assign constant values to target registers without >> + the help of constant pool. But there are cases we have to use constant >> + pool like: >> + 1) assign a label to register. >> + 2) sign-extend a 8bit value to 32bit and then assign to register. >> + >> + Constant pool access in format: >> + (set (reg r0) (mem (symbol_ref (".LC0" >> + will cause the use of literal pool (later in function arm_reorg). >> + So here we mark such format as an invalid format, then the compiler >> + will adjust it into: >> + (set (reg r0) (symbol_ref (".LC0"))) >> + (set (reg r0) (mem (reg r0))). >> + No extra register is required,
Re: [PATCH, Fortran] PR 80668: wrong error message with -finit-derived
> On 9 May 2017 at 18:15, Fritz Reese wrote: > >> (1) you don’t need the lines (as in "please don’t use them") >> >> + >> +! { dg-final { cleanup-modules "pr80668" } } > > I will remove the "dg-final" lines. > >> >> (2) I don’t understand the comments after 'INTEGER,DIMENSION(:),POINTER :: >> dist’. > > (2) Sorry, the comments were meant to show the error that would appear > if the testcase is regressed; i.e., the error that appeared in the > original PR report. If it is too esoteric I can just remove the > comments. > > Version 2 attached. OK? > > --- > Fritz Reese > OK for me. Dominique
Re: [Patch, Fortran, OOP] PR 79311: ICE in generate_finalization_wrapper, at fortran/class.c:1992
On Tue, May 09, 2017 at 09:18:42AM -0700, Jerry DeLisle wrote: > > Yes Ok and thanks for the patch. The lock_7 is fixed today and I did not see the > mvbits fail yesterday. Regardless, unrelated. > mvbits_7 is a victim of stupid by non-Fortran committers. /home/sgk/gcc/gccx/gcc/testsuite/gfortran.dg/mvbits_7.f90:28:0: \ Warning: '__builtin_memcpy' reading 192 bytes from a region of \ size 144 [-Wstringop-overflow=] This might be fixed by https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80669 -- Steve 20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4 20161221 https://www.youtube.com/watch?v=IbCHE-hONow
Re: [patch, fortran] Reduce stack use in blocked matmul
On 09.05.2017 at 12:43, Andreas Schwab wrote: On May 05 2017, Thomas Koenig wrote: @@ -227,6 +226,17 @@ sinclude(`matmul_asm_'rtype_code`.m4')dnl if (m == 0 || n == 0 || k == 0) return; + /* Adjust size of t1 to what is needed. */ + index_type t1_dim; + t1_dim = (a_dim1-1) * 256 + b_dim1; + if (t1_dim > 65536) + t1_dim = 65536; What happens if (a_dim1-1) * 256 + b_dim1 > 65536? t1 is an auxiliary variable for blocking. If that condition is true, blocking starts to happen. Regards Thomas
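The size computation discussed above can be tried in isolation. The following is a minimal sketch that lifts the arithmetic from the quoted hunk into a standalone function; the function name blocked_t1_dim and the use of long instead of index_type are assumptions made for illustration, since the real code computes t1_dim inline.

```c
#include <assert.h>

/* Hypothetical helper mirroring the quoted hunk: size the auxiliary
   buffer t1 to what the operands need, but never beyond 65536
   elements (256 * 256, the size of the former fixed [256][256] array).  */
static long
blocked_t1_dim (long a_dim1, long b_dim1)
{
  long t1_dim = (a_dim1 - 1) * 256 + b_dim1;

  /* Once the operands need more than the cap, the buffer stops growing
     and, per the reply above, blocking takes over.  */
  if (t1_dim > 65536)
    t1_dim = 65536;

  return t1_dim;
}
```

For small matrices this yields a buffer far smaller than 65536 elements, which is where the stack saving in the patch title comes from; for large ones the result is simply the old fixed size.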
Re: [PATCH, rs6000] Add x86 intrinsic headers to GCC PPC64LE target
Hi! On Mon, May 08, 2017 at 09:49:57AM -0500, Steven Munroe wrote: > Thus I would like to restrict this support to PowerPC > targets that support VMX/VSX and PowerISA-2.07 (power8) and later. What happens if you run it on an older machine, or as BE or 32-bit, or with vectors disabled? > So I propose to submit a series of patches to implement the PowerPC64LE > equivalent of a useful subset of the x86 intrinsics. The final size and > usefulness of this effort is to be determined. The proposal is to > incrementally port intrinsic header files from the ./config/i386 tree to > the ./config/rs6000 tree. This naturally provides the same header > structure and intrinsic names which will simplify code porting. Yeah. I'd still like to see these headers moved into some subdir (both in the source tree and in the installed headers tree), to reduce clutter, but I understand it's not trivial to do. > To get the ball rolling I include the BMI intrinsics ported to PowerPC > for review as they are reasonable size (31 intrinsic implementations). This is okay for trunk. Thanks! > --- gcc/config.gcc(revision 247616) > +++ gcc/config.gcc(working copy) > @@ -444,7 +444,7 @@ nvptx-*-*) > ;; > powerpc*-*-*) > cpu_type=rs6000 > - extra_headers="ppc-asm.h altivec.h spe.h ppu_intrinsics.h paired.h > spu2vmx.h vec_types.h si2vmx.h htmintrin.h htmxlintrin.h" > + extra_headers="ppc-asm.h altivec.h spe.h ppu_intrinsics.h paired.h > spu2vmx.h vec_types.h si2vmx.h htmintrin.h htmxlintrin.h bmi2intrin.h > bmiintrin.h x86intrin.h" (Your mail client wrapped this). Write this on a separate line? Like extra_headers="${extra_headers} htmintrin.h htmxlintrin.h bmi2intrin.h" (You cannot use += here, pity). Segher
[patch, libfortran] Fix PR 80687l build failure on nvptx
Hello world, the attached patch hopefully fixes the build failure on nvptx introduced by my recent matmul library patch. It uses malloc/free if VLAs do not work. Thomas S., does this fix the problem? Tested on x86_64 to make sure that the matmul tests still pass; full regression test still in progress. OK for trunk if the nvptx problem is fixed and regression-tests pass? Regards Thomas 2017-05-09 Thomas Koenig PR fortran/80687 * acinclude.m4 (LIBGFOR_CHECK_VLA): New macro. * configure.ac: Use it. * config.h.in: Regenerated. * configure: Regenerated. * m4/matmul_internal.m4: 'matmul_name`: Use malloc/free instead of VLA if HAVE_VLA is undefined. t1 to a VLA of the required size. * generated/matmul_c10.c: Regenerated. * generated/matmul_c16.c: Regenerated. * generated/matmul_c4.c: Regenerated. * generated/matmul_c8.c: Regenerated. * generated/matmul_i1.c: Regenerated. * generated/matmul_i16.c: Regenerated. * generated/matmul_i2.c: Regenerated. * generated/matmul_i4.c: Regenerated. * generated/matmul_i8.c: Regenerated. * generated/matmul_r10.c: Regenerated. * generated/matmul_r16.c: Regenerated. * generated/matmul_r4.c: Regenerated. * generated/matmul_r8.c: Regenerated. Index: acinclude.m4 === --- acinclude.m4 (Revision 247566) +++ acinclude.m4 (Arbeitskopie) @@ -452,3 +452,20 @@ AC_DEFUN([LIBGFOR_CHECK_AVX512F], [ []) CFLAGS="$ac_save_CFLAGS" ]) + +dnl Check if VLAs work + +AC_DEFUN([LIBGFOR_CHECK_VLA], [ + ac_save_CFLAGS="$CFLAGS" + CFLAGS="-Wno-vla" + AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ + void foo(int n) + { + int a[n]; +}]], [[]])], + AC_DEFINE(HAVE_VLA, 1, + [Define if VLAs can be compiled]), + []) + CFLAGS="$ac_save_CFLAGS" +]) + Index: config.h.in === --- config.h.in (Revision 247566) +++ config.h.in (Arbeitskopie) @@ -807,6 +807,9 @@ /* Define to 1 if you have the `uselocale' function. */ #undef HAVE_USELOCALE +/* Define if VLAs can be compiled */ +#undef HAVE_VLA + /* Define to 1 if you have the `vsnprintf' function. 
*/ #undef HAVE_VSNPRINTF Index: configure === --- configure (Revision 247566) +++ configure (Arbeitskopie) @@ -26363,6 +26363,34 @@ rm -f core conftest.err conftest.$ac_objext confte CFLAGS="$ac_save_CFLAGS" +# Check wether VLAs work + + ac_save_CFLAGS="$CFLAGS" + CFLAGS="-Wno-vla" + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + + void foo(int n) + { + int a[n]; +} +int +main () +{ + + ; + return 0; +} +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + +$as_echo "#define HAVE_VLA 1" >>confdefs.h + +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext + CFLAGS="$ac_save_CFLAGS" + + # Determine what GCC version number to use in filesystem paths. get_gcc_base_ver="cat" Index: configure.ac === --- configure.ac (Revision 247566) +++ configure.ac (Arbeitskopie) @@ -624,6 +624,9 @@ LIBGFOR_CHECK_AVX2 # Check wether we support AVX512f extensions LIBGFOR_CHECK_AVX512F +# Check wether VLAs work +LIBGFOR_CHECK_VLA + # Determine what GCC version number to use in filesystem paths. GCC_BASE_VER Index: generated/matmul_c10.c === --- generated/matmul_c10.c (Revision 247753) +++ generated/matmul_c10.c (Arbeitskopie) @@ -316,11 +316,15 @@ matmul_c10_avx (gfc_array_c10 * const restrict ret if (t1_dim > 65536) t1_dim = 65536; +#ifdef HAVE_VLA #pragma GCC diagnostic push #pragma GCC diagnostic ignored "-Wvla" GFC_COMPLEX_10 t1[t1_dim]; /* was [256][256] */ #pragma GCC diagnostic pop - +#else + GFC_COMPLEX_10 *t1; + t1 = malloc (t1_dim * sizeof(GFC_COMPLEX_10)); +#endif /* Empty c first. 
*/ for (j=1; j<=n; j++) for (i=1; i<=m; i++) @@ -535,6 +539,9 @@ matmul_c10_avx (gfc_array_c10 * const restrict ret } } } +#ifndef HAVE_VLA + free(t1); +#endif return; } else if (rxstride == 1 && aystride == 1 && bxstride == 1) @@ -869,11 +876,15 @@ matmul_c10_avx2 (gfc_array_c10 * const restrict re if (t1_dim > 65536) t1_dim = 65536; +#ifdef HAVE_VLA #pragma GCC diagnostic push #pragma GCC diagnostic ignored "-Wvla" GFC_COMPLEX_10 t1[t1_dim]; /* was [256][256] */ #pragma GCC diagnostic pop - +#else + GFC_COMPLEX_10 *t1; + t1 = malloc (t1_dim * sizeof(GFC_COMPLEX_10)); +#endif /* Empty c first. */ for (j=1; j<=n; j++) for (i=1; i<=m; i++) @@ -1088,6 +1099,9 @@ matmul_c10_avx2 (gfc_array_c10 * const restrict re } } } +#ifndef HAVE_VLA + free(t1); +#endif return; } else
Re: [patch, libfortran] Fix PR 80687l build failure on nvptx
On Tue, May 9, 2017 at 8:35 PM, Thomas Koenig wrote: > Hello world, > > the attached patch hopefully fixes the build failure on nvptx introduced > by my recent matmul library patch. It uses malloc/free if VLAs do not > work. > > Thomas S., does this fix the problem? > > Tested on x86_64 to make sure that the matmul tests still pass; > full regression test still in progress. > > OK for trunk if the nvptx problem is fixed and regression-tests pass? I'd prefer to get rid of VLA's altogether. Which is why I added -Werror=vla in https://gcc.gnu.org/ml/fortran/2014-11/msg00052.html . In that message I also helpfully point out that avoiding VLA's can help limited targets like nvptx that lack VLA's.. :) VLA's don't come for free. Functions using VLA's require an extra register for the frame pointer, and AFAIK can't be inlined. Also, calculating variable offsets might not be possible at compile time as the offset from the stack/frame pointer might not be known statically. -- Janne Blomqvist
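To make the two positions in this thread concrete, here is a minimal sketch of the fallback shape used in the patch: a scratch buffer that is a VLA when HAVE_VLA is defined and a malloc/free pair otherwise. Only the HAVE_VLA macro name comes from the patch; the function sum_squares and its body are hypothetical and exist purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical example: fill a scratch buffer of N elements and sum it.
   Returns -1 if the heap allocation fails.  */
static long
sum_squares (int n)
{
  assert (n > 0);

#ifdef HAVE_VLA
  /* Stack-allocated scratch space, sized at run time.  */
  long scratch[n];
#else
  /* Heap-allocated scratch space for targets without working VLAs.  */
  long *scratch = malloc ((size_t) n * sizeof (long));
  if (scratch == NULL)
    return -1;
#endif

  long total = 0;
  for (int i = 0; i < n; i++)
    scratch[i] = (long) i * i;
  for (int i = 0; i < n; i++)
    total += scratch[i];

#ifndef HAVE_VLA
  free (scratch);
#endif
  return total;
}
```

Dropping the HAVE_VLA branch entirely and keeping only the malloc/free path would correspond to the approach preferred in the reply above, at the cost of an allocation per call.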
Re: [1/2] PR 78736: New warning -Wenum-conversion
On 05/09/2017 07:24 AM, Prathamesh Kulkarni wrote: ping https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00161.html Thanks, Prathamesh On 3 May 2017 at 11:30, Prathamesh Kulkarni wrote: On 3 May 2017 at 03:28, Martin Sebor wrote: On 05/02/2017 11:11 AM, Prathamesh Kulkarni wrote: Hi, The attached patch attempts to add the option -Wenum-conversion for C and Objective-C similar to clang, which warns when an enum value of a type is implicitly converted to enum value of another type and is enabled by -Wall. It seems quite useful. My only high-level concern is with the growing number of specialized warnings and options for each and their interaction. I've been working on a -Wenum-assign patch that complains about assigning to an enum variable an integer constant that doesn't match any of the enumerators of the type. Testing revealed that the -Wenum-assign duplicated a subset of warnings already issued by -Wconversion enabled with -Wpedantic. I'm debating whether to suppress that part of -Wenum-assign altogether or only when -Wconversion and -Wpedantic are enabled. My point is that these dependencies tend to be hard to discover and understand, and the interactions tricky to get right (e.g., avoid duplicate warnings for similar but distinct problems). This is not meant to be a negative comment on your patch, but rather a comment about a general problem that might be worth starting to think about. One comment on the patch itself: + warning_at_rich_loc (&loc, 0, "implicit conversion from" + " enum type of %qT to %qT", checktype, type); Unlike C++, the C front end formats an enumerated type E using %qT as 'enum E' so the warning prints 'enum type of 'enum E'', duplicating the "enum" part. I would suggest simplifying that to: warning_at_rich_loc (&loc, 0, "implicit conversion from " "%qT to %qT", checktype, ... Thanks for the suggestions. I have updated the patch accordingly. Hmm the issue you pointed out of warnings interaction is indeed of concern.
I was wondering then if we should merge this warning with -Wconversion instead of having a separate option -Wenum-conversion ? Although that will not really help with your example below. Martin PS As an example to illustrate my concern above, consider this: enum __attribute__ ((packed)) E { e1 = 1 }; enum F { f256 = 256 }; enum E e = f256; It triggers -Woverflow: warning: large integer implicitly truncated to unsigned type [-Woverflow] enum E e = f256; ^~~~ also my -Wenum-assign: warning: integer constant ‘256’ converted to ‘0’ due to limited range [0, 255] of type ‘‘enum E’’ [-Wassign-enum] enum E e = f256; ^~~~ and (IIUC) will trigger your new -Wenum-conversion. Yep, on my branch it triggered -Woverflow and -Wenum-conversion. Running the example on clang shows a single warning, which they call as -Wconstant-conversion, which I suppose is similar to your -Wassign-enum. -Wassign-enum is a Clang warning too, it just isn't included in either -Wall or -Wextra. It warns when a constant is assigned to a variable of an enumerated type and is not representable in it. I enhanced it for GCC to also warn when the constant doesn't correspond to an enumerator in the type, but I'm starting to think that rather than adding yet another option to GCC it might be better to extend your -Wenum-conversion once it's committed to cover those cases (and also to avoid issuing multiple warnings for essentially the same problem). Let me ponder that some more. I can't approve patches but it looks good to me for the most part. There is one minor issue that needs to be corrected: + gcc_rich_location loc (location); + warning_at_rich_loc (&loc, 0, "implicit conversion from" + " %qT to %qT", checktype, type); Here the zero should be replaced with OPT_Wenum_conversion, otherwise the warning option won't be included in the message. 
Martin test-eg.c:3:12: warning: implicit conversion from 'int' to 'enum E' changes value from 256 to 0 [-Wconstant-conversion] enum E e = f256; ~ ^~~~ Thanks, Prathamesh Martin
Re: [PATCH, rs6000] Fix vec_xl and vec_xst intrinsics for P8
On May 9, 2017, at 9:58 AM, Segher Boessenkool wrote: > > Hi! > > On Thu, May 04, 2017 at 04:35:10PM -0500, Bill Schmidt wrote: >> In an earlier patch, I changed vec_xl and vec_xst to make use of new >> POWER9 instructions when loading or storing vector short/char values. >> In so doing, I failed to enable the existing instruction use for >> -mcpu=power8, so these were no longer considered valid by the compiler. >> Not good. >> >> This patch fixes the problem by using other existing built-in definitions >> when the POWER9 instructions are not available. I've added a test case >> to improve coverage and demonstrate that the problem is fixed. >> >> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no >> regressions. Is this ok for trunk? > > Yes, thanks! One nit: > >> --- gcc/config/rs6000/rs6000.c (revision 247560) >> +++ gcc/config/rs6000/rs6000.c (working copy) >> @@ -18183,6 +18183,17 @@ altivec_init_builtins (void) >> def_builtin ("__builtin_vsx_st_elemrev_v16qi", >> void_ftype_v16qi_long_pvoid, VSX_BUILTIN_ST_ELEMREV_V16QI); >> } >> + else >> +{ >> + rs6000_builtin_decls[(int)VSX_BUILTIN_LD_ELEMREV_V8HI] >> += rs6000_builtin_decls[(int)VSX_BUILTIN_LXVW4X_V8HI]; > > There should be a space after the cast operators. OK, will fix. Thanks for the review! I forgot to ask -- this fix is needed for GCC 6 and 7 as well. Is this ok for backport after the usual burn-in? Thanks, Bill > > > Segher >
[PATCH] make RTL/TREE/IPA dump kind an index
Currently, the TDF_foo flags serve three purposes: 1) what kind of dump, 2) how detailed to print it, 3) auxiliary message control. This patch addresses #1, which currently uses a bit mask of TDF_{TREE,RTL,IPA}, of which exactly one must be set. The patch changes things so that these are now an index value (I hesitate to say enumeration, because they're still raw ints). A TDF_KIND(X) accessor extracts this value. (I left the spare bit between the TDF_KIND_MASK and TDF_ADDRESS for the moment.) In addition I added 'TDF_LANG' for language-specific dump control, which is the kind that -fdump-translation-unit and -fdump-class-hierarchy now become. These can also be controlled by -fdump-lang-all (rather than -fdump-tree-all). The next step will be to move -fdump-class-hierarchy into a more generic structure (and -fdump-translation-unit, if my patch to remove it is not accepted). ok? nathan -- Nathan Sidwell 2017-05-09 Nathan Sidwell * dumpfile.h (TDI_lang_all): New. (TDF_KIND): New. Renumber others. (TDF_LANG, TDF_TREE, TDF_RTL, TDF_IPA): Enumerate value, rather than bits. * dumpfile.c (dump_files): Mark language dumps as TDF_LANG. Add lang-all. (get_dump_file_name): Adjust suffix generation. (dump_enable_all): Use TDF_KIND. * doc/invoke.texi (-fdump-lang-all): Document. Index: doc/invoke.texi === --- doc/invoke.texi (revision 247809) +++ doc/invoke.texi (working copy) @@ -543,6 +543,7 @@ Objective-C and Objective-C++ Dialects}. -fdump-class-hierarchy@r{[}-@var{n}@r{]} @gol -fdump-final-insns@r{[}=@var{file}@r{]} -fdump-ipa-all -fdump-ipa-cgraph -fdump-ipa-inline @gol +-fdump-lang-all @gol -fdump-passes @gol -fdump-rtl-@var{pass} -fdump-rtl-@var{pass}=@var{filename} @gol -fdump-statistics @gol @@ -12970,6 +12971,10 @@ Dump after function inlining. @end table +@item -fdump-lang-all +@opindex fdump-lang-all +Control the dumping of language-specific information. 
+ @item -fdump-passes @opindex fdump-passes Print on @file{stderr} the list of optimization passes that are turned Index: dumpfile.c === --- dumpfile.c (revision 247809) +++ dumpfile.c (working copy) @@ -57,9 +57,9 @@ static struct dump_file_info dump_files[ 0, 0, 0, 0, 0, false, false}, {".ipa-clones", "ipa-clones", NULL, NULL, NULL, NULL, NULL, TDF_IPA, 0, 0, 0, 0, 0, false, false}, - {".tu", "translation-unit", NULL, NULL, NULL, NULL, NULL, TDF_TREE, + {".tu", "translation-unit", NULL, NULL, NULL, NULL, NULL, TDF_LANG, 0, 0, 0, 0, 1, false, false}, - {".class", "class-hierarchy", NULL, NULL, NULL, NULL, NULL, TDF_TREE, + {".class", "class-hierarchy", NULL, NULL, NULL, NULL, NULL, TDF_LANG, 0, 0, 0, 0, 2, false, false}, {".original", "tree-original", NULL, NULL, NULL, NULL, NULL, TDF_TREE, 0, 0, 0, 0, 3, false, false}, @@ -69,6 +69,8 @@ static struct dump_file_info dump_files[ 0, 0, 0, 0, 5, false, false}, #define FIRST_AUTO_NUMBERED_DUMP 6 + {NULL, "lang-all", NULL, NULL, NULL, NULL, NULL, TDF_LANG, + 0, 0, 0, 0, 0, false, false}, {NULL, "tree-all", NULL, NULL, NULL, NULL, NULL, TDF_TREE, 0, 0, 0, 0, 0, false, false}, {NULL, "rtl-all", NULL, NULL, NULL, NULL, NULL, TDF_RTL, @@ -115,7 +117,7 @@ static const struct dump_option_value_in {"missed", MSG_MISSED_OPTIMIZATION}, {"note", MSG_NOTE}, {"optall", MSG_ALL}, - {"all", ~(TDF_RAW | TDF_SLIM | TDF_LINENO | TDF_TREE | TDF_RTL | TDF_IPA + {"all", ~(TDF_KIND_MASK | TDF_RAW | TDF_SLIM | TDF_LINENO | TDF_STMTADDR | TDF_GRAPH | TDF_DIAGNOSTIC | TDF_VERBOSE | TDF_RHS_ONLY | TDF_NOUID | TDF_ENUMERATE_LOCALS | TDF_SCEV | TDF_GIMPLE)}, @@ -282,15 +284,11 @@ get_dump_file_name (struct dump_file_inf dump_id[0] = '\0'; else { - char suffix; - if (dfi->pflags & TDF_TREE) - suffix = 't'; - else if (dfi->pflags & TDF_IPA) - suffix = 'i'; - else - suffix = 'r'; - - if (snprintf (dump_id, sizeof (dump_id), ".%03d%c", dfi->num, suffix) < 0) + /* LANG, TREE, RTL, IPA. 
*/ + char suffix = "ltri"[TDF_KIND (dfi->pflags)]; + + if (snprintf (dump_id, sizeof (dump_id), ".%03d%c", dfi->num, suffix) + < 0) dump_id[0] = '\0'; } @@ -657,13 +655,13 @@ int gcc::dump_manager:: dump_enable_all (int flags, const char *filename) { - int ir_dump_type = (flags & (TDF_TREE | TDF_RTL | TDF_IPA)); + int ir_dump_type = TDF_KIND (flags); int n = 0; size_t i; for (i = TDI_none + 1; i < (size_t) TDI_end; i++) { - if ((dump_files[i].pflags & ir_dump_type)) + if (TDF_KIND (dump_files[i].pflags) == ir_dump_type) { const char *old_filename = dump_files[i].pfilename; dump_files[i].pstate = -1; @@ -684,7 +682,7 @@ dump_enable_all (int flags, const char * for (i = 0; i < m_extra_dump_files_in_use; i++) { - if ((m_extra_dump_files[i].pflags & ir_dump_type)) + if (TDF_KIND (m_extra_dump_fil
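For readers following the patch, the switch from a bit mask to an index can be sketched in isolation. This is only an illustration: the numeric values below are hypothetical (the dumpfile.h hunk is not part of the quoted diff), and TDF_EXAMPLE_DETAIL, dump_kind_suffix, format_dump_id and same_dump_kind are made-up names, not GCC functions.

```c
#include <stdio.h>

/* Hypothetical values: the real definitions live in dumpfile.h, which is
   not shown in the quoted part of the patch.  */
#define TDF_KIND_MASK 0x3
#define TDF_KIND(X)   ((X) & TDF_KIND_MASK)
#define TDF_LANG 0    /* language-specific dump */
#define TDF_TREE 1    /* tree dump */
#define TDF_RTL  2    /* RTL dump */
#define TDF_IPA  3    /* IPA dump */
#define TDF_EXAMPLE_DETAIL (1 << 3)   /* made-up flag outside the kind field */

/* Suffix letter for a dump file, indexed by kind: LANG, TREE, RTL, IPA.  */
static char
dump_kind_suffix (int flags)
{
  return "ltri"[TDF_KIND (flags)];
}

/* Format ".NNNs" into BUF, leaving an empty string if snprintf fails,
   in the same way the patched get_dump_file_name does.  */
static void
format_dump_id (char *buf, size_t size, int num, int flags)
{
  if (snprintf (buf, size, ".%03d%c", num, dump_kind_suffix (flags)) < 0)
    buf[0] = '\0';
}

/* With an index, kinds are compared for equality, not tested as bits.  */
static int
same_dump_kind (int flags_a, int flags_b)
{
  return TDF_KIND (flags_a) == TDF_KIND (flags_b);
}
```

Under these assumed values one kind is zero, so a test of the form (pflags & ir_dump_type) could never match it; that is presumably why dump_enable_all moves to an equality comparison on TDF_KIND.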
Re: [PATCH, rs6000] Add x86 intrinsic headers to GCC PPC64LE target
On Tue, 2017-05-09 at 12:23 -0500, Segher Boessenkool wrote: > Hi! > > On Mon, May 08, 2017 at 09:49:57AM -0500, Steven Munroe wrote: > > Thus I would like to restrict this support to PowerPC > > targets that support VMX/VSX and PowerISA-2.07 (power8) and later. > > What happens if you run it on an older machine, or as BE or 32-bit, > or with vectors disabled? > Well, I hope that I set the dg-require-effective-target correctly because while some of these intrinsics might work on a BE or 32-bit machine, most will not. For example, many of the BMI intrinsic implementations depend on 64-bit instructions and so I use { dg-require-effective-target lp64 }. The BMI2 intrinsic _pext exploits the Bit Permute Doubleword instruction. There is no Bit Permute Word instruction. So for BMI2 I use { dg-require-effective-target powerpc_vsx_ok } as bpermd was introduced in PowerISA 2.06 along with the Vector Scalar Extension facility. The situation gets more complicated when we start looking at SSE/SSE2. These headers define many variants of load and store instructions that are decidedly LE and many unaligned forms. While powerpc64le handles this with ease, implementing LE semantics in BE mode gets seriously tricky. I think it is better to avoid this and only support these headers for LE. And while some SSE intrinsics can be implemented with VMX instructions, all the SSE2 double float intrinsics require VSX. And some PowerISA 2.07 instructions simplify implementation if available. As power8 is also the first supported powerpc64le system, it seems the logical starting point for most of this work. I don't plan to spend effort on supporting Intel intrinsic functions on older PowerPC machines (before power8) or BE. > > So I propose to submit a series of patches to implement the PowerPC64LE > > equivalent of a useful subset of the x86 intrinsics. The final size and > > usefulness of this effort is to be determined. 
The proposal is to > > incrementally port intrinsic header files from the ./config/i386 tree to > > the ./config/rs6000 tree. This naturally provides the same header > > structure and intrinsic names which will simplify code porting. > > Yeah. > > I'd still like to see these headers moved into some subdir (both in > the source tree and in the installed headers tree), to reduce clutter, > but I understand it's not trivial to do. > > > To get the ball rolling I include the BMI intrinsics ported to PowerPC > > for review as they are reasonable size (31 intrinsic implementations). > > This is okay for trunk. Thanks! > Thank you > > --- gcc/config.gcc (revision 247616) > > +++ gcc/config.gcc (working copy) > > @@ -444,7 +444,7 @@ nvptx-*-*) > > ;; > > powerpc*-*-*) > > cpu_type=rs6000 > > - extra_headers="ppc-asm.h altivec.h spe.h ppu_intrinsics.h paired.h > > spu2vmx.h vec_types.h si2vmx.h htmintrin.h htmxlintrin.h" > > + extra_headers="ppc-asm.h altivec.h spe.h ppu_intrinsics.h paired.h > > spu2vmx.h vec_types.h si2vmx.h htmintrin.h htmxlintrin.h bmi2intrin.h > > bmiintrin.h x86intrin.h" > > (Your mail client wrapped this). > > Write this on a separate line? Like > extra_headers="${extra_headers} htmintrin.h htmxlintrin.h bmi2intrin.h" > (You cannot use += here, pity). > > > Segher >
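As a point of reference for the _pext remarks earlier in this message, the semantics of the x86 BMI2 bit-extract operation can be sketched in portable C. This is not the PowerPC implementation from the patch (which, per the message, relies on bpermd); pext64_ref is a hypothetical name used only for illustration.

```c
#include <stdint.h>

/* Portable reference for "parallel bit extract": gather the bits of SRC
   selected by MASK and pack them into the low-order bits of the result.  */
static uint64_t
pext64_ref (uint64_t src, uint64_t mask)
{
  uint64_t result = 0;
  uint64_t out_bit = 1;

  while (mask != 0)
    {
      uint64_t lowest = mask & (~mask + 1);   /* lowest set bit of MASK */
      if (src & lowest)
        result |= out_bit;
      out_bit <<= 1;
      mask &= mask - 1;                       /* clear that bit */
    }
  return result;
}
```

A loop like this needs only 64-bit integer arithmetic, so it could serve as a baseline when checking a target-specific implementation; it says nothing about how the ported header should perform.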