Re: [PATCH] Fix ipa-pure-const.c bug (PR ipa/77905)
On Mon, 12 Dec 2016, Jakub Jelinek wrote: > Hi! > > The local pure const pass notices the static ctor is looping const, > so sets TREE_READONLY and DECL_LOOPING_CONST_OR_PURE_P and does not > clear DECL_STATIC_CONSTRUCTOR bit because it is looping. > Then the ipa pure const pass notices the static ctor is non-looping const, > clears DECL_LOOPING_CONST_OR_PURE_P and DECL_STATIC_CONSTRUCTOR bit, > but because it has been TREE_READONLY already, it doesn't set > TODO_remove_functions from the pass and ipa-comdat pass assumes that > unreachable cgraph nodes have been pruned. > > Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for > trunk? Ok. Richard. > 2016-12-12 Jakub Jelinek > > PR ipa/77905 > * ipa-pure-const.c (cdtor_p): Return true for > DECL_STATIC_{CON,DE}STRUCTOR even when it is > DECL_LOOPING_CONST_OR_PURE_P. > > * g++.dg/ipa/pr77905.C: New test. > > --- gcc/ipa-pure-const.c.jj 2016-09-02 19:09:33.0 +0200 > +++ gcc/ipa-pure-const.c 2016-12-12 11:51:24.808518093 +0100 > @@ -1195,7 +1195,8 @@ static bool > cdtor_p (cgraph_node *n, void *) > { >if (DECL_STATIC_CONSTRUCTOR (n->decl) || DECL_STATIC_DESTRUCTOR (n->decl)) > -return !TREE_READONLY (n->decl) && !DECL_PURE_P (n->decl); > +return ((!TREE_READONLY (n->decl) && !DECL_PURE_P (n->decl)) > + || DECL_LOOPING_CONST_OR_PURE_P (n->decl)); >return false; > } > > --- gcc/testsuite/g++.dg/ipa/pr77905.C.jj 2016-12-12 11:53:42.645770248 > +0100 > +++ gcc/testsuite/g++.dg/ipa/pr77905.C2016-12-12 11:53:35.0 > +0100 > @@ -0,0 +1,21 @@ > +// PR ipa/77905 > +// { dg-do compile } > +// { dg-options "-O2" } > + > +struct A { > + A(int); > +}; > +struct B : A { > + B(); > +} A; > +struct C : virtual A { > + C(int); > +}; > +A::A(int x) { > + if (x) > +A(0); > +} > + > +B::B() : A(1) {} > + > +C::C(int) : A(1) {} > > Jakub > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [PATCH] [x86] Don't use builtins for SIMD shifts
On Sat, Dec 10, 2016 at 7:59 PM, Marc Glisse wrote: > On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote: > >> On Saturday 10 December 2016, Marc Glisse wrote: >>> >>> On Sat, 10 Dec 2016, Marc Glisse wrote: On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote: > > Replaces the definitions of the shift intrinsics with GCC extension > syntax to > allow GCC to reason about what the instructions does. Tests are added > to > ensure the intrinsics still produce the right instructions, and that a > few basic optimizations now work. I don't think we can do it in such a straightforward way. Those intrinsics are well defined for any shift argument, while operator<< and operator>> are considered undefined for many input values. I believe you may need to use an unsigned type for the lhs of left shifts, and write a << (b & 31) to match the semantics for the rhs (I haven't checked Intel's doc). Which might require adding a few patterns in sse.md to avoid generating code for that AND. >>> >>> >>> Oups, apparently I got that wrong, got confused with the scalar case. >>> Left >>> shift by more than precision is well defined for vectors, but it gives 0, >>> it doesn't take the shift count modulo the precision. Which is even >>> harder >>> to explain to gcc (maybe ((unsigned)b<=31)?a<>> model operations in gcc is often inconvenient. >>> >>> There was a similar issue for LLVM (https://reviews.llvm.org/D3353). >>> >> Well, it is undefined behaviour by the C standard, but is it also >> undefined >> inside GCC (since this specifically uses a GCC extension)? I would assume >> it >> just produces 0. > > > In the optimizers, we tend to handle vectors the same as scalars. I don't > remember if we have any optimization that takes advantage of the limited > range for the second operand of shifts (mostly we disable optimizations when > there is a risk they might produce a shift amount larger than prec), but I > don't think we have specified the behavior for larger-than-prec shifts, so > it would be dangerous. Note that depending on the platform, gcc may lower > vector operations to smaller vectors or even scalars, which makes it > inconvenient to have different semantics for vectors and scalars (not > impossible, we would for instance need to teach the lowering pass that vec1 > << vec2 lowers to { ((unsigned)vec2[0]<=31)?vec1[0]< ((unsigned)vec2[1]<=31)?vec1[1]< shifting vectors does use the modulo behavior, or handles negative shift > values as shifts in the other direction, we can't look at just one target. > > We could have a middle-end policy of vec << 1234 is not undefined (we cannot > assume it doesn't happen) but we don't know what it is defined to, so we > avoid any optimization that may have it as input or output. That might mean > disabling all shift optimizations though, which would be counter productive. > >> The non-immediate case is simpler then, because it produces >> the instructions which will inherently return the right thing. > > > It will only generate the instructions you expect if it hasn't already > optimized it to something else, which might not even involve a shift > anymore. > > I expect reviewers (I am only commenting) will need convincing arguments > that this is safe (I could be wrong, maybe they'll find reasons that I > missed that make it obviously safe). I agree that for all the reasons you stated it is important to have vector (and complex!) integer ops behave the same with respect to overflow. That we do not yet take advantage of out-of-bound shifts is merely a missed optimization. Richard. > -- > Marc Glisse
[PATCH] Add pretty printer for ASAN_MARK and add a helper fn
It would be easier to read ASAN_MARK (UNPOISON, &a, 4); instead of ASAN_MARK (2, &a, 4); Apart from that new helper function asan_mark_p is added. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Ready to be installed? Martin >From ff843db17ba284d2e99bab567d76d4b779f9f1d6 Mon Sep 17 00:00:00 2001 From: marxin Date: Fri, 9 Dec 2016 16:24:16 +0100 Subject: [PATCH 1/2] Add pretty printer for ASAN_MARK and add a helper fn gcc/ChangeLog: 2016-12-09 Martin Liska * asan.c (asan_mark_poison_p): Remove. (asan_mark_p): New function. (transform_statements): Use the function. (asan_expand_mark_ifn): Do not use masked enum. * asan.h (enum asan_mark_flags): Declare it via a macro. * gimple-pretty-print.c (dump_gimple_call_args): Dump first argument of ASAN_MARK. * gimplify.c (build_asan_poison_call_expr): Use new enum values. (asan_poison_variable): Likewise. gcc/testsuite/ChangeLog: 2016-12-12 Martin Liska * gcc.dg/asan/use-after-scope-goto-1.c: Update first argument of scanned pattern ASAN_MARK. * gcc.dg/asan/use-after-scope-goto-2.c: Likewise. * gcc.dg/asan/use-after-scope-switch-1.c: Likewise. * gcc.dg/asan/use-after-scope-switch-2.c: Likewise. * gcc.dg/asan/use-after-scope-switch-3.c: Likewise. --- gcc/asan.c | 28 ++ gcc/asan.h | 11 ++--- gcc/gimple-pretty-print.c | 9 +++ gcc/gimple.c | 2 +- gcc/gimplify.c | 4 ++-- gcc/testsuite/gcc.dg/asan/use-after-scope-goto-1.c | 6 ++--- gcc/testsuite/gcc.dg/asan/use-after-scope-goto-2.c | 2 +- .../gcc.dg/asan/use-after-scope-switch-1.c | 4 ++-- .../gcc.dg/asan/use-after-scope-switch-2.c | 12 +- .../gcc.dg/asan/use-after-scope-switch-3.c | 4 ++-- 10 files changed, 47 insertions(+), 35 deletions(-) diff --git a/gcc/asan.c b/gcc/asan.c index 5af95472db9..e297784270d 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -245,15 +245,6 @@ static unsigned HOST_WIDE_INT asan_shadow_offset_value; static bool asan_shadow_offset_computed; static vec sanitized_sections; -/* Return true if STMT is ASAN_MARK poisoning internal function call. */ -static inline bool -asan_mark_poison_p (gimple *stmt) -{ - return (gimple_call_internal_p (stmt, IFN_ASAN_MARK) - && tree_to_uhwi (gimple_call_arg (stmt, 0)) == ASAN_MARK_CLOBBER); - -} - /* Set of variable declarations that are going to be guarded by use-after-scope sanitizer. */ @@ -304,6 +295,13 @@ set_sanitized_sections (const char *sections) } bool +asan_mark_p (gimple *stmt, enum asan_mark_flags flag) +{ + return (gimple_call_internal_p (stmt, IFN_ASAN_MARK) + && tree_to_uhwi (gimple_call_arg (stmt, 0)) == flag); +} + +bool asan_sanitize_stack_p (void) { return ((flag_sanitize & SANITIZE_ADDRESS) @@ -,7 +2220,8 @@ transform_statements (void) miss some instrumentation opportunities. Do the same for a ASAN_MARK poisoning internal function. */ if (is_gimple_call (s) - && (!nonfreeing_call_p (s) || asan_mark_poison_p (s))) + && (!nonfreeing_call_p (s) + || asan_mark_p (s, ASAN_MARK_POISON))) empty_mem_ref_hash_table (); gsi_next (&i); @@ -2777,9 +2776,8 @@ asan_expand_mark_ifn (gimple_stmt_iterator *iter) { gimple *g = gsi_stmt (*iter); location_t loc = gimple_location (g); - HOST_WIDE_INT flags = tree_to_shwi (gimple_call_arg (g, 0)); - gcc_assert (flags < ASAN_MARK_LAST); - bool is_clobber = (flags & ASAN_MARK_CLOBBER) != 0; + HOST_WIDE_INT flag = tree_to_shwi (gimple_call_arg (g, 0)); + bool is_poison = ((asan_mark_flags)flag) == ASAN_MARK_POISON; tree base = gimple_call_arg (g, 1); gcc_checking_assert (TREE_CODE (base) == ADDR_EXPR); @@ -2827,7 +2825,7 @@ asan_expand_mark_ifn (gimple_stmt_iterator *iter) if (s > size_in_bytes) last_chunk_size = ASAN_SHADOW_GRANULARITY - (s - size_in_bytes); - asan_store_shadow_bytes (iter, loc, shadow, offset, is_clobber, + asan_store_shadow_bytes (iter, loc, shadow, offset, is_poison, size, last_chunk_size); offset += size; } @@ -2840,7 +2838,7 @@ asan_expand_mark_ifn (gimple_stmt_iterator *iter) gsi_insert_before (iter, g, GSI_SAME_STMT); tree sz_arg = gimple_assign_lhs (g); - tree fun = builtin_decl_implicit (is_clobber ? BUILT_IN_ASAN_CLOBBER_N + tree fun = builtin_decl_implicit (is_poison ? BUILT_IN_ASAN_CLOBBER_N : BUILT_IN_ASAN_UNCLOBBER_N); g = gimple_build_call (fun, 2, base_addr, sz_arg); gimple_set_location (g, loc); diff --git a/gcc/asan.h b/gcc/asan.h index 9cf5904618b..355a350bfeb 100644 --- a/gcc/asan.h +++ b/gcc/asan.h @@ -75,13 +75,18 @@ enum asan_check_flags }; /* Flags for Asan check builtins. */ +#define IFN_ASAN_MARK_FLAGS DEF(POISON), DEF(UNPOISON) + enum asan_mark_flags { - ASAN_MARK_CLOBBER = 1 << 0, - ASAN_
Re: [PATCH, middle-end]: Fix PR78738, unrecognized insn with -ffast-math
On Sun, Dec 11, 2016 at 5:16 PM, Uros Bizjak wrote: > On Fri, Dec 9, 2016 at 11:09 AM, Richard Biener > wrote: >> On Thu, Dec 8, 2016 at 10:44 PM, Uros Bizjak wrote: >>> Hello! >>> >>> Attached patch fixes fall-out from excess-precision improvements >>> patch. As shown in the PR, the code throughout the compiler assumes >>> FLAG_PRECISION_FAST when flag_unsafe_math_optimizations flag is in >>> effect. The patch puts back two lines, removed by excess-precision >>> improvements patch. >>> >>> 2016-12-08 Uros Bizjak >>> >>> PR middle-end/78738 >>> * toplev.c (init_excess_precision): Initialize flag_excess_precision >>> to EXCESS_PRECISION_FAST for flag_unsafe_math_optimizations. >>> >>> testsuite/ChangeLog: >>> >>> 2016-12-08 Uros Bizjak >>> >>> PR middle-end/78738 >>> * gcc.target/i386/pr78738.c: New test. >>> >>> Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. >>> >>> OK for mainline? >> >> Hmm, I think it belongs to set_unsafe_math_optimization_flags instead >> (and be consistent if -fexcess-precision was manually specified). >> >> Also where do we assume connection between -funsafe-math-optimizations >> and FLAG_PRECISION_FAST? We have two flags so we should fix any >> user that looks at one but means the other. > > The failure is caused by the call to ix86_emit_i387_round in > "lround2" expander. This expander is > enabled for x87 math when flag_unsafe_math_optimizations is enabled. > In the called ix86_emit_i387_round, DFmode PLUS pattern is manually > emitted: > > emit_insn (gen_rtx_SET (e2, gen_rtx_PLUS (inmode, e1, half))); > > but due to the definition of X87_ENABLE_ARITH, DFmode fadd pattern > remains disabled. > > It is possible to fix the failure by enabling X87_ENABLE_ARITH (and > X87_ENABLE_FLOAT) for flag_unsafe_math_optimizations (as is the case > in the attached v2 patch), but since gcc-6.x does > > if (flag_unsafe_math_optimizations) >flag_excess_precision = EXCESS_PRECISION_FAST; > > I though it was worth mentioning the difference between gcc-6 and > gcc-7. Probably, x87 is the only target that cares for it, in which > case the attached patch is sufficient. I think that patch is the correct correctness fix. With respect to GCC 6 vs. GCC 7 behavior I believe it would be more reasonable to set EXCESS_PRECISION_FAST by -ffast-math/-Ofast rather than from simply -funsafe-math-optimizations (which is an option directly controlling "legacy" controlled stuff which should be moved under a more specific option umbrella). Richard. > Uros. > > > --cut here-- > Index: i386.h > === > --- i386.h (revision 243523) > +++ i386.h (working copy) > @@ -693,13 +693,16 @@ > /* Whether to allow x87 floating-point arithmetic on MODE (one of > SFmode, DFmode and XFmode) in the current excess precision > configuration. */ > -#define X87_ENABLE_ARITH(MODE) \ > - (flag_excess_precision == EXCESS_PRECISION_FAST || (MODE) == XFmode) > +#define X87_ENABLE_ARITH(MODE) \ > + (flag_unsafe_math_optimizations \ > + || flag_excess_precision == EXCESS_PRECISION_FAST \ > + || (MODE) == XFmode) > > /* Likewise, whether to allow direct conversions from integer mode > IMODE (HImode, SImode or DImode) to MODE. */ > #define X87_ENABLE_FLOAT(MODE, IMODE) \ > - (flag_excess_precision == EXCESS_PRECISION_FAST \ > + (flag_unsafe_math_optimizations \ > + || flag_excess_precision == EXCESS_PRECISION_FAST \ > || (MODE) == XFmode \ > || ((MODE) == DFmode && (IMODE) == SImode) \ > || (IMODE) == HImode) > --cut here--
Re: [RFC][PATCH] Sanopt for use-after-scope ASAN_MARK internal functions
I'm sending updated version that uses asan_mark_p predicate and the patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Ready to be installed? Martin >From d0ff310f636e8281198d5b549d00e6e5d94972e0 Mon Sep 17 00:00:00 2001 From: marxin Date: Mon, 12 Dec 2016 15:22:05 +0100 Subject: [PATCH 2/2] Add sanopt for ASAN_MARK poison and unpoison. gcc/ChangeLog: 2016-12-12 Martin Liska * sanopt.c (sanopt_optimize_walker): Set contains_asan_mark. (sanopt_optimize): Add new argument. (sanitize_asan_mark_unpoison): New function. (maybe_contains_asan_check): Likewise. (sanitize_asan_mark_poison): Likewise. (pass_sanopt::execute): Call the new functions. --- gcc/sanopt.c | 206 ++- 1 file changed, 203 insertions(+), 3 deletions(-) diff --git a/gcc/sanopt.c b/gcc/sanopt.c index 320e14e9421..f27039a6cf8 100644 --- a/gcc/sanopt.c +++ b/gcc/sanopt.c @@ -160,8 +160,10 @@ struct sanopt_ctx /* Number of IFN_ASAN_CHECK statements. */ int asan_num_accesses; -}; + /* True when the current functions constains an ASAN_MARK. */ + bool contains_asan_mark; +}; /* Return true if there might be any call to free/munmap operation on any path in between DOM (which should be imm(BB)) and BB. */ @@ -582,6 +584,9 @@ sanopt_optimize_walker (basic_block bb, struct sanopt_ctx *ctx) if (!remove) ctx->asan_num_accesses++; break; + case IFN_ASAN_MARK: + ctx->contains_asan_mark = true; + break; default: break; } @@ -620,10 +625,11 @@ sanopt_optimize_walker (basic_block bb, struct sanopt_ctx *ctx) /* Try to remove redundant sanitizer checks in function FUN. */ static int -sanopt_optimize (function *fun) +sanopt_optimize (function *fun, bool *contains_asan_mark) { struct sanopt_ctx ctx; ctx.asan_num_accesses = 0; + ctx.contains_asan_mark = false; /* Set up block info for each basic block. */ alloc_aux_for_blocks (sizeof (sanopt_info)); @@ -638,6 +644,7 @@ sanopt_optimize (function *fun) free_aux_for_blocks (); + *contains_asan_mark = ctx.contains_asan_mark; return ctx.asan_num_accesses; } @@ -671,18 +678,203 @@ public: }; // class pass_sanopt +static void +sanitize_asan_mark_unpoison (void) +{ + /* 1) Find all BBs that contain an ASAN_MARK poison call. */ + auto_sbitmap with_poison (last_basic_block_for_fn (cfun) + 1); + bitmap_clear (with_poison); + basic_block bb; + + FOR_EACH_BB_FN (bb, cfun) +{ + if (bitmap_bit_p (with_poison, bb->index)) + continue; + + gimple_stmt_iterator gsi; + for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + if (asan_mark_p (stmt, ASAN_MARK_POISON)) + { + bitmap_set_bit (with_poison, bb->index); + break; + } + } +} + + auto_sbitmap poisoned (last_basic_block_for_fn (cfun) + 1); + bitmap_clear (poisoned); + auto_sbitmap worklist (last_basic_block_for_fn (cfun) + 1); + bitmap_copy (worklist, with_poison); + + /* 2) Propagate the information to all reachable blocks. */ + while (!bitmap_empty_p (worklist)) +{ + unsigned i = bitmap_first_set_bit (worklist); + bitmap_clear_bit (worklist, i); + basic_block bb = BASIC_BLOCK_FOR_FN (cfun, i); + gcc_assert (bb); + + edge e; + edge_iterator ei; + FOR_EACH_EDGE (e, ei, bb->succs) + if (!bitmap_bit_p (poisoned, e->dest->index)) + { + bitmap_set_bit (poisoned, e->dest->index); + bitmap_set_bit (worklist, e->dest->index); + } +} + + /* 3) Iterate all BBs not included in POISONED BBs and remove unpoison + ASAN_MARK preceding an ASAN_MARK poison (which can still happen). */ + FOR_EACH_BB_FN (bb, cfun) +{ + if (bitmap_bit_p (poisoned, bb->index)) + continue; + + gimple_stmt_iterator gsi; + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);) + { + bool next = true; + gimple *stmt = gsi_stmt (gsi); + if (gimple_call_internal_p (stmt, IFN_ASAN_MARK)) + { + if (asan_mark_p (stmt, ASAN_MARK_POISON)) + break; + else + { + if (dump_file) + fprintf (dump_file, "Removing ASAN_MARK unpoison\n"); + unlink_stmt_vdef (stmt); + gsi_remove (&gsi, true); + next = false; + } + } + + if (next) + gsi_next (&gsi); + } +} +} + +/* Return true when STMT is either ASAN_CHECK call or a call of a function + that can contain an ASAN_CHECK. */ + +static bool +maybe_contains_asan_check (gimple *stmt) +{ + if (is_gimple_call (stmt)) +{ + if (gimple_call_internal_p (stmt, IFN_ASAN_MARK)) + return false; + else + return !(gimple_call_flags (stmt) & ECF_CONST); +} + else if (is_a (stmt)) +return true; + + return false; +} + +static void +sanitize_asan_mark_poison (void) +{ + /* 1) Find all BBs that possibly contain an ASAN_CHECK. */ + auto_sbitmap with_check (last_basic_block_for_fn (cfun) + 1); + bitmap_clear (with_check); + basic_block bb; + + FOR_EACH
Re: [PATCH] Fill bitregion_{start,end} in store_constructor (PR, tree-optimization/78428).
On 12/12/2016 12:10 PM, Eric Botcazou wrote: >> Ok. I'm sending a patch that put gcc_unreachable to places where either size >> or (and) offset is a non-constant. This survives regression tests >> (including ada) on x86_64-linux-gnu. Apart from that normal bootstrap + >> regression tests works fine on ppc64le-redhat-linux. > > I didn't manage to break it so it is OK by me. > Out of curiosity, I blame the commit which added the if (offset) { machine_mode address_mode; rtx offset_rtx; offset = SUBSTITUTE_PLACEHOLDER_IN_EXPR (offset, make_tree (TREE_TYPE (exp), target)); and it comes to commit done in 1993: commit c869557a9ccc1bd3e5474b144bcb84065db23549 Author: kenner Date: Mon Oct 4 01:48:03 1993 + (store_expr): Use expr_size value, not size_int. (store_constructor): Handle case of variable position and allow it to contain a PLACEHOLDER_EXPR. (get_inner_reference): Make a WITH_RECORD_EXPR if required. (expand_expr, case PLACEHOLDER_EXPR, WITH_RECORD_EXPR): New cases. (expand_expr, case ARRAY_REF): Make WITH_RECORD_EXPR expressions when needed. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@5584 138bc75d-0d04-0410-961f-82ee72b054a4 May I install the patch? Martin
Re: [PATCH] Add pretty printer for ASAN_MARK and add a helper fn
On Tue, Dec 13, 2016 at 09:51:13AM +0100, Martin Liška wrote: > >From ff843db17ba284d2e99bab567d76d4b779f9f1d6 Mon Sep 17 00:00:00 2001 > From: marxin > Date: Fri, 9 Dec 2016 16:24:16 +0100 > Subject: [PATCH 1/2] Add pretty printer for ASAN_MARK and add a helper fn > > gcc/ChangeLog: > > 2016-12-09 Martin Liska > > * asan.c (asan_mark_poison_p): Remove. > (asan_mark_p): New function. > (transform_statements): Use the function. > (asan_expand_mark_ifn): Do not use masked enum. > * asan.h (enum asan_mark_flags): Declare it via a macro. > * gimple-pretty-print.c (dump_gimple_call_args): Dump first > argument of ASAN_MARK. > * gimplify.c (build_asan_poison_call_expr): Use new enum values. > (asan_poison_variable): Likewise. > > gcc/testsuite/ChangeLog: > > 2016-12-12 Martin Liska > > * gcc.dg/asan/use-after-scope-goto-1.c: Update first argument of > scanned pattern ASAN_MARK. > * gcc.dg/asan/use-after-scope-goto-2.c: Likewise. > * gcc.dg/asan/use-after-scope-switch-1.c: Likewise. > * gcc.dg/asan/use-after-scope-switch-2.c: Likewise. > * gcc.dg/asan/use-after-scope-switch-3.c: Likewise. Ok. But the builtins should be renamed too (incrementally), BUILT_IN_ASAN_CLOBBER_N, "__asan_poison_stack_memory", should really be BUILT_IN_ASAN_POISON_STACK_MEMORY etc. Jakub
Re: [PATCH] Avoid creating unbounded complexity debug insns in valtrack (PR debug/77844)
On Mon, 12 Dec 2016, Jakub Jelinek wrote: > Hi! > > On the following testcase (where I've failed to reduce it without the > header, got to over 70KB with both delta and creduce) we end up with huge > RTL expressions in debug insns that just slow down combiner as well as > var-tracking etc. too much. > Generally, at GIMPLE level we try to keep the debug stmts using debug > temporaries very simple, then during expansion we have a function that uses > debug temporaries in case TER creates larger expressions, and finally during > var-tracking we also punt on too large expressions. But valtrack APIs, > used e.g. by the combiner, when substituting lots of pseudos for expressions > that e.g. contain each two other pseudos, is able to create many KB sized > expressions. > > The following patch just throws away expressions that have 32 register > references in there, I think that is so huge that it will be really very > unlikely to be beneficial in the debug info and var-tracking would likely > throw it away anyway. Or shall I use some --param instead of the constant > (any suggestions on how to call it)? Another option (with or without param) > is to create debug temporaries in those cases and split the expression into > smaller expressions. Not sure if it is worth it though. Hmm, so the "easier" fix would be to always create a debug insn right after INSN setting a new debug reg to SRC and then doing the replacement with the new debug reg? With the "optimization" to not do that for some single use and/or "simple" SRC cases (I think "simple" SRC would be sth with a single/zero register reference). You are fixing it at the wrong end IMHO, by throwing away the thing after propagating (and doing all the work and doing the check on each individual propagation result). Richard. > Bootstrapped/regtested on x86_64-linux and i686-linux. > > 2016-12-12 Jakub Jelinek > > PR debug/77844 > * valtrack.c: Include rtl-iter.h. > (propagate_for_debug): Reset debug insns which after simplification > contain more than 32 regs. > > * g++.dg/opt/pr77844.C: New test. > > --- gcc/valtrack.c.jj 2016-12-02 16:41:05.0 +0100 > +++ gcc/valtrack.c2016-12-12 15:12:39.637451127 +0100 > @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3. > #include "regs.h" > #include "memmodel.h" > #include "emit-rtl.h" > +#include "rtl-iter.h" > > /* gen_lowpart_no_emit hook implementation for DEBUG_INSNs. In DEBUG_INSNs, > all lowpart SUBREGs are valid, despite what the machine requires for > @@ -197,6 +198,16 @@ propagate_for_debug (rtx_insn *insn, rtx >dest, propagate_for_debug_subst, &p); > if (loc == INSN_VAR_LOCATION_LOC (insn)) > continue; > + /* Avoid propagation from growing DEBUG_INSN expressions > + too much. */ > + int cnt = 0; > + subrtx_iterator::array_type array; > + FOR_EACH_SUBRTX (iter, array, loc, ALL) > + if (REG_P (*iter) && ++cnt > 32) > + { > + loc = gen_rtx_UNKNOWN_VAR_LOC (); > + break; > + } > INSN_VAR_LOCATION_LOC (insn) = loc; > df_insn_rescan (insn); > } > --- gcc/testsuite/g++.dg/opt/pr77844.C.jj 2016-12-12 17:28:25.146711803 > +0100 > +++ gcc/testsuite/g++.dg/opt/pr77844.C2016-12-12 17:27:43.0 > +0100 > @@ -0,0 +1,32 @@ > +// PR debug/77844 > +// { dg-do compile } > +// { dg-options "-O3 -g" } > + > +#include > + > +void > +foo (std::vector& v, int i, int j) > +{ > + for (int k = 0; k < 5; ++k) > +v[5 * i + k] = v[5 * i + k] | v[5 * j + k]; > +} > + > +void > +bar (std::vector& v, int i, int j) > +{ > + for (int k = 0; k < 5; ++k) > +v[5 * i + k] = v[5 * i + k] ^ v[5 * j + k]; > +} > + > +void > +baz (std::vector& v) > +{ > + foo (v, 4, 1); > + foo (v, 4, 2); > + foo (v, 4, 3); > + foo (v, 4, 4); > + bar (v, 4, 1); > + bar (v, 4, 2); > + bar (v, 4, 3); > + bar (v, 4, 4); > +} > > Jakub > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [PATCH] [x86] Don't use builtins for SIMD shifts
On Tue, 13 Dec 2016, Richard Biener wrote: On Sat, Dec 10, 2016 at 7:59 PM, Marc Glisse wrote: On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote: On Saturday 10 December 2016, Marc Glisse wrote: On Sat, 10 Dec 2016, Marc Glisse wrote: On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote: Replaces the definitions of the shift intrinsics with GCC extension syntax to allow GCC to reason about what the instructions does. Tests are added to ensure the intrinsics still produce the right instructions, and that a few basic optimizations now work. I don't think we can do it in such a straightforward way. Those intrinsics are well defined for any shift argument, while operator<< and operator>> are considered undefined for many input values. I believe you may need to use an unsigned type for the lhs of left shifts, and write a << (b & 31) to match the semantics for the rhs (I haven't checked Intel's doc). Which might require adding a few patterns in sse.md to avoid generating code for that AND. Oups, apparently I got that wrong, got confused with the scalar case. Left shift by more than precision is well defined for vectors, but it gives 0, it doesn't take the shift count modulo the precision. Which is even harder to explain to gcc (maybe ((unsigned)b<=31)?a
Re: [RFC][PATCH] Sanopt for use-after-scope ASAN_MARK internal functions
On Tue, Dec 13, 2016 at 09:56:00AM +0100, Martin Liška wrote: > @@ -671,18 +678,203 @@ public: > > }; // class pass_sanopt > Please add a short function comment here... > +static void > +sanitize_asan_mark_unpoison (void) > +{ > + /* 1) Find all BBs that contain an ASAN_MARK poison call. */ > + auto_sbitmap with_poison (last_basic_block_for_fn (cfun) + 1); > + bitmap_clear (with_poison); > + basic_block bb; ... and here. > +static void > +sanitize_asan_mark_poison (void) > +{ > + /* 1) Find all BBs that possibly contain an ASAN_CHECK. */ > + auto_sbitmap with_check (last_basic_block_for_fn (cfun) + 1); > + bitmap_clear (with_check); > + basic_block bb; > + FOR_EACH_BB_FN (bb, cfun) > +{ > + if (bitmap_bit_p (with_check, bb->index)) > + continue; When would this happen? It seems you only set it for the current bb, it doesn't seem that you'd try to process the same bb multiple times. > + gimple_stmt_iterator gsi; > + bool finish = false; > + for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi)) > + { > + gimple *stmt = gsi_stmt (gsi); > + if (maybe_contains_asan_check (stmt)) > + { > + bitmap_set_bit (with_check, bb->index); > + finish = true; Why the finish var and separate if (finish) break; ? Perhaps from the first iteration when you used a switch? Just doing break; should be enough. > + else if (asan_mark_p (stmt, ASAN_MARK_POISON)) > + { > + if (dump_file) > + fprintf (dump_file, "Removing ASAN_MARK poison\n"); > + unlink_stmt_vdef (stmt); > + gsi_remove (&gsi, true); Don't you need also release_defs (stmt); here (and in the other gsi_remove spot)? Jakub
[PATCH] S/390: Run md tests with -march=native instead of -march=z13.
The attached patch fixes an md test execution problem on S/390. The tests would be built with -march=z13 but executed even on older machines. Build with -march=native instead, so executing the tests should work on any machine generation. Tested on s390x biarch, but not bootstrapped or regression tested. Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany gcc/ChangeLog-s390exp * gcc.target/s390/s390.exp: Run md tests with -march=native instead of -march=z13. >From 3d7b78393a1e1859b4f453f01194ed0ff8fa57c8 Mon Sep 17 00:00:00 2001 From: Dominik Vogt Date: Tue, 13 Dec 2016 10:21:08 +0100 Subject: [PATCH] S/390: Run md tests with -march=native instead of -march=z13. Otherwise the tests are built and _run_ with -march=z13 on machines prior to z13, possibly causing failures due to illegal instructions. --- gcc/testsuite/gcc.target/s390/s390.exp | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/gcc/testsuite/gcc.target/s390/s390.exp b/gcc/testsuite/gcc.target/s390/s390.exp index 450dcaf..9393478 100644 --- a/gcc/testsuite/gcc.target/s390/s390.exp +++ b/gcc/testsuite/gcc.target/s390/s390.exp @@ -112,11 +112,11 @@ torture-finish # Additional md torture tests. torture-init set MD_TEST_OPTS [list \ - {-Os -march=z900} {-Os -march=z13} \ - {-O0 -march=z900} {-O0 -march=z13} \ - {-O1 -march=z900} {-O1 -march=z13} \ - {-O2 -march=z900} {-O2 -march=z13} \ - {-O3 -march=z900} {-O3 -march=z13}] + {-Os -march=z900} {-Os -march=native} \ + {-O0 -march=z900} {-O0 -march=native} \ + {-O1 -march=z900} {-O1 -march=native} \ + {-O2 -march=z900} {-O2 -march=native} \ + {-O3 -march=z900} {-O3 -march=native}] set-torture-options $MD_TEST_OPTS gcc-dg-runtest [lsort [glob -nocomplain $md_tests]] "" $DEFAULT_CFLAGS torture-finish -- 2.3.0
Re: [PATCH] Avoid creating unbounded complexity debug insns in valtrack (PR debug/77844)
On Tue, Dec 13, 2016 at 10:05:58AM +0100, Richard Biener wrote: > > The following patch just throws away expressions that have 32 register > > references in there, I think that is so huge that it will be really very > > unlikely to be beneficial in the debug info and var-tracking would likely > > throw it away anyway. Or shall I use some --param instead of the constant > > (any suggestions on how to call it)? Another option (with or without param) > > is to create debug temporaries in those cases and split the expression into > > smaller expressions. Not sure if it is worth it though. > > Hmm, so the "easier" fix would be to always create a debug insn > right after INSN setting a new debug reg to SRC and then doing the > replacement with the new debug reg? With the "optimization" Well, that would IMHO kill the simplification effect of the combiner for debug insn, nothing would be simplified in debug insns anymore. But perhaps we can try to simplify and after simplification using some metrics compare if the new expression is more complex than before or not. If it is more complex, then (if not emitted already) emit the extra debug insn with a debug temp (next to the insn being removed, aka insn - first argument to propage_for_debug) and replace again from DEST to the debug temp. so it means we will often create uselessly large debug info. > The to not do that for some single use and/or "simple" SRC cases > (I think "simple" SRC would be sth with a single/zero register reference). But simplify-rtx will often do something even for the more complex SRC cases. > You are fixing it at the wrong end IMHO, by throwing away the > thing after propagating (and doing all the work and doing the > check on each individual propagation result). Jakub
Ping #1: [patch,testsuite,avr]: Filter-out -mmcu= from options for tests that set -mmcu=
Ping #1 As I explained below, the solution taken be arm (pruning output) does not work here. Johann On 02.12.2016 11:21, Georg-Johann Lay wrote: On 01.12.2016 17:40, Mike Stump wrote: On Dec 1, 2016, at 3:54 AM, Georg-Johann Lay wrote: This patch moves the compile tests that have a hard coded -mmcu=MCU in their dg-options to a new folder. The exp driver filters out -mmcu= from the command line options that are provided by, say, board description files or --tool-opts. This is needed because otherwise conflicting -mmcu= will FAIL respective test cases because of "specified option '-mmcu' more than once" errors from avr-gcc. Ok for trunk? So, it would be nice if different ports can use roughly similar schemes to handle the same problems. I think arm is one of the more complex ports at this point in this regard with a lot of people and a lot of years time to contemplate and implement solutions to the problem. They in particular don't have to move test cases around to handle the difference like this, I think it would be best to avoid that requirement if possible. Glancing around, two starting points for how the arm achieves what it does: lappend dg_runtest_extra_prunes "warning: switch -m(cpu|arch)=.* conflicts with -m(cpu|arch)=.* switch" in arm.exp, and they use something like: /* { dg-require-effective-target arm_crypto_ok } */ |crypto-vsha256hq_u32.c:2:/* { dg-require-effective-target arm_crypto_ok } */ /* { dg-add-options arm_crypto } */ |crypto-vsha256su0q_u32.c:2:/* { dg-require-effective-target arm_crypto_ok } */ to validate the requirements of the test case, and to ensure that optional things are selected. Nice, simple, extensible, handles multilibs, dejagnu arguments and different cpu defaults as I recall. You won't need all the hair the arm folks have, but if you stub in support in that direction, you then have simple, easy expansion room to match all complexities that the arm folks have already hit and solved. I tried this approach, but it does not work as expected. Notice that avr-gcc throws an error if conflicting -mmcu options are supplied. Pruning the output will make some tests "PASS", but the compiler didn't actually do anything but producing an error message... And one test FAILs because it should produce some specific diagnostic, but again the compiler just throws a different error, the output is pruned and the expected message is missing, hence fail. Also one test case is for ATmega8, but you won't run the whole test suite against ATmega8 because that device is too restricted to give reasonable results... Hence for some tests -mmcu=atmega8 is added by hand. Johann
Re: Fold strstr (s, t) eq/ne s to strcmp (s, t) eq/ne 0 if strlen (t) is known
On 9 December 2016 at 17:59, Jakub Jelinek wrote: > On Fri, Dec 09, 2016 at 05:36:41PM +0530, Prathamesh Kulkarni wrote: >> --- a/gcc/tree-ssa-strlen.c >> +++ b/gcc/tree-ssa-strlen.c >> @@ -2302,7 +2302,81 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi) >> else if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR) >> handle_pointer_plus (gsi); >> } >> - else if (TREE_CODE (lhs) != SSA_NAME && !TREE_SIDE_EFFECTS (lhs)) >> + >> + /* Fold strstr (s, t) == s to memcmp (s, t, strlen (t)) == 0. >> + if strlen (t) is known and var holding return value of strstr >> + has single use. */ >> + >> + else if (TREE_CODE (lhs) == SSA_NAME && INTEGRAL_TYPE_P (TREE_TYPE >> (lhs))) >> + { >> + enum tree_code code = gimple_assign_rhs_code (stmt); >> + if (code == EQ_EXPR || code == NE_EXPR) > > This way you handle _8 = _5 == _7;, but not if (_5 == _7) bar ();. Shouldn't > you > also handle GIMPLE_COND similarly (of course, the rhs1 and rhs2 grabbing > and code grabbing is different for GIMPLE_COND. But the rest should be > the same, except again that you don't want to replace the GIMPLE_COND, but > adjust it. Maybe also COND_EXPR in gimple_assign (_9 = _5 == _7 ? _10 : > _11;). > >> + { >> + tree rhs1 = gimple_assign_rhs1 (stmt); >> + tree rhs2 = gimple_assign_rhs2 (stmt); >> + if (TREE_CODE (rhs1) == SSA_NAME >> + && TREE_CODE (rhs2) == SSA_NAME) >> + { >> + gcall *call_stmt = dyn_cast (SSA_NAME_DEF_STMT >> (rhs1)); >> + if (!call_stmt) >> + { >> + call_stmt = dyn_cast (SSA_NAME_DEF_STMT (rhs2)); >> + tree tmp = rhs1; >> + rhs1 = rhs2; >> + rhs2 = tmp; > > We use std::swap (rhs1, rhs2); in this case these days. > >> + } >> + >> + tree call_lhs; >> + if (call_stmt >> + && gimple_call_builtin_p (call_stmt, BUILT_IN_STRSTR) >> + && (call_lhs = gimple_call_lhs (call_stmt)) >> + && has_single_use (call_lhs)) > > This might not optimize if you have: > _5 = foo (); > _7 = __builtin_strstr (_5, "abcd"); > _8 = _5 == _7; > > Or even you could have: > _5 = __builtin_strstr (...); > _7 = __builtin_strstr (_5, "abcd"); > _8 = _5 == _7; > > So I wonder if you shouldn't do: > gimple *call_stmt = NULL; > for (int pass = 0; pass < 2; pass++) > { > gimple *g = SSA_NAME_DEF_STMT (rhs1); > if (gimple_call_builtin_p (g, BUILT_IN_STRSTR) > && gimple_call_lhs (g) == rhs1 > && has_single_use (rhs1) > && gimple_call_arg (g, 0) == rhs2) > { > call_stmt = g; > break; > } > std::swap (rhs1, rhs2); > } > if (call_stmt) > ... > > I think you don't need operand_equal_p, because SSA_NAMEs should just > be the same pointer if they are the same thing. > The above way you handle both orderings. Perhaps also it is big enough to > be done in a separate function, which you call with the code/rhs1/rhs2 and > stmt for the EQ/NE_EXPR is_gimple_assign as well as for COND_EXPR and > GIMPLE_COND. Hi Jakub, Thanks for the suggestions. It didn't occur to me to check for gimple_cond. I have tried to do the changes in the attached version. I am not sure if I have handled cond_expr correctly. IIUC, if gimple_assign has code cond_expr, then the condition is stored in gimple_assign_rhs1, however it's not a single operand but a tree of the form "op1 cond_code op2". Is that correct ? However I am not able to write a test-case that generates cond_expr in the IL. I tried: t1 = strstr (s, t); (t1 == s) ? foo() : bar (); and other such variants but it seems the ?: operator is getting lowered to gimple_cond instead. Bootstrap+tested on x86_64-unknown-linux-gnu and cross-tested on arm*-*-*, aarch64*-*-*. Does it look OK ? Thanks, Prathamesh > > Jakub 2016-12-13 Jakub Jelinek Prathamesh Kulkarni * tree-ssa-strlen.c (fold_strstr_to_memcmp): New function. (strlen_optimize_stmt): Call fold_strstr_to_memcmp. testsuite/ * gcc.dg/strlenopt-30.c: New test-case. diff --git a/gcc/testsuite/gcc.dg/strlenopt-30.c b/gcc/testsuite/gcc.dg/strlenopt-30.c new file mode 100644 index 000..089b3a2 --- /dev/null +++ b/gcc/testsuite/gcc.dg/strlenopt-30.c @@ -0,0 +1,63 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-strlen" } */ + +__attribute__((no_icf)) +_Bool f1(char *s) +{ + return __builtin_strstr (s, "hello") == s; +} + +__attribute__((no_icf)) +_Bool f2(char *s) +{ + return s == __builtin_strstr (s, "hello"); +} + +__attribut
Re: [PATCH, GCC/ARM, gcc-5/6-branch, ping2] Fix PR77933: stack corruption on ARM when using high registers and lr
On 09/12/16 15:27, Thomas Preudhomme wrote: On 09/12/16 14:30, Kyrill Tkachov wrote: On 09/12/16 14:24, Thomas Preudhomme wrote: [Seeing as an RC for GCC 6.3 was suggested on IRC for mid next week] Ping? backport for 6 bootstraps on Thumb-1 and testsuite shows no regression for either 5 or 6. Bootstrap for 5 is ongoing. Ok. Thanks, Kyrill I've committed the backport for 6 but am waiting for the bootstrap of the backport for 5 to complete before committing it as well. Will probably only be Monday. I've finally committed the backport to gcc-5-branch. The bootstrap failed when using in-tree gmp, mpfr, mpc and isl so I had to restart it. Bootstrap succeeded. PR77933 is now fixed in all supported branches, but note that it still affects gcc-4_7-branch, gcc-4_8-branch and gcc-4_9-branch. Best regards, Thomas
Re: [PATCH] S/390: Run md tests with -march=native instead of -march=z13.
On Tue, Dec 13, 2016 at 10:28:29AM +0100, Dominik Vogt wrote: > The attached patch fixes an md test execution problem on S/390. > The tests would be built with -march=z13 but executed even on > older machines. Build with -march=native instead, so executing > the tests should work on any machine generation. > > Tested on s390x biarch, but not bootstrapped or regression tested. I think this isn't a very good idea. Then the testresults from one box to another one will mean something different. IMHO you want something like x86 avx_runtime effective target (z13_runtime?), which would stand for running on z13 capable hw and with z13 assembler support. Or choose what options to include based on such effective target tests, and perhaps also select a default action, like we do on some targets e.g. in the vectorizer. Some tests are dg-do run by default if the hw supports the needed ISA, or dg-do assemble, if the hw doesn't support that, but at least the assembler can assemble those, otherwise dg-do compile or something similar. With -march=native, you find some results in gcc-testresults and the exact underlying hw will be part of the needed info to see what you've actually tested. While usually just some tests are UNSUPPORTED if hw or assembler doesn't have the needed features. Jakub
Re: [PATCH, middle-end]: Fix PR78738, unrecognized insn with -ffast-math
On Tue, Dec 13, 2016 at 09:52:40AM +0100, Richard Biener wrote: > On Sun, Dec 11, 2016 at 5:16 PM, Uros Bizjak wrote: > > On Fri, Dec 9, 2016 at 11:09 AM, Richard Biener > > wrote: > >> On Thu, Dec 8, 2016 at 10:44 PM, Uros Bizjak wrote: > >>> Hello! > >>> > >>> Attached patch fixes fall-out from excess-precision improvements > >>> patch. As shown in the PR, the code throughout the compiler assumes > >>> FLAG_PRECISION_FAST when flag_unsafe_math_optimizations flag is in > >>> effect. The patch puts back two lines, removed by excess-precision > >>> improvements patch. > >>> > >>> 2016-12-08 Uros Bizjak > >>> > >>> PR middle-end/78738 > >>> * toplev.c (init_excess_precision): Initialize flag_excess_precision > >>> to EXCESS_PRECISION_FAST for flag_unsafe_math_optimizations. > >>> > >>> testsuite/ChangeLog: > >>> > >>> 2016-12-08 Uros Bizjak > >>> > >>> PR middle-end/78738 > >>> * gcc.target/i386/pr78738.c: New test. > >>> > >>> Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. > >>> > >>> OK for mainline? > >> > >> Hmm, I think it belongs to set_unsafe_math_optimization_flags instead > >> (and be consistent if -fexcess-precision was manually specified). > >> > >> Also where do we assume connection between -funsafe-math-optimizations > >> and FLAG_PRECISION_FAST? We have two flags so we should fix any > >> user that looks at one but means the other. > > > > The failure is caused by the call to ix86_emit_i387_round in > > "lround2" expander. This expander is > > enabled for x87 math when flag_unsafe_math_optimizations is enabled. > > In the called ix86_emit_i387_round, DFmode PLUS pattern is manually > > emitted: > > > > emit_insn (gen_rtx_SET (e2, gen_rtx_PLUS (inmode, e1, half))); > > > > but due to the definition of X87_ENABLE_ARITH, DFmode fadd pattern > > remains disabled. > > > > It is possible to fix the failure by enabling X87_ENABLE_ARITH (and > > X87_ENABLE_FLOAT) for flag_unsafe_math_optimizations (as is the case > > in the attached v2 patch), but since gcc-6.x does > > > > if (flag_unsafe_math_optimizations) > >flag_excess_precision = EXCESS_PRECISION_FAST; > > > > I though it was worth mentioning the difference between gcc-6 and > > gcc-7. Probably, x87 is the only target that cares for it, in which > > case the attached patch is sufficient. > > I think that patch is the correct correctness fix. > > With respect to GCC 6 vs. GCC 7 behavior I believe it would be more > reasonable to set EXCESS_PRECISION_FAST by -ffast-math/-Ofast > rather than from simply -funsafe-math-optimizations (which is an option > directly controlling "legacy" controlled stuff which should be moved > under a more specific option umbrella). Do note that EXCESS_PRECISION_FAST is the default under the GNU dialects, and we're only in EXCESS_PRECISION_STANDARD in the testcase because of -std=c99. That said, relaxing to EXCESS_PRECISION_FAST with -Ofast does make sense to me. Thanks, James
Re: [PATCH, GCC/testsuite/ARM] Skip optional_mthumb tests if GCC has a default mode
On 12/12/16 21:17, Christophe Lyon wrote: Hi Thomas, Thanks for working on this, On 12 December 2016 at 18:52, Thomas Preudhomme wrote: Hi, The logic to make -mthumb optional for Thumb-only devices is only executed when no -marm or -mthumb is given on the command-line. This includes configuring GCC with --with-mode= because this makes the option to be passed before any others. The optional_mthumb-* testcases are skipped when -marm or -mthumb is passed on the command line but not when GCC was configured with --with-mode. Not only are the tests meaningless in these configurations, they also spuriously FAIL if --with-mode=arm was used since the test are built for Thumb-only devices, as reported by Christophe Lyon in [1]. [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02082.html This patch adds logic to target-support.exp to check how was GCC configured and changes the optional_mthumb testcases to be skipped if there is a default mode (--with-mode=). It also fixes a couple of typo on the selectors. How hard would it be to skip these tests only if --with-mode=arm, such that they would still pass in configurations --with-mode=thumb? It seems easy to extend what you propose here, doesn't it? It is but IMO it gives a false sense of quality since it does not test the optional -mthumb logic. What is the motivation to make it run with --with-mode=thumb? Best regards, Thomas
Re: Fold strstr (s, t) eq/ne s to strcmp (s, t) eq/ne 0 if strlen (t) is known
On Tue, Dec 13, 2016 at 03:08:17PM +0530, Prathamesh Kulkarni wrote: > Thanks for the suggestions. It didn't occur to me to check for gimple_cond. > I have tried to do the changes in the attached version. > I am not sure if I have handled cond_expr correctly. > IIUC, if gimple_assign has code cond_expr, then the condition is > stored in gimple_assign_rhs1, > however it's not a single operand but a tree of the form "op1 cond_code op2". > Is that correct ? Yes. gimple_assign_rhs1 will be in what you are looking for EQ_EXPR or NE_EXPR tree, its TREE_CODE will be this code you want to check, and TREE_OPERAND (exp, 0) and TREE_OPERAND (exp, 1) the rhs1 and rhs2 you use elsewhere. > However I am not able to write a test-case that generates cond_expr in the IL. > I tried: > t1 = strstr (s, t); > (t1 == s) ? foo() : bar (); > and other such variants but it seems the ?: operator is getting > lowered to gimple_cond instead. It is, but in some cases tree-if-conv.c turns them back into COND_EXPRs. I guess you need -ftree-loop-if-convert now, and it has to be in some loop where the addition of cond_expr would likely turn it into a single bb loop. You probably want constants or vars, not function calls in the ? : expressions though. > +/* Try to fold strstr (s, t) == s to memcmp (s, t, strlen (t)) == 0. */ > + > +static void > +fold_strstr_to_memcmp(enum tree_code code, tree rhs1, tree rhs2, gimple > *stmt) Formatting, space before (. > +{ > + gimple *call_stmt = NULL; > + for (int pass = 0; pass < 2; pass++) > +{ > + gimple *g = SSA_NAME_DEF_STMT (rhs1); > + if (g I think g should be always non-NULL (except for invalid IL), so probably no need to check it. > + && gimple_call_builtin_p (g, BUILT_IN_STRSTR) > + && has_single_use (rhs1) > + && gimple_call_arg (as_a (g), 0) == rhs2) I think gimple_call_arg works fine even with just gimple * argument. So you can avoid the as_a (g) uglification and just use g. > + if (is_gimple_assign (stmt)) > + { > + if (gimple_assign_rhs_code (stmt) == COND_EXPR) > + { > + tree cond = gimple_assign_rhs1 (stmt); > + TREE_SET_CODE (cond, EQ_EXPR); This looks weird. You are hardcoding EQ_EXPR, while for the other case below you use code. So, do you handle properly both EQ_EXPR and NE_EXPR for this and gimple_cond cases? Also, for non-COND_EXPR assign you build a new stmt instead of reusing the existing one, why? > + TREE_OPERAND (cond, 0) = memcmp_lhs; > + TREE_OPERAND (cond, 1) = zero; > + update_stmt (stmt); > + } > + else > + { > + gsi = gsi_for_stmt (stmt); > + tree lhs = gimple_assign_lhs (stmt); > + gassign *ga = gimple_build_assign (lhs, code, memcmp_lhs, > + zero); > + gsi_replace (&gsi, ga, false); > + } > + } > + else > + { > + gcond *cond = as_a (stmt); > + gimple_cond_set_lhs (cond, memcmp_lhs); > + gimple_cond_set_rhs (cond, zero); > + gimple_cond_set_code (cond, EQ_EXPR); Likewise here. Jakub
Re: [PATCH, GCC/testsuite/ARM] Skip optional_mthumb tests if GCC has a default mode
On 13 December 2016 at 10:54, Thomas Preudhomme wrote: > On 12/12/16 21:17, Christophe Lyon wrote: >> >> Hi Thomas, >> >> Thanks for working on this, >> >> >> On 12 December 2016 at 18:52, Thomas Preudhomme >> wrote: >>> >>> Hi, >>> >>> The logic to make -mthumb optional for Thumb-only devices is only >>> executed >>> when no -marm or -mthumb is given on the command-line. This includes >>> configuring GCC with --with-mode= because this makes the option to be >>> passed >>> before any others. The optional_mthumb-* testcases are skipped when -marm >>> or >>> -mthumb is passed on the command line but not when GCC was configured >>> with >>> --with-mode. Not only are the tests meaningless in these configurations, >>> they also spuriously FAIL if --with-mode=arm was used since the test are >>> built for Thumb-only devices, as reported by Christophe Lyon in [1]. >>> >>> [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02082.html >>> >>> This patch adds logic to target-support.exp to check how was GCC >>> configured >>> and changes the optional_mthumb testcases to be skipped if there is a >>> default mode (--with-mode=). It also fixes a couple of typo on the >>> selectors. >>> >> >> How hard would it be to skip these tests only if --with-mode=arm, >> such that they would still pass in configurations --with-mode=thumb? >> >> It seems easy to extend what you propose here, doesn't it? > > > It is but IMO it gives a false sense of quality since it does not test the > optional -mthumb logic. > > What is the motivation to make it run with --with-mode=thumb? > With your patch, they will appear as unsupported, right? It's not a big deal. Do you know how most people configure their toolchains? Linaro and Ubuntu (I think) use --with-mode=thumb. Do most people use no --with-mode configure flag? In the validations I run against trunk, I always have --with-mode, except for one arm-none-eabi configuration where I also use default cpu/fpu. Maybe I should remove --with-mode=thumb from my config --with-cpu=cortex-m3? (but I don't remember if the new logic was backported to gcc-5/gcc-6 ?) Christophe > Best regards, > > Thomas
Re: [PATCH] S/390: Run md tests with -march=native instead of -march=z13.
On Tue, Dec 13, 2016 at 10:42:37AM +0100, Jakub Jelinek wrote: > On Tue, Dec 13, 2016 at 10:28:29AM +0100, Dominik Vogt wrote: > > The attached patch fixes an md test execution problem on S/390. > > The tests would be built with -march=z13 but executed even on > > older machines. Build with -march=native instead, so executing > > the tests should work on any machine generation. > > > > Tested on s390x biarch, but not bootstrapped or regression tested. > > I think this isn't a very good idea. Then the testresults from one > box to another one will mean something different. Note that this is already the case: Recently the setmem-long-1.c test case has started crashing on zEC12 because -march=z13 now generates z13 specific instructions. > IMHO you want something like x86 avx_runtime effective target > (z13_runtime?), which would stand for running on z13 capable hw and > with z13 assembler support. Something like that, yes, but it's not so easy because the kernel has to support it too. Some features are disabled in a VM although the hardware supports them. What we really need is Run test if the test system (not just the hardware) supports the instruction set of the -march= option the test was compiled with, otherwise just compile it. I.e. derive the "effective_targt..." option from the "-march=..." option set by the torture test. > Or choose what options to include based on such effective target tests, > and perhaps also select a default action, like we do on some targets e.g. in > the vectorizer. Can you give an example test file, please? > Some tests are dg-do run by default if the hw supports > the needed ISA, or dg-do assemble, if the hw doesn't support that, but > at least the assembler can assemble those, otherwise dg-do compile or > something similar. > > With -march=native, you find some results in gcc-testresults and the exact > underlying hw will be part of the needed info to see what you've actually > tested. While usually just some tests are UNSUPPORTED if hw or assembler > doesn't have the needed features. Yes. Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany
Re: [PATCH] omp-low.c split
Hi, On Fri, Dec 09, 2016 at 07:18:54PM +0300, Alexander Monakov wrote: > On Fri, 9 Dec 2016, Jakub Jelinek wrote: > > Can you post an incremental patch fixing those issues? > > A few small nits I found while reading the patch. > > First of all, please use 'git diff --patience' (or --histogram) when > generating such patches, without it the changes in omp-low.c look uglier than > necessary. OK, the new patches use --patience, the result is actually quite a bit smaller. Thanks for the hint. > > The comment for 'struct oacc_loop' in new file omp-device.c is misplaced: for > some reason it's above the #include block. Oh, thanks for noticing, fixed. > > This patch doesn't seem to apply to current trunk due to a conflict in > cp/parser.c. Hmm, I don't think so? $ svn st $ svn up Updating '.': At revision 243600. $ patch --dry-run -p1 < /tmp/misc-next/pat/0001-Split-omp-low-into-multiple-files.patch checking file gcc/Makefile.in checking file gcc/c-family/c-omp.c checking file gcc/c/c-parser.c checking file gcc/c/c-typeck.c checking file gcc/c/gimple-parser.c checking file gcc/config/nvptx/nvptx.c checking file gcc/cp/parser.c checking file gcc/cp/semantics.c checking file gcc/fortran/trans-openmp.c checking file gcc/gengtype.c checking file gcc/gimple-fold.c checking file gcc/gimplify.c checking file gcc/lto-cgraph.c checking file gcc/omp-device.c checking file gcc/omp-device.h checking file gcc/omp-expand.c checking file gcc/omp-expand.h checking file gcc/omp-general.c checking file gcc/omp-general.h checking file gcc/omp-grid.c checking file gcc/omp-grid.h checking file gcc/omp-low.c checking file gcc/omp-low.h checking file gcc/toplev.c checking file gcc/tree-cfg.c checking file gcc/tree-parloops.c checking file gcc/tree-ssa-loop.c checking file gcc/tree-vrp.c checking file gcc/varpool.c $ echo $? 0 This was using the exact same file I have compressed and sent a while ago to the mailing list. > If you could create a git branch (perhaps in your personal > namespace so you can freely rebase it) with this patchset, I'd appreciate it. > Well... I do not see this as a long-term project so I do not really see much value in this. I have used git just because that is how I conveniently send patches to machines where I bootstrap them. I hope this will get committed very early so that we avoid any big conflicts in omp-low.c. But if this drags on for a while and if our git mirror recovers meanwhile, I can. > When trying to apply the patch, git notes a few remaining whitespace issues: > > $ zcat 0001-Split-omp-low-into-multiple-files.patch.gz | git apply --reject - > > :2734: space before tab in indent. > # BLOCK 2 (PAR_ENTRY_BB) > :5346: space before tab in indent. > true, GSI_SAME_STMT); > :8129: space before tab in indent. > after a stmt, not before. */ > :9060: space before tab in indent. > GOMP_atomic_start (); > :9061: space before tab in indent. > *addr = rhs; > I hope I have addressed this when doing what Jakub suggested, please let me know if not. Thanks for looking at the big patch! Martin
Re: [Patch, fortran] PR78737 - [OOP] linking error with deferred, undefined user-defined derived-type I/O
Dear Janus, We got there! OK for trunk. This was a demonstration of the corollary of the "bon mot" from Barack Obama at the end of the message :-) Many thanks for finding the right path. Paul On 13 December 2016 at 10:23, Janus Weil wrote: > 2016-12-13 10:58 GMT+01:00 Janus Weil : >> 2016-12-13 10:42 GMT+01:00 Janus Weil : please find attached a version of your patch that runs all the dtio testcases successfully. >>> >>> Great, thanks a lot. Your addition of >>> gfc_find_and_cut_at_last_class_ref is just what I was looking for >>> right now ... >>> >>> If you don't mind I'll write a ChangeLog, add the proper tests and >>> commit to trunk. Or do you prefer to take care of it yourself? >> >> Btw, to continue the brainstorming, I think I found a slightly better >> solution for the dtio_13 problem, which even removes the spurious >> error on that test case via a small fix in resolve_transfer. The >> attached patch is what I'd like to commit if you're ok with it. > > Finally, here is a complete patch, including testcase and ChangeLog > entries. Ok for trunk? > > Cheers, > Janus > > > 2016-12-13 Janus Weil > Paul Thomas > > PR fortran/78737 > * gfortran.h (gfc_find_typebound_dtio_proc): New prototype. > * interface.c (gfc_compare_interfaces): Whitespace fix. > (gfc_find_typebound_dtio_proc): New function. > (gfc_find_specific_dtio_proc): Use it. > * resolve.c (resolve_transfer): Improve error recovery. > * trans-io.c (get_dtio_proc): Implement polymorphic calls to DTIO > procedures. > > 2016-12-13 Janus Weil > Paul Thomas > > PR fortran/78737 > * gfortran.dg/dtio_13.f90: Remove spurious error. > * gfortran.dg/dtio_19.f90: New test case. -- If you're walking down the right path and you're willing to keep walking, eventually you'll make progress. Barack Obama
Re: [PATCH] avoid infinite recursion in maybe_warn_alloc_args_overflow (pr 78775)
On Tue, Dec 13, 2016 at 2:36 AM, Martin Sebor wrote: > The attached patch avoids infinite recursion when traversing phi > nodes in maybe_warn_alloc_args_overflow by using a bitmap to keep > track of those already visited and breaking out. It looks somewhat excessive (the whole PHI node walk looks exponential in the number of alloca calls given a good enough testcase). It also looks like operand_signed_p really returns only a wild guess, neither conservatively true or false. Is that correct? Can you instead scrap the weird anti-range handling please? Thanks, Richard. > Thanks > Martin
Re: [PATCH] omp-low.c split
On Tue, Dec 13, 2016 at 11:15:43AM +0100, Martin Jambor wrote: > I have bootstrapped the two patches on aarch64-linux and bootstrapped > and tested them on x86_64-linux. What do you think? Thanks a lot for the work. If you wouldn't mind doing a couple of further changes (see below), I'd appreciate it, but if you want to commit it right away, I'm ok with that too. @@ -4321,7 +4322,7 @@ expand_cilk_for (struct omp_region *region, struct omp_for_data *fd) tree child_fndecl = gimple_omp_parallel_child_fn ( -as_a (last_stmt (region->outer->entry))); + as_a (last_stmt (region->outer->entry))); tree t, low_val = NULL_TREE, high_val = NULL_TREE; for (t = DECL_ARGUMENTS (child_fndecl); t; t = TREE_CHAIN (t)) { My preference for the above would be gomp_parallel *par_stmt = as_a (last_stmt (region->outer->entry)); tree child_fndecl = gimple_omp_parallel_child_fn (par_stmt); @@ -6428,7 +6429,7 @@ expand_omp_atomic_pipeline (basic_block load_bb, basic_block store_bb, floating point. This allows the atomic operation to properly succeed even with NaNs and -0.0. */ stmt = gimple_build_cond_empty - (build2 (NE_EXPR, boolean_type_node, + (build2 (NE_EXPR, boolean_type_node, new_storedi, old_vali)); gsi_insert_before (&si, stmt, GSI_SAME_STMT); And here tree ne = build2 (NE_EXPR, boolean_type_node, new_storedi, old_vali); stmt = gimple_build_cond_empty (ne); @@ -442,7 +442,7 @@ omp_max_simt_vf (void) if (!optimize) return 0; if (ENABLE_OFFLOADING) -for (const char *c = getenv ("OFFLOAD_TARGET_NAMES"); c; ) +for (const char *c = getenv ("OFFLOAD_TARGET_NAMES"); c;) I admit I don't know what the coding style we have in this case. Ok either way. @@ -5860,7 +5860,7 @@ lower_omp_sections (gimple_stmt_iterator *gsi_p, omp_context *ctx) new_body = maybe_catch_exception (new_body); t = gimple_build_omp_return -(!!omp_find_clause (gimple_omp_sections_clauses (stmt), + (!!omp_find_clause (gimple_omp_sections_clauses (stmt), OMP_CLAUSE_NOWAIT)); gimple_seq_add_stmt (&new_body, t); maybe_add_implicit_barrier_cancel (ctx, &new_body); @@ -6023,7 +6023,7 @@ lower_omp_single (gimple_stmt_iterator *gsi_p, omp_context *ctx) bind_body = maybe_catch_exception (bind_body); t = gimple_build_omp_return -(!!omp_find_clause (gimple_omp_single_clauses (single_stmt), + (!!omp_find_clause (gimple_omp_single_clauses (single_stmt), OMP_CLAUSE_NOWAIT)); gimple_seq_add_stmt (&bind_body_tail, t); maybe_add_implicit_barrier_cancel (ctx, &bind_body_tail); And in the above 2 spots something like: bool nowait = omp_find_clause (gimple_omp_single_clauses (single_stmt), OMP_CLAUSE_NOWAIT) != NULL_TREE; t = gimple_build_omp_return (nowait); (wonder why we use t for gimple, renaming t var to g would be also nice). But ok either way. @@ -8668,7 +8669,7 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx) break; case GIMPLE_TRANSACTION: lower_omp (gimple_transaction_body_ptr ( - as_a (stmt)), + as_a (stmt)), ctx); Here it fits, so no need to wrap: lower_omp (gimple_transaction_body_ptr (as_a (stmt)), ctx); Jakub
Re: [PATCH] S/390: Run md tests with -march=native instead of -march=z13.
On Tue, Dec 13, 2016 at 11:18:31AM +0100, Dominik Vogt wrote: > > IMHO you want something like x86 avx_runtime effective target > > (z13_runtime?), which would stand for running on z13 capable hw and > > with z13 assembler support. > > Something like that, yes, but it's not so easy because the kernel > has to support it too. Some features are disabled in a VM > although the hardware supports them. What we really need is The z13_runtime or whatever effective target routine can always just try to compile/link a simple testcase with at least one z13 specific instruction and try to run it. > Run test if the test system (not just the hardware) supports the > instruction set of the -march= option the test was compiled > with, otherwise just compile it. > > I.e. derive the "effective_targt..." option from the "-march=..." > option set by the torture test. > > > Or choose what options to include based on such effective target tests, > > and perhaps also select a default action, like we do on some targets e.g. in > > the vectorizer. > > Can you give an example test file, please? Look e.g. at gcc.dg/vect/vect.exp and testsuite/lib/*.exp for dg-do-what-default and the games done with it. E.g. if you set (and saved/restored) in addition to dg-do-what-default also EFFECTIVE_TARGETS then perhaps you could use et-dg-runtest or similar. Jakub
Re: [PATCH] avoid infinite recursion in maybe_warn_alloc_args_overflow (pr 78775)
On Mon, Dec 12, 2016 at 06:36:16PM -0700, Martin Sebor wrote: > +/* Return true if the type of OP is signed, looking through any casts > + to an unsigned type. */ > + > +static bool > +operand_signed_p (tree op) > +{ > + bitmap visited = NULL; > + bool ret = operand_signed_p (op, &visited); > + > + if (visited) > +BITMAP_FREE (visited); I think you can drop the if before BITMAP_FREE here. Marek
Re: [PATCH, fortran, pr77785, v2] [Coarray] ICE in gfc_get_caf_token_offset, at fortran/trans-expr.c:1990
Hi Andre, > all the sanitizer issues I fixed occur during compiling the testsuite. So I > would say, that when with the patch these errors do not occur anymore while > processing the testsuite, then those are tested for, right? aah, so you're saying that hunk is not actually related to the PR in the subject line, but instead fixes a testsuite failure seen with a sanitized compiler? That wasn't mentioned anywhere and sadly I forgot to bring my crystal ball ... Cheers, Janus > On Mon, 12 Dec 2016 13:37:43 +0100 > Janus Weil wrote: > >> Hi Andre, >> >> > the attached patch corrects reporting of "Sorry, unimplemented yet" for >> > allocatable and pointer components in polymorphic objects (BT_CLASS) thus >> > fixing two ICEs reported in the PR. >> > >> > The next chunk fixes an ICE when the declaration containing the token >> > information is of type POINTER or REFERENCE. >> > >> > Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk? >> >> the resolve.c hunk is certainly ok. The trans-array.c part looks >> reasonable as well, but I wonder if it is actually covered by any of >> your test cases? Since they are all compile-only, with errors being >> thrown at resolution stage, do they even get to the translation stage? >> >> Cheers, >> Janus > > > -- > Andre Vehreschild * Email: vehre ad gmx dot de
Re: [PATCH, GCC/testsuite/ARM] Skip optional_mthumb tests if GCC has a default mode
On 13/12/16 10:11, Christophe Lyon wrote: On 13 December 2016 at 10:54, Thomas Preudhomme wrote: On 12/12/16 21:17, Christophe Lyon wrote: Hi Thomas, Thanks for working on this, On 12 December 2016 at 18:52, Thomas Preudhomme wrote: Hi, The logic to make -mthumb optional for Thumb-only devices is only executed when no -marm or -mthumb is given on the command-line. This includes configuring GCC with --with-mode= because this makes the option to be passed before any others. The optional_mthumb-* testcases are skipped when -marm or -mthumb is passed on the command line but not when GCC was configured with --with-mode. Not only are the tests meaningless in these configurations, they also spuriously FAIL if --with-mode=arm was used since the test are built for Thumb-only devices, as reported by Christophe Lyon in [1]. [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02082.html This patch adds logic to target-support.exp to check how was GCC configured and changes the optional_mthumb testcases to be skipped if there is a default mode (--with-mode=). It also fixes a couple of typo on the selectors. How hard would it be to skip these tests only if --with-mode=arm, such that they would still pass in configurations --with-mode=thumb? It seems easy to extend what you propose here, doesn't it? It is but IMO it gives a false sense of quality since it does not test the optional -mthumb logic. What is the motivation to make it run with --with-mode=thumb? With your patch, they will appear as unsupported, right? It's not a big deal. yes indeed. Do you know how most people configure their toolchains? Linaro and Ubuntu (I think) use --with-mode=thumb. Do most people use no --with-mode configure flag? I cannot answer for people in general but in our case we use --with-multilib-list which will build the toolchain for several -march, -mfpu and -mfloat combinations. But it's good to have a mix, it finds more issues in the testsuite as this very examples shows. In the validations I run against trunk, I always have --with-mode, except for one arm-none-eabi configuration where I also use default cpu/fpu. For good reason because without those it will compile target libraries without any option and so the default would be used (ie ARMv4T in default mode). Either you need to specify some --with-* or you built with multilib. Maybe I should remove --with-mode=thumb from my config --with-cpu=cortex-m3? (but I don't remember if the new logic was backported to gcc-5/gcc-6 ?) No it wasn't backported. Best regards, Thomas
Re: [patch, doc] Move -pthread documentation to linker options
Hi Sandra, > On 12/11/2016 01:28 PM, Rainer Orth wrote: >> Hi Sandra, >> >>> PR 16519 notes that -pthread has only ever been documented as an RS6000 and >>> Solaris 2 option. In fact it's supported by most/all(?) POSIX-flavored >>> targets, including GNU/Linux, BSD variants, Darwin, etc. It's probably best >>> to document it as a generic option, with the expectation that GCC supports >>> -pthread if the underlying operating system or C library provides an >>> implementation of the POSIX threads API. >>> >>> After scratching my head about it, it seemed that it's best categorized as >>> a linker option even though it also affects the preprocessor on some >>> targets. I'll wait a day or two before committing the attached patch, in >>> case anybody wants to argue that this is the wrong way to categorize it. >> >> I don't like categorizing it as a linker option: as you say, it affects >> the preprocessor as well (adding -D_REENTRANT on most systems), and in >> the past (not completely sure about the present) there were subtle bugs >> if you forgot to add -pthread during compilation. > > Do you have a suggestion for a better place to put it? Preferably *one* > place, instead of duplicating the docs in 10+ places with target-specific > options? I just checked and couldn't find a really good place. It has some similarity to C Language Options, but doesn't fit since it doesn't extend the language. Unless you want to introduce a whole new section, I think the best you can do is list it separately under Preprocessor and Link Options. I fully agree to not list it individually for each system that currently supports it. The better route would be to support it for both compilation and linking on any system that needs either for symmetry's sake, even if one of the two is just a no-op. Everything else is just a nightmarish user experience. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH, GCC/testsuite/ARM] Skip optional_mthumb tests if GCC has a default mode
On 13 December 2016 at 12:18, Thomas Preudhomme wrote: > On 13/12/16 10:11, Christophe Lyon wrote: >> >> On 13 December 2016 at 10:54, Thomas Preudhomme >> wrote: >>> >>> On 12/12/16 21:17, Christophe Lyon wrote: Hi Thomas, Thanks for working on this, On 12 December 2016 at 18:52, Thomas Preudhomme wrote: > > > Hi, > > The logic to make -mthumb optional for Thumb-only devices is only > executed > when no -marm or -mthumb is given on the command-line. This includes > configuring GCC with --with-mode= because this makes the option to be > passed > before any others. The optional_mthumb-* testcases are skipped when > -marm > or > -mthumb is passed on the command line but not when GCC was configured > with > --with-mode. Not only are the tests meaningless in these > configurations, > they also spuriously FAIL if --with-mode=arm was used since the test > are > built for Thumb-only devices, as reported by Christophe Lyon in [1]. > > [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02082.html > > This patch adds logic to target-support.exp to check how was GCC > configured > and changes the optional_mthumb testcases to be skipped if there is a > default mode (--with-mode=). It also fixes a couple of typo on the > selectors. > How hard would it be to skip these tests only if --with-mode=arm, such that they would still pass in configurations --with-mode=thumb? It seems easy to extend what you propose here, doesn't it? >>> >>> >>> >>> It is but IMO it gives a false sense of quality since it does not test >>> the >>> optional -mthumb logic. >>> >>> What is the motivation to make it run with --with-mode=thumb? >>> >> >> With your patch, they will appear as unsupported, right? >> It's not a big deal. > > > yes indeed. > OK, that's fine if that's your intent. >> >> Do you know how most people configure their toolchains? >> Linaro and Ubuntu (I think) use --with-mode=thumb. >> Do most people use no --with-mode configure flag? > > > I cannot answer for people in general but in our case we use > --with-multilib-list which will build the toolchain for several -march, > -mfpu and -mfloat combinations. But it's good to have a mix, it finds more > issues in the testsuite as this very examples shows. > >> >> In the validations I run against trunk, I always have --with-mode, >> except for one arm-none-eabi configuration where I also use >> default cpu/fpu. > > > For good reason because without those it will compile target libraries > without any option and so the default would be used (ie ARMv4T in default > mode). Either you need to specify some --with-* or you built with multilib. > I keep this configuration with all defaults to make sure everything still works for the default, armv4t indeed. And there are often problems in the testsuite as you know, because people test with more modern configurations. >> >> Maybe I should remove --with-mode=thumb from my config >> --with-cpu=cortex-m3? (but I don't remember if the new logic >> was backported to gcc-5/gcc-6 ?) > > > No it wasn't backported. > > Best regards, > > Thomas
Re: [PATCH] Optimiza aggregate a = b = c = {} (PR c/78408)
Hi! Sorry for not getting to this earlier. On Mon, Nov 28, 2016 at 10:50:26AM +0100, Richard Biener wrote: > > + else if (gimple_call_builtin_p (defstmt, BUILT_IN_MEMSET) > > + && TREE_CODE (gimple_call_arg (defstmt, 0)) == ADDR_EXPR > > + && TREE_CODE (gimple_call_arg (defstmt, 1)) == INTEGER_CST > > + && TREE_CODE (gimple_call_arg (defstmt, 2)) == INTEGER_CST) > > +{ > > + HOST_WIDE_INT ssize, max_size, off; > > + bool reverse; > > + src2 = TREE_OPERAND (gimple_call_arg (defstmt, 0), 0); > > + get_ref_base_and_extent (src2, &off, &ssize, &max_size, &reverse); > > + if (ssize != max_size > > + || (ssize % BITS_PER_UNIT) != 0 > > + || !wi::eq_p (gimple_call_arg (defstmt, 2), ssize / BITS_PER_UNIT)) > > + src2 = NULL_TREE; > > I wonder why you jump through the hoops of get_ref_base_and_extent > given the call args will be invariant addresses and thus > get_addr_base_and_unit_offset would be more appropriate here. get_addr_base_and_unit_offset does not give me the size though, which is what I wanted to compute. Even if as you suggest I'd accept other INTEGER_CST sizes, it would still be better to punt if the memset is clearly invalid. And for the case where the memset call is followed by assignment, not memcpy (very common, as memcpy is often folded into the assignment), the verification needs to be done. > Also not sure why you want to restrict the size with the wi::eq_p > (probably for the a = b case where the size isn't given explicitely > but then you don't check whether off is 0 ...). I'd say passing But I'm not comparing the result of get_ref_base_and_extent, but the argument as is. Perhaps where it does above the src2 = NULL_TREE; I could save the size into one var, off into another one and set src2 to the result of get_ref_base_and_extent in that case, then if those vars are set require the second stmt to be a memset and do the same stuff there? > in base, offset and size for src and dest into this function will > simplify things and should allow to handle > > memset (p+10, 0, 24); > memcpy (q, p+10, 24); > > if you compare bases with operand_equal_p. > > > + if (refs_may_alias_p (dest, src)) > > +return; > > Why's that? I admit I'm not sure if GIMPLE_ASSIGN may be between overlapping objects or not, memset can't. Jakub
Re: [PATCH] omp-low.c split
Hi! On Fri, 9 Dec 2016 14:08:21 +0100, Martin Jambor wrote: > this is the promised attempt at splitting omp-low.c [...] Yay! \o/ I have not yet had a chance to review/test this patch, but I plan to. A few initial comments from the "bike shed departement"; I understand in GCC sources it will not be easy to rename stuff (such as files) later, so we should get the names agreed upon early: Generally, I agree with your division of "omp-low.c" parts. > - move everything that is part of pass_oacc_device_lower, > pass_omp_device_lower and pass_omp_target_link to a new file > omp-device.h, Should we call this file "omp-offload.c", as offloading is what this deals with, is the term we agreed to generally use (as far as I can tell)? > - move all pre-lowering gridification stuff to a new file > omp-grid.c. [...] Is that code generic enough to not call this file "omp-hsa.c" or similar? > - I moved stuff that was used from all over the place to a new file > omp-general.c (unless it would mean exposing omp_region or > omp_context types). I'd have called that simply "omp.c". > I am opened to suggestions what to do differently, names of the file > are for example of course subject to discussion, and I absolutely > welcome any review and checking, for one I am not going to pretend I > understand the stuff I put into omp-device.c. If however there is > consensus that we should do something like this, I would like to ask > the community to freeze omp-low.c file until this gets committed, I > hope you understand that I am afraid of any conflicts. I very much understand... :-| When I had worked on the very same thing months ago, and my changes went without review/approval for a long time, I had spent numerous hours on keeping my patch up to date. So, I'm happy to see that this is now near approval! (Even though I don't understand what's different now from when I worked on the same thing back then...) I hope to have time later today to review/test your actual patch. Grüße Thomas
Re: [PATCH] omp-low.c split
On Tue, Dec 13, 2016 at 12:39:01PM +0100, Thomas Schwinge wrote: > On Fri, 9 Dec 2016 14:08:21 +0100, Martin Jambor wrote: > > this is the promised attempt at splitting omp-low.c [...] > > Yay! \o/ > > I have not yet had a chance to review/test this patch, but I plan to. > > A few initial comments from the "bike shed departement"; I understand in > GCC sources it will not be easy to rename stuff (such as files) later, so > we should get the names agreed upon early: > > Generally, I agree with your division of "omp-low.c" parts. > > > - move everything that is part of pass_oacc_device_lower, > > pass_omp_device_lower and pass_omp_target_link to a new file > > omp-device.h, > > Should we call this file "omp-offload.c", as offloading is what this > deals with, is the term we agreed to generally use (as far as I can > tell)? That would be fine with me too. > > - move all pre-lowering gridification stuff to a new file > > omp-grid.c. [...] > > Is that code generic enough to not call this file "omp-hsa.c" or similar? And this as well. But omp-grid.c is fine too. > > - I moved stuff that was used from all over the place to a new file > > omp-general.c (unless it would mean exposing omp_region or > > omp_context types). > > I'd have called that simply "omp.c". The problem with that is that the corresponding header can't be called omp.h for obvious reasons, we already have one with very different meaning. Jakub
[PATCH, GCC, gcc-5/6-branch] Fix PR77673: bswap loads passed end of object
Hi, Fix in r242869 for PR77673 (bswap loads after end of object) applies cleanly on GCC 6 and with trivial fix for GCC 5 (gimple * in GCC 6 -> gimple in GCC 5). The backports also bootstrap fine on x86_64-linux-gnu and testsuite shows no regression. ChangeLog entries are as follow: *** gcc/ChangeLog *** 2016-12-12 Thomas Preud'homme Backport from mainline 2016-11-25 Thomas Preud'homme PR tree-optimization/77673 * tree-ssa-math-opts.c (struct symbolic_number): Add new src field. (init_symbolic_number): Initialize src field from src parameter. (perform_symbolic_merge): Select most dominated statement as the source statement. Set src field of resulting n structure from the input src with the lowest address. (find_bswap_or_nop): Rename source_stmt into ins_stmt. (bswap_replace): Rename src_stmt into ins_stmt. Initially get source of load from src field rather than insertion statement. Cancel optimization if statement analyzed is not dominated by the insertion statement. (pass_optimize_bswap::execute): Rename src_stmt to ins_stmt. Compute dominance information. *** gcc/testsuite/ChangeLog *** 2016-12-12 Thomas Preud'homme Backport from mainline 2016-11-25 Thomas Preud'homme PR tree-optimization/77673 * gcc.dg/pr77673.c: New test. Is this ok for gcc-5-branch and gcc-6-branch? Best regards, Thomas diff --git a/gcc/config.gcc b/gcc/config.gcc index bfd1127d6e8e647ca8c3a57dd2d58b586dffe4a5..d7508e89de0e1d7a3027da0b04e4fa36c1e95fd0 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -3710,34 +3710,18 @@ case "${target}" in # Add extra multilibs if test "x$with_multilib_list" != x; then arm_multilibs=`echo $with_multilib_list | sed -e 's/,/ /g'` - case ${arm_multilibs} in - aprofile) -# Note that arm/t-aprofile is a -# stand-alone make file fragment to be -# used only with itself. We do not -# specifically use the -# TM_MULTILIB_OPTION framework because -# this shorthand is more -# pragmatic. -tmake_profile_file="arm/t-aprofile" -;; - rmprofile) -# Note that arm/t-rmprofile is a -# stand-alone make file fragment to be -# used only with itself. We do not -# specifically use the -# TM_MULTILIB_OPTION framework because -# this shorthand is more -# pragmatic. -tmake_profile_file="arm/t-rmprofile" -;; - default) -;; - *) -echo "Error: --with-multilib-list=${with_multilib_list} not supported." 1>&2 -exit 1 -;; - esac + if test "x${arm_multilibs}" != xdefault ; then +for arm_multilib in ${arm_multilibs}; do + case ${arm_multilib} in + aprofile|rmprofile) + tmake_profile_file="arm/t-multilib" + ;; + *) + echo "Error: --with-multilib-list=${with_multilib_list} not supported." 1>&2 + exit 1 + ;; + esac +done if test "x${tmake_profile_file}" != x ; then # arm/t-aprofile and arm/t-rmprofile are only diff --git a/gcc/config/arm/t-aprofile b/gcc/config/arm/t-aprofile index 90305e1206e3964e08a673e385d3198747bdffa1..2cbd8e3c8e8bcd4bed6368bfea83ece953c8dbb4 100644 --- a/gcc/config/arm/t-aprofile +++ b/gcc/config/arm/t-aprofile @@ -24,30 +24,13 @@ # have their default values during the configure step. We enforce # this during the top-level configury. -MULTILIB_OPTIONS = -MULTILIB_DIRNAMES= -MULTILIB_EXCEPTIONS = -MULTILIB_MATCHES = -MULTILIB_REUSE = +# Arch and FPU variants to build libraries with -# We have the following hierachy: -# ISA: A32 (.) or T32 (thumb) -# Architecture: ARMv7-A (v7-a), ARMv7VE (v7ve), or ARMv8-A (v8-a). -# FPU: VFPv3-D16 (fpv3), NEONv1 (simdv1), VFPv4-D16 (fpv4), -#NEON-VFPV4 (simdvfpv4), NEON for ARMv8 (simdv8), or None (.). -# Float-abi: Soft (.), softfp (softfp), or hard (hardfp). +MULTI_ARCH_OPTS_A = march=armv7-a/march=armv7ve/march=armv8-a +MULTI_ARCH_DIRS_A = v7-a v7ve v8-a -MULTILIB_OPTIONS += mthumb -MULTILIB_DIRNAMES += thumb - -MULTILIB_OPTIONS += march=armv7-a/march=armv7ve/march=armv8-a -MULTILIB_DIRNAMES += v7-a v7ve v8-a - -MULTILIB_OPTIONS += mfpu=vfpv3-d16/mfpu=neon/mfpu=vfpv4-d16/mfpu=neon-vfpv4/mfpu=neon-fp-armv8 -MULTILIB_DIRNAMES += fpv3 simdv1 fpv4 simdvfpv4 simdv8 - -MULTILIB_OPTIONS += mfloat-abi=softfp/mfloat-abi=hard -MULTILIB_DIRNAMES += softfp hard +MULTI_FPU_OPTS_A= mfpu=vfpv3-d16/mfpu=neon/mfpu=vfpv4-d16/mfpu=neon-vfpv4/mfpu=neon-fp-armv8 +MULTI_FPU_DIRS_A= fpv3 simdv1 fpv4 simdvfpv4 simdv8 # Option combinations to build library with @@ -71,7 +54,11 @@ MULTILIB_REQUIRED += *march=armv8-a MULTILIB_REQUIRED += *march=armv8-a/mfpu=neon-fp-armv8/mfloat-abi=* +# Matches + # CPU Matches +MULTILIB_MATCHES += march?armv7-a=mcpu?marvell-pj4 +MULTILIB_MATCHES += march?armv7-a=mcpu?generic-
[PATCH, GCC, gcc-5/6-branch] Improve comment for struct symbolic_number in bswap pass
Hi, Fix in r242869 for PR77673 was accompanied with r242870 which updated the description of the struct symbolic_number modified by the previous patch. It did so by rewriting the comment completely but I believe this patch should be still backported to make the comment match the code. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2016-12-12 Thomas Preud'homme Backport from mainline 2016-11-25 Thomas Preud'homme * tree-ssa-math-opts.c (struct symbolic_number): Improve comment. Is this ok for gcc-5-branch and gcc-6-branch? Best regards, Thomas diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index 3882652dcfc0e04642196243a034f1f7c1405936..84c6fc80bc5b980ca8ed8cca5ad2ab7775f4f65b 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -1619,25 +1619,32 @@ make_pass_cse_sincos (gcc::context *ctxt) return new pass_cse_sincos (ctxt); } -/* A symbolic number is used to detect byte permutation and selection - patterns. Therefore the field N contains an artificial number - consisting of octet sized markers: +/* A symbolic number structure is used to detect byte permutation and selection + patterns of a source. To achieve that, its field N contains an artificial + number consisting of BITS_PER_MARKER sized markers tracking where does each + byte come from in the source: - 0- target byte has the value 0 - FF - target byte has an unknown value (eg. due to sign extension) - 1..size - marker value is the target byte index minus one. + 0 - target byte has the value 0 + FF - target byte has an unknown value (eg. due to sign extension) + 1..size - marker value is the byte index in the source (0 for lsb). To detect permutations on memory sources (arrays and structures), a symbolic - number is also associated a base address (the array or structure the load is - made from), an offset from the base address and a range which gives the - difference between the highest and lowest accessed memory location to make - such a symbolic number. The range is thus different from size which reflects - the size of the type of current expression. Note that for non memory source, - range holds the same value as size. - - For instance, for an array char a[], (short) a[0] | (short) a[3] would have - a size of 2 but a range of 4 while (short) a[0] | ((short) a[0] << 1) would - still have a size of 2 but this time a range of 1. */ + number is also associated: + - a base address BASE_ADDR and an OFFSET giving the address of the source; + - a range which gives the difference between the highest and lowest accessed + memory location to make such a symbolic number; + - the address SRC of the source element of lowest address as a convenience + to easily get BASE_ADDR + offset + lowest bytepos. + + Note 1: the range is different from size as size reflects the size of the + type of the current expression. For instance, for an array char a[], + (short) a[0] | (short) a[3] would have a size of 2 but a range of 4 while + (short) a[0] | ((short) a[0] << 1) would still have a size of 2 but this + time a range of 1. + + Note 2: for non-memory sources, range holds the same value as size. + + Note 3: SRC points to the SSA_NAME in case of non-memory source. */ struct symbolic_number { uint64_t n; diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index ac15e8179a3257d1e190086c8089bc85ed8552bf..6413bd6d1ae17d04e276d97c088497d6d334823c 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -1925,25 +1925,32 @@ make_pass_cse_sincos (gcc::context *ctxt) return new pass_cse_sincos (ctxt); } -/* A symbolic number is used to detect byte permutation and selection - patterns. Therefore the field N contains an artificial number - consisting of octet sized markers: +/* A symbolic number structure is used to detect byte permutation and selection + patterns of a source. To achieve that, its field N contains an artificial + number consisting of BITS_PER_MARKER sized markers tracking where does each + byte come from in the source: - 0- target byte has the value 0 - FF - target byte has an unknown value (eg. due to sign extension) - 1..size - marker value is the target byte index minus one. + 0 - target byte has the value 0 + FF - target byte has an unknown value (eg. due to sign extension) + 1..size - marker value is the byte index in the source (0 for lsb). To detect permutations on memory sources (arrays and structures), a symbolic - number is also associated a base address (the array or structure the load is - made from), an offset from the base address and a range which gives the - difference between the highest and lowest accessed memory location to make - such a symbolic number. The range is thus different from size which reflects - the size of the type of current expression. Note that for non
Re: [PATCH, GCC, gcc-5/6-branch] Fix PR77673: bswap loads passed end of object
Sorry, ignore the first attachment (2_combine_profile_multilib.patch). i always miss that Thunderbird selects the first file in the in-review folder upfront. Best regards, Thomas On 13/12/16 11:43, Thomas Preudhomme wrote: Hi, Fix in r242869 for PR77673 (bswap loads after end of object) applies cleanly on GCC 6 and with trivial fix for GCC 5 (gimple * in GCC 6 -> gimple in GCC 5). The backports also bootstrap fine on x86_64-linux-gnu and testsuite shows no regression. ChangeLog entries are as follow: *** gcc/ChangeLog *** 2016-12-12 Thomas Preud'homme Backport from mainline 2016-11-25 Thomas Preud'homme PR tree-optimization/77673 * tree-ssa-math-opts.c (struct symbolic_number): Add new src field. (init_symbolic_number): Initialize src field from src parameter. (perform_symbolic_merge): Select most dominated statement as the source statement. Set src field of resulting n structure from the input src with the lowest address. (find_bswap_or_nop): Rename source_stmt into ins_stmt. (bswap_replace): Rename src_stmt into ins_stmt. Initially get source of load from src field rather than insertion statement. Cancel optimization if statement analyzed is not dominated by the insertion statement. (pass_optimize_bswap::execute): Rename src_stmt to ins_stmt. Compute dominance information. *** gcc/testsuite/ChangeLog *** 2016-12-12 Thomas Preud'homme Backport from mainline 2016-11-25 Thomas Preud'homme PR tree-optimization/77673 * gcc.dg/pr77673.c: New test. Is this ok for gcc-5-branch and gcc-6-branch? Best regards, Thomas diff --git a/gcc/testsuite/gcc.dg/pr77673.c b/gcc/testsuite/gcc.dg/pr77673.c new file mode 100644 index ..e03bcb9284d1e5a0e496cfa547fdbab630bec04f --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr77673.c @@ -0,0 +1,19 @@ +/* { dg-do compile { target fpic } } */ +/* { dg-require-effective-target bswap32 } */ +/* { dg-options "-O2 -fPIC -fdump-tree-bswap" } */ +/* { dg-additional-options "-march=z900" { target s390*-*-* } } */ + +void mach_parse_compressed(unsigned char* ptr, unsigned long int* val) +{ + if (ptr[0] < 0xC0U) { +*val = ptr[0] + ptr[1]; +return; + } + + *val = ((unsigned long int)(ptr[0]) << 24) +| ((unsigned long int)(ptr[1]) << 16) +| ((unsigned long int)(ptr[2]) << 8) +| ptr[3]; +} + +/* { dg-final { scan-tree-dump-not "load_dst_\\d+ =.* if \\(" "bswap" } } */ diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index 61b65bb824b1e90ab13e3cb3ac1b4fbbc34a3b70..3882652dcfc0e04642196243a034f1f7c1405936 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -1645,6 +1645,7 @@ struct symbolic_number { tree base_addr; tree offset; HOST_WIDE_INT bytepos; + tree src; tree alias_set; tree vuse; unsigned HOST_WIDE_INT range; @@ -1746,6 +1747,7 @@ init_symbolic_number (struct symbolic_number *n, tree src) int size; n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE; + n->src = src; /* Set up the symbolic number N by setting each byte to a value between 1 and the byte size of rhs1. The highest order byte is set to n->size and the @@ -1859,6 +1861,7 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1, uint64_t inc; HOST_WIDE_INT start_sub, end_sub, end1, end2, end; struct symbolic_number *toinc_n_ptr, *n_end; + basic_block bb1, bb2; if (!n1->base_addr || !n2->base_addr || !operand_equal_p (n1->base_addr, n2->base_addr, 0)) @@ -1872,15 +1875,20 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1, { n_start = n1; start_sub = n2->bytepos - n1->bytepos; - source_stmt = source_stmt1; } else { n_start = n2; start_sub = n1->bytepos - n2->bytepos; - source_stmt = source_stmt2; } + bb1 = gimple_bb (source_stmt1); + bb2 = gimple_bb (source_stmt2); + if (dominated_by_p (CDI_DOMINATORS, bb1, bb2)) + source_stmt = source_stmt1; + else + source_stmt = source_stmt2; + /* Find the highest address at which a load is performed and compute related info. */ end1 = n1->bytepos + (n1->range - 1); @@ -1937,6 +1945,7 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1, n->vuse = n_start->vuse; n->base_addr = n_start->base_addr; n->offset = n_start->offset; + n->src = n_start->src; n->bytepos = n_start->bytepos; n->type = n_start->type; size = TYPE_PRECISION (n->type) / BITS_PER_UNIT; @@ -2147,7 +2156,7 @@ find_bswap_or_nop (gimple stmt, struct symbolic_number *n, bool *bswap) uint64_t cmpxchg = CMPXCHG; uint64_t cmpnop = CMPNOP; - gimple source_stmt; + gimple ins_stmt; int limit; /* The last parameter determines the depth search limit. It usually @@ -2157,9 +2166,9 @@ find_bswap_or_nop
Re: [PATCH, fortran, pr77785, v2] [Coarray] ICE in gfc_get_caf_token_offset, at fortran/trans-expr.c:1990
Hi Janus, no sorry. I mixed up the context. I thought your question was on pr78534. Sorry for getting those two PRs mixed up. Just void my answer below. It is wrong. I will see what I can do about a better testcase for the trans-array part. The code responsible for the error unfortunately does not have a main program. So I need to invent something. - Andre On Tue, 13 Dec 2016 12:11:50 +0100 Janus Weil wrote: > Hi Andre, > > > all the sanitizer issues I fixed occur during compiling the testsuite. So I > > would say, that when with the patch these errors do not occur anymore while > > processing the testsuite, then those are tested for, right? > > aah, so you're saying that hunk is not actually related to the PR in > the subject line, but instead fixes a testsuite failure seen with a > sanitized compiler? That wasn't mentioned anywhere and sadly I forgot > to bring my crystal ball ... > > Cheers, > Janus > > > > > On Mon, 12 Dec 2016 13:37:43 +0100 > > Janus Weil wrote: > > > >> Hi Andre, > >> > >> > the attached patch corrects reporting of "Sorry, unimplemented yet" for > >> > allocatable and pointer components in polymorphic objects (BT_CLASS) thus > >> > fixing two ICEs reported in the PR. > >> > > >> > The next chunk fixes an ICE when the declaration containing the token > >> > information is of type POINTER or REFERENCE. > >> > > >> > Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk? > >> > >> the resolve.c hunk is certainly ok. The trans-array.c part looks > >> reasonable as well, but I wonder if it is actually covered by any of > >> your test cases? Since they are all compile-only, with errors being > >> thrown at resolution stage, do they even get to the translation stage? > >> > >> Cheers, > >> Janus > > > > > > -- > > Andre Vehreschild * Email: vehre ad gmx dot de -- Andre Vehreschild * Email: vehre ad gmx dot de
Re: [PATCH] [AArch64] Implement popcount pattern
Hi Kyrill, Thanks for reviewing the patch and your useful comments. >> looks good to me if it has gone through the normal required >> bootstrap and testing, but I can't approve. Bootstrapped and Regression Tested on aarch64-thunderx-linux. >> The rest of the MD file uses the term AdvSIMD. Also, the instrurction >> is CNT rather than "pop count". Done. >> __builtin_popcount takes an unsigned int, so this should be >> scanning for absence of popcountsi2 instead? Done. Please find attached the modified patch as per review comments and let me know if its okay for Stage-1 or current branch. Thanks, Naveendiff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 65eb326..0acb3f0 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -3785,6 +3785,39 @@ } ) +;; Pop count be done via the "CNT" instruction in AdvSIMD. +;; +;; MOV v.1d, x0 +;; CNT v1.8b, v.8b +;; ADDV b2, v1.8b +;; MOV w0, v2.b[0] + +(define_expand "popcount2" + [(match_operand:GPI 0 "register_operand") + (match_operand:GPI 1 "register_operand")] + "TARGET_SIMD" +{ + rtx v = gen_reg_rtx (V8QImode); + rtx v1 = gen_reg_rtx (V8QImode); + rtx r = gen_reg_rtx (QImode); + rtx in = operands[1]; + rtx out = operands[0]; + if(mode == SImode) +{ + rtx tmp; + tmp = gen_reg_rtx (DImode); + /* If we have SImode, zero extend to DImode, pop count does + not change if we have extra zeros. */ + emit_insn (gen_zero_extendsidi2 (tmp, in)); + in = tmp; +} + emit_move_insn (v, gen_lowpart (V8QImode, in)); + emit_insn (gen_popcountv8qi2 (v1, v)); + emit_insn (gen_reduc_plus_scal_v8qi (r, v1)); + emit_insn (gen_zero_extendqi2 (out, r)); + DONE; +}) + (define_insn "clrsb2" [(set (match_operand:GPI 0 "register_operand" "=r") (clrsb:GPI (match_operand:GPI 1 "register_operand" "r")))] diff --git a/gcc/testsuite/gcc.target/aarch64/popcnt.c b/gcc/testsuite/gcc.target/aarch64/popcnt.c new file mode 100644 index 000..37cf4b9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/popcnt.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +int +foo (int x) +{ + return __builtin_popcount (x); +} + +long +foo1 (int x) +{ + return __builtin_popcountl (x); +} + +/* { dg-final { scan-assembler-not "popcount" } } */ +/* { dg-final { scan-assembler-times "cnt\t" 2 } } */
Re: [PATCH] [AArch64] Implement popcount pattern
Hi Naveen, On 13/12/16 11:51, Hurugalawadi, Naveen wrote: Hi Kyrill, Thanks for reviewing the patch and your useful comments. looks good to me if it has gone through the normal required bootstrap and testing, but I can't approve. Bootstrapped and Regression Tested on aarch64-thunderx-linux. The rest of the MD file uses the term AdvSIMD. Also, the instrurction is CNT rather than "pop count". Done. __builtin_popcount takes an unsigned int, so this should be scanning for absence of popcountsi2 instead? Done. Please find attached the modified patch as per review comments and let me know if its okay for Stage-1 or current branch. This looks much better, thanks. I still have a minor nit about the testcase. +long +foo1 (int x) +{ + return __builtin_popcountl (x); +} On ILP32 systems this would still use the SImode patterns, so I suggest you use __builtin_popcountll and an unsigned long long return type to ensure you always exercise the 64-bit code. + +/* { dg-final { scan-assembler-not "popcount" } } */ This looks ok to me otherwise, but you'll need an ok from the aarch64 folk. Kyrill Thanks, Naveen
Re: Fold strstr (s, t) eq/ne s to strcmp (s, t) eq/ne 0 if strlen (t) is known
On 13 December 2016 at 15:27, Jakub Jelinek wrote: > On Tue, Dec 13, 2016 at 03:08:17PM +0530, Prathamesh Kulkarni wrote: >> Thanks for the suggestions. It didn't occur to me to check for gimple_cond. >> I have tried to do the changes in the attached version. >> I am not sure if I have handled cond_expr correctly. >> IIUC, if gimple_assign has code cond_expr, then the condition is >> stored in gimple_assign_rhs1, >> however it's not a single operand but a tree of the form "op1 cond_code op2". >> Is that correct ? > > Yes. gimple_assign_rhs1 will be in what you are looking for EQ_EXPR or > NE_EXPR tree, its TREE_CODE will be this code you want to check, and > TREE_OPERAND (exp, 0) and TREE_OPERAND (exp, 1) the rhs1 and rhs2 you use > elsewhere. > >> However I am not able to write a test-case that generates cond_expr in the >> IL. >> I tried: >> t1 = strstr (s, t); >> (t1 == s) ? foo() : bar (); >> and other such variants but it seems the ?: operator is getting >> lowered to gimple_cond instead. > > It is, but in some cases tree-if-conv.c turns them back into COND_EXPRs. > I guess you need -ftree-loop-if-convert now, and it has to be in some loop > where the addition of cond_expr would likely turn it into a single bb loop. > You probably want constants or vars, not function calls in the ? : > expressions though. > >> +/* Try to fold strstr (s, t) == s to memcmp (s, t, strlen (t)) == 0. */ >> + >> +static void >> +fold_strstr_to_memcmp(enum tree_code code, tree rhs1, tree rhs2, gimple >> *stmt) > > Formatting, space before (. > >> +{ >> + gimple *call_stmt = NULL; >> + for (int pass = 0; pass < 2; pass++) >> +{ >> + gimple *g = SSA_NAME_DEF_STMT (rhs1); >> + if (g > > I think g should be always non-NULL (except for invalid IL), so probably no > need to check it. Ah indeed, thanks for pointing out. I assumed if ssa-var has default definition, then SSA_NAME_DEF_STMT would be NULL, but it's GIMPLE_NOP. > >> + && gimple_call_builtin_p (g, BUILT_IN_STRSTR) >> + && has_single_use (rhs1) >> + && gimple_call_arg (as_a (g), 0) == rhs2) > > I think gimple_call_arg works fine even with just gimple * argument. > So you can avoid the as_a (g) uglification and just use g. > >> + if (is_gimple_assign (stmt)) >> + { >> + if (gimple_assign_rhs_code (stmt) == COND_EXPR) >> + { >> + tree cond = gimple_assign_rhs1 (stmt); >> + TREE_SET_CODE (cond, EQ_EXPR); > > This looks weird. You are hardcoding EQ_EXPR, while for the > other case below you use code. So, do you handle properly both > EQ_EXPR and NE_EXPR for this and gimple_cond cases? > Also, for non-COND_EXPR assign you build a new stmt instead of reusing > the existing one, why? > >> + TREE_OPERAND (cond, 0) = memcmp_lhs; >> + TREE_OPERAND (cond, 1) = zero; >> + update_stmt (stmt); >> + } >> + else >> + { >> + gsi = gsi_for_stmt (stmt); >> + tree lhs = gimple_assign_lhs (stmt); >> + gassign *ga = gimple_build_assign (lhs, code, memcmp_lhs, >> + zero); >> + gsi_replace (&gsi, ga, false); >> + } >> + } >> + else >> + { >> + gcond *cond = as_a (stmt); >> + gimple_cond_set_lhs (cond, memcmp_lhs); >> + gimple_cond_set_rhs (cond, zero); >> + gimple_cond_set_code (cond, EQ_EXPR); > > Likewise here. Oops, sorry about that :/ Does this version look OK ? Bootstrap+test in progress. Thanks, Prathamesh > > Jakub 2016-12-13 Jakub Jelinek Prathamesh Kulkarni * tree-ssa-strlen.c (fold_strstr_to_memcmp): New function. (strlen_optimize_stmt): Call fold_strstr_to_memcmp. testsuite/ * gcc.dg/strlenopt-30.c: New test-case. diff --git a/gcc/testsuite/gcc.dg/strlenopt-30.c b/gcc/testsuite/gcc.dg/strlenopt-30.c new file mode 100644 index 000..089b3a2 --- /dev/null +++ b/gcc/testsuite/gcc.dg/strlenopt-30.c @@ -0,0 +1,63 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-strlen" } */ + +__attribute__((no_icf)) +_Bool f1(char *s) +{ + return __builtin_strstr (s, "hello") == s; +} + +__attribute__((no_icf)) +_Bool f2(char *s) +{ + return s == __builtin_strstr (s, "hello"); +} + +__attribute__((no_icf)) +_Bool f3(char *s) +{ + return s != __builtin_strstr (s, "hello"); +} + +__attribute__((no_icf)) +_Bool f4() +{ + char *foo_f4(void); + char *t1 = foo_f4(); + char *t2 = __builtin_strstr (t1, "hello"); + _Bool t3 = t2 == t1; + return t3; +} + +__attribute__((no_icf)) +void f5(char *s) +{ + char *t1 = __builtin_strstr (s, "hello"); + void foo_f5(void); + if (t1 != s) +foo_f5(); +} + +/* Do not perform transform, since strlen (t) + is unknown. */ + +__attribute__((no_
Re: [RFC][PATCH] Sanopt for use-after-scope ASAN_MARK internal functions
On 12/13/2016 10:17 AM, Jakub Jelinek wrote: > On Tue, Dec 13, 2016 at 09:56:00AM +0100, Martin Liška wrote: >> @@ -671,18 +678,203 @@ public: >> >> }; // class pass_sanopt >> > > Please add a short function comment here... > >> +static void >> +sanitize_asan_mark_unpoison (void) >> +{ >> + /* 1) Find all BBs that contain an ASAN_MARK poison call. */ >> + auto_sbitmap with_poison (last_basic_block_for_fn (cfun) + 1); >> + bitmap_clear (with_poison); >> + basic_block bb; > > ... and here. Done. > >> +static void >> +sanitize_asan_mark_poison (void) >> +{ >> + /* 1) Find all BBs that possibly contain an ASAN_CHECK. */ >> + auto_sbitmap with_check (last_basic_block_for_fn (cfun) + 1); >> + bitmap_clear (with_check); >> + basic_block bb; > >> + FOR_EACH_BB_FN (bb, cfun) >> +{ >> + if (bitmap_bit_p (with_check, bb->index)) >> +continue; > > When would this happen? It seems you only set it for the current bb, > it doesn't seem that you'd try to process the same bb multiple times. That's not needed, probably an artifact from very first version. > >> + gimple_stmt_iterator gsi; >> + bool finish = false; >> + for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi)) >> +{ >> + gimple *stmt = gsi_stmt (gsi); >> + if (maybe_contains_asan_check (stmt)) >> +{ >> + bitmap_set_bit (with_check, bb->index); >> + finish = true; > > Why the finish var and separate if (finish) break; ? Perhaps from the first > iteration when you used a switch? Just doing break; should be enough. Well, I verified that even the first iteration (of patch) needed if(finish) break; as I need to jump to next iteration of FOR_EACH_BB_FN (bb, cfun). > >> + else if (asan_mark_p (stmt, ASAN_MARK_POISON)) >> +{ >> + if (dump_file) >> +fprintf (dump_file, "Removing ASAN_MARK poison\n"); >> + unlink_stmt_vdef (stmt); >> + gsi_remove (&gsi, true); > > Don't you need also release_defs (stmt); here (and in the other gsi_remove > spot)? As I remove only internal function calls that do not define a SSA name then not: ASAN_MARK (UNPOISON, &my_char, 1); ASAN_MARK (POISON, &my_char, 1); Ready to install? M. > > Jakub > >From 9335008e783155e131ca1dec1ef468f371cfcd1e Mon Sep 17 00:00:00 2001 From: marxin Date: Mon, 12 Dec 2016 15:22:05 +0100 Subject: [PATCH] Add sanopt for ASAN_MARK poison and unpoison. gcc/ChangeLog: 2016-12-12 Martin Liska * sanopt.c (sanopt_optimize_walker): Set contains_asan_mark. (sanopt_optimize): Add new argument. (sanitize_asan_mark_unpoison): New function. (maybe_contains_asan_check): Likewise. (sanitize_asan_mark_poison): Likewise. (pass_sanopt::execute): Call the new functions. --- gcc/sanopt.c | 210 ++- 1 file changed, 207 insertions(+), 3 deletions(-) diff --git a/gcc/sanopt.c b/gcc/sanopt.c index 320e14e9421..3aebb1621c9 100644 --- a/gcc/sanopt.c +++ b/gcc/sanopt.c @@ -160,8 +160,10 @@ struct sanopt_ctx /* Number of IFN_ASAN_CHECK statements. */ int asan_num_accesses; -}; + /* True when the current functions constains an ASAN_MARK. */ + bool contains_asan_mark; +}; /* Return true if there might be any call to free/munmap operation on any path in between DOM (which should be imm(BB)) and BB. */ @@ -582,6 +584,9 @@ sanopt_optimize_walker (basic_block bb, struct sanopt_ctx *ctx) if (!remove) ctx->asan_num_accesses++; break; + case IFN_ASAN_MARK: + ctx->contains_asan_mark = true; + break; default: break; } @@ -620,10 +625,11 @@ sanopt_optimize_walker (basic_block bb, struct sanopt_ctx *ctx) /* Try to remove redundant sanitizer checks in function FUN. */ static int -sanopt_optimize (function *fun) +sanopt_optimize (function *fun, bool *contains_asan_mark) { struct sanopt_ctx ctx; ctx.asan_num_accesses = 0; + ctx.contains_asan_mark = false; /* Set up block info for each basic block. */ alloc_aux_for_blocks (sizeof (sanopt_info)); @@ -638,6 +644,7 @@ sanopt_optimize (function *fun) free_aux_for_blocks (); + *contains_asan_mark = ctx.contains_asan_mark; return ctx.asan_num_accesses; } @@ -671,18 +678,207 @@ public: }; // class pass_sanopt +/* Sanitize all ASAN_MARK unpoison calls that are not reachable by a BB + that contains an ASAN_MARK poison. All these ASAN_MARK unpoison call + can be removed as all variables are unpoisoned in a function prologue. */ + +static void +sanitize_asan_mark_unpoison (void) +{ + /* 1) Find all BBs that contain an ASAN_MARK poison call. */ + auto_sbitmap with_poison (last_basic_block_for_fn (cfun) + 1); + bitmap_clear (with_poison); + basic_block bb; + + FOR_EACH_BB_FN (bb, cfun) +{ + if (bitmap_bit_p (with_poison, bb->index)) + continue; + + gimple_stmt_iterator gsi; + for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi))
Re: [PATCH] Add pretty printer for ASAN_MARK and add a helper fn
On 12/13/2016 10:05 AM, Jakub Jelinek wrote: > Ok. But the builtins should be renamed too (incrementally), > BUILT_IN_ASAN_CLOBBER_N, "__asan_poison_stack_memory", > should really be BUILT_IN_ASAN_POISON_STACK_MEMORY etc. > > Jakub This is follow-up that I've just tested and reg-bootstrapped. Ready for trunk? Thanks, Martin >From 8fd18d8fde8f2e3a10812698c37d601c2a52aee7 Mon Sep 17 00:00:00 2001 From: marxin Date: Tue, 13 Dec 2016 10:19:48 +0100 Subject: [PATCH 1/2] Rename BUILT_IN_ASAN_CLOBBER_N to BUILT_IN_ASAN_POISON_STACK_MEMORY. gcc/ChangeLog: 2016-12-13 Martin Liska * asan.c (asan_expand_mark_ifn): Use renamed BUILT_IN_ASAN_{UN}CLOBBER_N to BUILT_IN_ASAN_{UN}POISON_STACK_MEMORY. * sanitizer.def: Likewise. --- gcc/asan.c| 5 +++-- gcc/sanitizer.def | 6 -- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/gcc/asan.c b/gcc/asan.c index e297784270d..53acff0a2fb 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -2838,8 +2838,9 @@ asan_expand_mark_ifn (gimple_stmt_iterator *iter) gsi_insert_before (iter, g, GSI_SAME_STMT); tree sz_arg = gimple_assign_lhs (g); - tree fun = builtin_decl_implicit (is_poison ? BUILT_IN_ASAN_CLOBBER_N - : BUILT_IN_ASAN_UNCLOBBER_N); + tree fun + = builtin_decl_implicit (is_poison ? BUILT_IN_ASAN_POISON_STACK_MEMORY + : BUILT_IN_ASAN_UNPOISON_STACK_MEMORY); g = gimple_build_call (fun, 2, base_addr, sz_arg); gimple_set_location (g, loc); gsi_insert_after (iter, g, GSI_NEW_STMT); diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def index 3db08a7b702..43e46f90e9c 100644 --- a/gcc/sanitizer.def +++ b/gcc/sanitizer.def @@ -165,9 +165,11 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT, DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_AFTER_DYNAMIC_INIT, "__asan_after_dynamic_init", BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST) -DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_CLOBBER_N, "__asan_poison_stack_memory", +DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_POISON_STACK_MEMORY, + "__asan_poison_stack_memory", BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST) -DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_UNCLOBBER_N, "__asan_unpoison_stack_memory", +DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_UNPOISON_STACK_MEMORY, + "__asan_unpoison_stack_memory", BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST) /* Thread Sanitizer */ -- 2.11.0
Re: Fold strstr (s, t) eq/ne s to strcmp (s, t) eq/ne 0 if strlen (t) is known
On Tue, Dec 13, 2016 at 05:41:09PM +0530, Prathamesh Kulkarni wrote: > --- a/gcc/tree-ssa-strlen.c > +++ b/gcc/tree-ssa-strlen.c > @@ -,6 +,90 @@ handle_char_store (gimple_stmt_iterator *gsi) >return true; > } > > +/* Try to fold strstr (s, t) eq/ne s to memcmp (s, t, strlen (t)) eq/ne 0. > */ > + > +static void > +fold_strstr_to_memcmp (enum tree_code code, tree rhs1, tree rhs2, gimple > *stmt) You can drop code argument here, see below. And I'd say it is better to do the if (TREE_CODE (rhs1) != SSA_NAME || TREE_CODE (rhs2) != SSA_NAME) return; here than repeat it in all the callers. > + if (gimple_assign_rhs_code (stmt) == COND_EXPR) > + { > + tree cond = gimple_assign_rhs1 (stmt); > + TREE_SET_CODE (cond, code); TREE_CODE (cond) is already code, so no need to set it again. > + gcond *cond = as_a (stmt); > + gimple_cond_set_lhs (cond, memcmp_lhs); > + gimple_cond_set_rhs (cond, zero); > + gimple_cond_set_code (cond, code); And gimple_cond_code (cond) == code here too. > + update_stmt (cond); > + } You can perhaps move the update_stmt (stmt); to a single spot after all the 3 cases are handled. > + if (cond_code == EQ_EXPR || cond_code == NE_EXPR) > + { > + tree rhs1 = TREE_OPERAND (cond, 0); > + tree rhs2 = TREE_OPERAND (cond, 1); While it is necessary to check cond_code here and in the other spots similarly, because otherwise you don't know if it has 2 arguments etc., you can avoid the SSA_NAME tests here. > + if (TREE_CODE (rhs1) == SSA_NAME > + && TREE_CODE (rhs2) == SSA_NAME) > + fold_strstr_to_memcmp (cond_code, rhs1, rhs2, stmt); > + } > + } > + else if (code == EQ_EXPR || code == NE_EXPR) > + { > + tree rhs1 = gimple_assign_rhs1 (stmt); > + tree rhs2 = gimple_assign_rhs2 (stmt); > + > + if (TREE_CODE (rhs1) == SSA_NAME > + && TREE_CODE (rhs2) == SSA_NAME) And here. > + fold_strstr_to_memcmp (code, rhs1, rhs2, stmt); > + } > + } > +else if (TREE_CODE (lhs) != SSA_NAME && !TREE_SIDE_EFFECTS (lhs)) > { > tree type = TREE_TYPE (lhs); > if (TREE_CODE (type) == ARRAY_TYPE) > @@ -2316,6 +2427,17 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi) > } > } > } > + else if (gcond *cond = dyn_cast (stmt)) > +{ > + enum tree_code code = gimple_cond_code (cond); > + tree lhs = gimple_cond_lhs (stmt); > + tree rhs = gimple_cond_rhs (stmt); > + > + if ((code == EQ_EXPR || code == NE_EXPR) > + && TREE_CODE (lhs) == SSA_NAME > + && TREE_CODE (rhs) == SSA_NAME) And here. > + fold_strstr_to_memcmp (code, lhs, rhs, stmt); > +} > >if (gimple_vdef (stmt)) > maybe_invalidate (stmt); Otherwise LGTM, but it would be nice to cover also the COND_EXPR case by a testcase (can be done incrementally). Jakub
Re: [RFC][PATCH] Sanopt for use-after-scope ASAN_MARK internal functions
On Tue, Dec 13, 2016 at 01:12:34PM +0100, Martin Liška wrote: > >> + gimple_stmt_iterator gsi; > >> + bool finish = false; > >> + for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi)) > >> + { > >> +gimple *stmt = gsi_stmt (gsi); > >> +if (maybe_contains_asan_check (stmt)) > >> + { > >> +bitmap_set_bit (with_check, bb->index); > >> +finish = true; > > > > Why the finish var and separate if (finish) break; ? Perhaps from the first > > iteration when you used a switch? Just doing break; should be enough. > > Well, I verified that even the first iteration (of patch) needed if(finish) > break; > as I need to jump to next iteration of FOR_EACH_BB_FN (bb, cfun). I fail to see functional difference between for (...) { ... bool finish = false; for (...) { ... if (...) { ... finish = true; } if (finish) break; } } and for (...) { ... for (...) { ... if (...) { ... break; } } } just the latter is not obfuscated. The break is in both cases in the same for loop. > > Don't you need also release_defs (stmt); here (and in the other gsi_remove > > spot)? > > As I remove only internal function calls that do not define a SSA name then > not: > ASAN_MARK (UNPOISON, &my_char, 1); > ASAN_MARK (POISON, &my_char, 1); If they have a vdef, then I believe they do define a SSA name (the vdef). I think unlink_stmt_vdef does not release the vdef SSA_NAME if any. Jakub
Re: [PATCH] Add pretty printer for ASAN_MARK and add a helper fn
On Tue, Dec 13, 2016 at 01:14:00PM +0100, Martin Liška wrote: > On 12/13/2016 10:05 AM, Jakub Jelinek wrote: > > Ok. But the builtins should be renamed too (incrementally), > > BUILT_IN_ASAN_CLOBBER_N, "__asan_poison_stack_memory", > > should really be BUILT_IN_ASAN_POISON_STACK_MEMORY etc. > > > > Jakub > > This is follow-up that I've just tested and reg-bootstrapped. > > Ready for trunk? > Thanks, > Martin > >From 8fd18d8fde8f2e3a10812698c37d601c2a52aee7 Mon Sep 17 00:00:00 2001 > From: marxin > Date: Tue, 13 Dec 2016 10:19:48 +0100 > Subject: [PATCH 1/2] Rename BUILT_IN_ASAN_CLOBBER_N to > BUILT_IN_ASAN_POISON_STACK_MEMORY. > > gcc/ChangeLog: > > 2016-12-13 Martin Liska > > * asan.c (asan_expand_mark_ifn): Use renamed > BUILT_IN_ASAN_{UN}CLOBBER_N to BUILT_IN_ASAN_{UN}POISON_STACK_MEMORY. > * sanitizer.def: Likewise. Ok, thanks. Jakub
Re: [PATCH] omp-low.c split
Hi, On Tue, Dec 13, 2016 at 12:43:16PM +0100, Jakub Jelinek wrote: > On Tue, Dec 13, 2016 at 12:39:01PM +0100, Thomas Schwinge wrote: > > On Fri, 9 Dec 2016 14:08:21 +0100, Martin Jambor wrote: > > > this is the promised attempt at splitting omp-low.c [...] > > > > Yay! \o/ > > > > I have not yet had a chance to review/test this patch, but I plan to. > > > > A few initial comments from the "bike shed departement"; I understand in > > GCC sources it will not be easy to rename stuff (such as files) later, so > > we should get the names agreed upon early: > > > > Generally, I agree with your division of "omp-low.c" parts. > > > > > - move everything that is part of pass_oacc_device_lower, > > > pass_omp_device_lower and pass_omp_target_link to a new file > > > omp-device.h, > > > > Should we call this file "omp-offload.c", as offloading is what this > > deals with, is the term we agreed to generally use (as far as I can > > tell)? > > That would be fine with me too. OK, will do. > > > > - move all pre-lowering gridification stuff to a new file > > > omp-grid.c. [...] > > > > Is that code generic enough to not call this file "omp-hsa.c" or similar? > Not at the moment, but... > And this as well. But omp-grid.c is fine too. ...I prefer omp-grid.c because I plan to use gridification also for GCN targets, though hopefully only as an optimization rather than a hard requirement ...and in fact I still think it is a good optimization of simple loops for execution on all CUDA-like environments with block/thread grids because it removes conditions which the run-time can handle better. > > > > - I moved stuff that was used from all over the place to a new file > > > omp-general.c (unless it would mean exposing omp_region or > > > omp_context types). > > > > I'd have called that simply "omp.c". > > The problem with that is that the corresponding header can't be called > omp.h for obvious reasons, we already have one with very different meaning. > That is exactly the reason why I chose omp-general. Thanks, Martin
Re: [PATCH] omp-low.c split
On Tue, 13 Dec 2016, Martin Jambor wrote: > I have bootstrapped the two patches on aarch64-linux and bootstrapped > and tested them on x86_64-linux. What do you think? Sorry for my 'false alarm' about cp/parser.c conflict in the previous mail -- I thought I was applying your patch to trunk, but now I see my tree was outdated. In the new patch there are 7 whitespace issues that git complains about: zcat 0001-Split-omp-low-into-multiple-files.patch.gz | git apply --check --whitespace=error-all - :2779: space before tab in indent. # BLOCK 2 (PAR_ENTRY_BB) :5391: space before tab in indent. true, GSI_SAME_STMT); :8174: space before tab in indent. after a stmt, not before. */ :9105: space before tab in indent. GOMP_atomic_start (); :9106: space before tab in indent. *addr = rhs; :9107: space before tab in indent. GOMP_atomic_end (); :10327: space before tab in indent. region. */ error: 7 lines add whitespace errors. A couple of typos in the Changelog: > (is_combined_parallel): kMoved to omp-expand.c. s/k// > * config/nvptx/nvptx.c: Include omp-generic.c. omp-general.h :) Alexander
RE: [PATCH, testsuite] MIPS: Skip msa-builtins-err.c if there is no assembly output.
Toma Tabacu writes: > > > > It's a shame this is the only way to deal with this but I see aarch64 > > have to resort to the same thing for error checking tests. > > > > I have looked into this some more and I 've found that the solution I > proposed is incomplete. > > The problem is that if no linker plugin is found, -fno-fat-lto-objects > is not passed on, so the test isn't skipped and it will fail because - > flto doesn't do assembly generation by default and the errors will not > be triggered. > > This can be fixed by adding -ffat-lto-objects as a test option for error > tests, as shown in the patch below. > > The thing is, this already happens for scan-assembler & friends behind- > the-scenes, but not for dg-error, because the former are run through > force_conventional_output, while the latter is not. > > A nicer solution would be to have a new directive which would do nothing > except for the force_conventional_output part, thus forcing assembly > generation, but this may be overkill. > > Regards, > Toma > > gcc/testsuite/ChangeLog: > > * gcc.target/mips/msa-builtins-err.c (dg-options): Use >-ffat-lto-objects to guarantee assembly generation. OK, thanks for investigating. Slight tweak to the changelog we just say what changed in the changelog not why. You covered 'why' in the code comment. This is just: (dg-options): Add -ffat-lto-objects option. Thanks, Matthew
RFC: Allow emergency EH pool size to be controlled (for stage 1)
This patch allows the size of the emergency buffer for exception handling to be controlled by a build-time macro (to avoid dynamic allocation) or by a run-time environment variable (to allocate a larger or smaller buffer). This will have to wait for the next stage 1 now, as it's too late for GCC 7, but this shows what I'm thinking about and so might avoid anybody else reinventing the wheel. The patch doesn't include documentation updates for the manual, which would be needed to explain the use of the macro and the env var. We could also add a --with-libstdcxx-static-eh-pool=N configure flag to make it easier to set the macro. commit 4e021613dcda14dfec5c88286f8faf3001afcd17 Author: Jonathan Wakely Date: Tue Dec 13 13:23:48 2016 + Allow pool size to be set by macro or env var diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac index 7d0fafa..be4679e 100644 --- a/libstdc++-v3/configure.ac +++ b/libstdc++-v3/configure.ac @@ -257,6 +257,7 @@ if $GLIBCXX_IS_NATIVE; then AC_CHECK_FUNCS(__cxa_thread_atexit_impl) AC_CHECK_FUNCS(aligned_alloc posix_memalign memalign _aligned_malloc) + AC_CHECK_FUNCS(secure_getenv) # For iconv support. AM_ICONV diff --git a/libstdc++-v3/crossconfig.m4 b/libstdc++-v3/crossconfig.m4 index 8cc788c..c01213a 100644 --- a/libstdc++-v3/crossconfig.m4 +++ b/libstdc++-v3/crossconfig.m4 @@ -184,6 +184,7 @@ case "${host}" in GCC_CHECK_TLS AC_CHECK_FUNCS(__cxa_thread_atexit_impl) AC_CHECK_FUNCS(aligned_alloc posix_memalign memalign _aligned_malloc) +AC_CHECK_FUNCS(secure_getenv) AM_ICONV ;; *-mingw32*) diff --git a/libstdc++-v3/libsupc++/eh_alloc.cc b/libstdc++-v3/libsupc++/eh_alloc.cc index d362e40..571fe6b 100644 --- a/libstdc++-v3/libsupc++/eh_alloc.cc +++ b/libstdc++-v3/libsupc++/eh_alloc.cc @@ -29,6 +29,7 @@ #include #if _GLIBCXX_HOSTED #include +#include #endif #include #include @@ -36,6 +37,14 @@ #include #include +#ifdef STATIC_EH_ALLOC_POOL_BYTES +char* alloc_pool_arena(std::size_t& size) +{ + size = STATIC_EH_ALLOC_POOL_BYTES; + static alignas(void*) char arena[STATIC_EH_ALLOC_POOL_BYTES]; + return arena; +} +#else #if _GLIBCXX_HOSTED using std::free; using std::malloc; @@ -50,8 +59,6 @@ extern "C" void *memset (void *, int, std::size_t); using namespace __cxxabiv1; -// ??? How to control these parameters. - // Guess from the size of basic types how large a buffer is reasonable. // Note that the basic c++ exception header has 13 pointers and 2 ints, // so on a system with PSImode pointers we're talking about 56 bytes @@ -73,6 +80,32 @@ using namespace __cxxabiv1; # define EMERGENCY_OBJ_COUNT 4 #endif +char *alloc_pool_arena(std::size_t& size) +{ + size = 0; +#if _GLIBCXX_HOSTED + const char *name = "GLIBCXX_EH_ARENA_SIZE"; + char *env = nullptr; +# ifdef _GLIBCXX_HAVE_SECURE_GETENV + env = ::secure_getenv (name); +# else + env = std::getenv (name); +# endif + if (env) +{ + char *end; + size = strtoul(env, &end, 0); + if (*end || (size == ULONG_MAX && errno == ERANGE)) + size = 0; +} +#endif // _GLIBCXX_HOSTED + if (size == 0) +size = (EMERGENCY_OBJ_SIZE * EMERGENCY_OBJ_COUNT + + EMERGENCY_OBJ_COUNT * sizeof (__cxa_dependent_exception)); + return (char *)malloc (size); +} +#endif // STATIC_EH_ALLOC_POOL_BYTES + namespace __gnu_cxx { void __freeres(); @@ -107,7 +140,7 @@ namespace // The free-list free_entry *first_free_entry; // The arena itself - we need to keep track of these only - // to implement in_pool. + // to implement in_pool and __freeres. char *arena; std::size_t arena_size; @@ -116,11 +149,8 @@ namespace pool::pool() { - // Allocate the arena - we could add a GLIBCXX_EH_ARENA_SIZE environment - // to make this tunable. - arena_size = (EMERGENCY_OBJ_SIZE * EMERGENCY_OBJ_COUNT - + EMERGENCY_OBJ_COUNT * sizeof (__cxa_dependent_exception)); - arena = (char *)malloc (arena_size); + // Allocate the arena. + arena = alloc_pool_arena (arena_size); if (!arena) { // If the allocation failed go without an emergency pool. @@ -255,11 +285,13 @@ namespace __gnu_cxx void __freeres() { +#ifndef STATIC_EH_ALLOC_POOL_BYTES if (emergency_pool.arena) { ::free(emergency_pool.arena); emergency_pool.arena = 0; } +#endif } }
Re: [PATCH, fortran, pr77785, v3] [Coarray] ICE in gfc_get_caf_token_offset, at fortran/trans-expr.c:1990
Hi Janus, hi all, thanks for the input on the missing testcase, Janus (btw, when you know where to get a new crystal ball, let me know; I am missing mine, too). The new version of the patch adds a new testcase coarray_41.f90 to test that the compiler compiles correctly and the test runs ok. Bootstrapped and regtested on x86_64-linux/f23. Ok for trunk? Regards, Andre On Tue, 13 Dec 2016 12:11:50 +0100 Janus Weil wrote: > Hi Andre, > > > all the sanitizer issues I fixed occur during compiling the testsuite. So I > > would say, that when with the patch these errors do not occur anymore while > > processing the testsuite, then those are tested for, right? > > aah, so you're saying that hunk is not actually related to the PR in > the subject line, but instead fixes a testsuite failure seen with a > sanitized compiler? That wasn't mentioned anywhere and sadly I forgot > to bring my crystal ball ... > > Cheers, > Janus > > > > > On Mon, 12 Dec 2016 13:37:43 +0100 > > Janus Weil wrote: > > > >> Hi Andre, > >> > >> > the attached patch corrects reporting of "Sorry, unimplemented yet" for > >> > allocatable and pointer components in polymorphic objects (BT_CLASS) thus > >> > fixing two ICEs reported in the PR. > >> > > >> > The next chunk fixes an ICE when the declaration containing the token > >> > information is of type POINTER or REFERENCE. > >> > > >> > Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk? > >> > >> the resolve.c hunk is certainly ok. The trans-array.c part looks > >> reasonable as well, but I wonder if it is actually covered by any of > >> your test cases? Since they are all compile-only, with errors being > >> thrown at resolution stage, do they even get to the translation stage? > >> > >> Cheers, > >> Janus > > > > > > -- > > Andre Vehreschild * Email: vehre ad gmx dot de -- Andre Vehreschild * Email: vehre ad gmx dot de gcc/fortran/ChangeLog: 2016-12-13 Andre Vehreschild PR fortran/77785 * resolve.c (resolve_symbol): Correct attr lookup to the _data component. * trans-array.c (gfc_alloc_allocatable_for_assignment): Indirect ref pointers and references before retrieving the caf-token. gcc/testsuite/ChangeLog: 2016-12-13 Andre Vehreschild PR fortran/77785 * gfortran.dg/coarray_38.f90: Added expecting error message. * gfortran.dg/coarray_41.f90: New test. * gfortran.dg/coarray_class_2.f90: New test. diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c index c7d872c..b610797 100644 --- a/gcc/fortran/resolve.c +++ b/gcc/fortran/resolve.c @@ -14044,8 +14044,8 @@ resolve_symbol (gfc_symbol *sym) if (flag_coarray == GFC_FCOARRAY_LIB && sym->ts.type == BT_CLASS && sym->ts.u.derived && CLASS_DATA (sym) && CLASS_DATA (sym)->attr.codimension - && (sym->ts.u.derived->attr.alloc_comp - || sym->ts.u.derived->attr.pointer_comp)) + && (CLASS_DATA (sym)->ts.u.derived->attr.alloc_comp + || CLASS_DATA (sym)->ts.u.derived->attr.pointer_comp)) { gfc_error ("Sorry, allocatable/pointer components in polymorphic (CLASS) " "type coarrays at %L are unsupported", &sym->declared_at); diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c index 8753cbf..0cd83f4 100644 --- a/gcc/fortran/trans-array.c +++ b/gcc/fortran/trans-array.c @@ -9337,6 +9337,8 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop, if (token == NULL_TREE) { tmp = gfc_get_tree_for_caf_expr (expr1); + if (POINTER_TYPE_P (TREE_TYPE (tmp))) + tmp = build_fold_indirect_ref (tmp); gfc_get_caf_token_offset (&caf_se, &token, NULL, tmp, NULL_TREE, expr1); token = gfc_build_addr_expr (NULL_TREE, token); diff --git a/gcc/testsuite/gfortran.dg/coarray_38.f90 b/gcc/testsuite/gfortran.dg/coarray_38.f90 index c8011d4..04ef742 100644 --- a/gcc/testsuite/gfortran.dg/coarray_38.f90 +++ b/gcc/testsuite/gfortran.dg/coarray_38.f90 @@ -92,7 +92,7 @@ end type t type t2 class(t), allocatable :: caf2[:] end type t2 -class(t), save, allocatable :: caf[:] +class(t), save, allocatable :: caf[:] ! { dg-error "Sorry, allocatable/pointer components in polymorphic" } type(t) :: x type(t2) :: y diff --git a/gcc/testsuite/gfortran.dg/coarray_41.f90 b/gcc/testsuite/gfortran.dg/coarray_41.f90 new file mode 100644 index 000..b62d8e4 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/coarray_41.f90 @@ -0,0 +1,29 @@ +! { dg-do run } +! { dg-options "-fcoarray=lib -lcaf_single" } + +program coarray_41 + + integer, allocatable :: vec(:)[:,:] + + allocate(vec(10)[2,*], source= 37) + + if (.not. allocated(vec)) error stop + + call foo(vec) + + if (any(vec /= 42)) error stop + + deallocate(vec) +contains + + subroutine foo(gv) + +integer, allocatable, intent(inout) :: gv(:)[:,:] +integer, allocatable :: gvin(:) + +allocate(gvin, mold=gv) +gvin = 5 +gv = gv + gvin + end subroutine foo + +end program coarray_41 diff --git a/gc
Re: [PATCH] Tree-level fix for PR 69526
On Wed, Dec 7, 2016 at 5:14 PM, Robin Dapp wrote: >> So we have (uint64_t)(uint32 + -1U) + 1 and using TYPE_SIGN (inner_type) >> produces (uint64_t)uint32 + -1U + 1. This simply means that we cannot ignore >> overflow of the inner operation and for some reason your change >> to extract_range_from_binary_expr didn't catch this. That is _8 + 4294967295 >> overflows but we ignored that. > > In this case the range of _8 was [1,1] so no overflow. > I think the problem is rather about the interpretation of the inner > constant. I tried discerning two cases now, a range split (i.e. when the > single range of a binop variable becomes an anti range or two ranges > after combining it with the binop's other range) and an overflow of the > range's min and max. > If the range didn't split, we can perform the simplification. If there > was a min and max overflow, we have to interpret the inner constant as > signed and perform a sign extension before converting it to the outer > type. Without overflow we can use TYPE_SIGN (inner_type). > Does this make sense? I'm not sure there is anything to "interpret" -- the operation is unsigned and overflow is when the operation may wrap around zero. There might be clever ways of re-writing the expression to (uint64_t)((uint32_t)((int32_t)uint32 + -1) + 1) avoiding the overflow and thus allowing the transform but I'm not sure that's good. A related thing would be canonicalizing unsigned X plus CST to unsigned X minus CST' if CST' has a smaller absolute value than CST. I think currently we simply canonicalize to 'plus CST' but also only in fold-const.c, not in match.pd (ah we do, but only in a simplified manner). That said, can we leave that "trick" out of the patch? I think your more complicated "overflows" result from extract_range_from_binary_expr_1 doesn't apply to all ops (like MULT_EXPR where more complicated cases can arise). Richard. > Included the remarks and attached the new version. > > Regards > Robin
Re: [RFC][PATCH] Speed-up use-after-scope (re-writing to SSA)
On Thu, Dec 8, 2016 at 1:51 PM, Martin Liška wrote: > On 12/02/2016 01:29 PM, Richard Biener wrote: >> On Thu, Dec 1, 2016 at 5:30 PM, Martin Liška wrote: >>> On 11/23/2016 03:13 PM, Jakub Jelinek wrote: On Wed, Nov 23, 2016 at 02:57:07PM +0100, Martin Liška wrote: > I started review process in libsanitizer: https://reviews.llvm.org/D26965 > And I have a question that was asked in the review: can we distinguish > between load and store > in case of having usage of ASAN_POISON? I think with ASAN_POISON it is indeed just loads from after scope that can be caught, a store overwrites the variable with a new value and when turning the store after we make the var no longer addressable into SSA form, we loose information about the out of scope store. Furthermore, if there is first a store and then a read, like: if (argc != 12312) { char my_char; ptr = &my_char; } *ptr = i + 26; return *ptr; we don't notice even the read. Not sure what could be done against that though. I think we'd need to hook into the into-ssa framework, there it should know the current value of the variable at the point of the store is result of ASAN_POISON and be able to instead of turning that my_char = _23; into my_char_35 = _23; turn it into: my_char_35 = ASAN_POISON (_23); which would represent after scope store into my_char. Not really familiar with into-ssa though to know where to do it. Jakub >>> >>> Richi, may I ask you for help with this question? >> >> Probably where we handle the CLOBBER case (rewrite_stmt, maybe_register_def), >> we do this for -Wuninitialized. >> >> Richard. > > Thanks for the tip, however as the optimization of memory address store + > load happens > before we rewrite my_char into SSA, it would be probably hard to guess which > memory > stores and loads should be preserved: > > use-after-scope-20.c.032t.ccp1: > main (int argc, char * * argv) > { > int my_char; > int * ptr; > int _1; > int _11; > >[0.0%]: > if (argc_4(D) != 12312) > goto ; [0.0%] > else > goto ; [0.0%] > >[0.0%]: > ASAN_MARK (2, &my_char, 4); > ptr_8 = &my_char; > ASAN_MARK (1, &my_char, 4); > >[0.0%]: > # ptr_2 = PHI > _1 = argc_4(D) + 26; > *ptr_2 = _1; > _11 = *ptr_2; > return _11; > > } The SSA renamer sees my_char = ASAN_MARK; ptr_8 = &my_char; my_char = ASAN_MARK; ? It does perform a DOM walk when updating the stmts so simply registering the appropriate current def should do the trick? > I sent updated version of patch to LLVM phabricator: > https://reviews.llvm.org/D26965 > > Hopefully we can cherry pick the patch very soon to our trunk. > > M. > >> >>> Thanks, >>> Martin >>> >
Re: [PR78365] ICE in determine_value_range, at tree-ssa-loo p-niter.c:413
On Mon, Dec 12, 2016 at 8:57 AM, kugan wrote: > Hi Richard, > >>> I am fine with the new patch but you'll need an approval from Honza >>> or Richi. >>> >>> I find it a bit saddening that we cannot really rely on >>> gimple_call_fntype but at least I do not see any other way. >> >> >> The patch looks somewhat backward. It populates the param type from >> the callee but the only thing we really know is the _originating_ type >> from the callers DECL_ARGUMENTS (if the JF is based on a parameter > > > I am not sure I understood this. I think we get param_type from callees > DECL_ARGUMENTS. > >> which is the only case where we do not know the type of the actual >> value statically -- we only know the type we expect). >> >> So the type would be present only on IPA_JF_PASS_THROUGH JFs >> (? does that also cover modified params?). >> >> At propagation time when applying the jump-function we have to make >> sure to first convert the actual parameter value to that expected type >> (or drop to BOTTOM if we can't do that conversion due to excess >> mismatches). > > > From ipa vrp point of view, that is what is being done with the patch (In > this version, I also changed the POINTER_TYPE_P to use param_type). At the > point we create ipa_compute_jump_functions_for_edge, we are covering the VR > for arguments to param_type. > > In propagate_vr_accross_jump_function also, we are doing the conversion. Hum, I think what your patch does is only half-way correct. I'm looking a bit more into how the IPA propagation works and it looks like it's not the JF where the type needs to be stored but the lattice itself -- we need to know at propagation time what type we expect for the arguments. And if there's a mismatch with the actual type flowing through the JF then we need to convert or punt. And looking closer that type should get known during propagation as at that time we should have the actual parameter types as seen in the function body. And ipa_get_callee_param_type should go away - just use jfunc->param_type = TREE_TYPE (arg); (well, and no need to store that redundant information in the jump function -- it's there for all but pass-through JFs by means of the JF -- and pass-through JFs get their type at propagation time from the source lattice values) So we just need to get propagation correct (and not fiddle strangely with types in JF building). Richard. > Thanks, > Kugan
Re: [PATCH] Fill bitregion_{start,end} in store_constructor (PR, tree-optimization/78428).
On Tue, Dec 13, 2016 at 10:05 AM, Martin Liška wrote: > On 12/12/2016 12:10 PM, Eric Botcazou wrote: >>> Ok. I'm sending a patch that put gcc_unreachable to places where either size >>> or (and) offset is a non-constant. This survives regression tests >>> (including ada) on x86_64-linux-gnu. Apart from that normal bootstrap + >>> regression tests works fine on ppc64le-redhat-linux. >> >> I didn't manage to break it so it is OK by me. >> > > Out of curiosity, I blame the commit which added the > > if (offset) > { > machine_mode address_mode; > rtx offset_rtx; > > offset > = SUBSTITUTE_PLACEHOLDER_IN_EXPR (offset, > make_tree (TREE_TYPE (exp), >target)); > > and it comes to commit done in 1993: > > commit c869557a9ccc1bd3e5474b144bcb84065db23549 > Author: kenner > Date: Mon Oct 4 01:48:03 1993 + > > (store_expr): Use expr_size value, not size_int. > (store_constructor): Handle case of variable position and allow it to > contain > a PLACEHOLDER_EXPR. > (get_inner_reference): Make a WITH_RECORD_EXPR if required. > (expand_expr, case PLACEHOLDER_EXPR, WITH_RECORD_EXPR): New cases. > (expand_expr, case ARRAY_REF): Make WITH_RECORD_EXPR expressions when > needed. Yeah, I expect that we got rid of the cases at the time we introduced GIMPLE (and gimplification). > > git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@5584 > 138bc75d-0d04-0410-961f-82ee72b054a4 > > May I install the patch? OK from my POV. Thanks, Richard. > Martin >
Re: RFC: Allow emergency EH pool size to be controlled (for stage 1)
On Tue, Dec 13, 2016 at 2:50 PM, Jonathan Wakely wrote: > This patch allows the size of the emergency buffer for exception > handling to be controlled by a build-time macro (to avoid dynamic > allocation) or by a run-time environment variable (to allocate a > larger or smaller buffer). > > This will have to wait for the next stage 1 now, as it's too late for > GCC 7, but this shows what I'm thinking about and so might avoid > anybody else reinventing the wheel. > > The patch doesn't include documentation updates for the manual, which > would be needed to explain the use of the macro and the env var. We > could also add a --with-libstdcxx-static-eh-pool=N configure flag to > make it easier to set the macro. Looks reasonable. But I wonder whether we'd want the default arena size to be configurable as well, thus split static-eh-pool=N into default-eh-pool-size=N and -static-eh-pool (without =N). Also maybe N should be the number of exception objects rather than bytes? Otherwise a target independent N is hard to specify for say, a distribution that wants either a different default or a static buffer. Richard. >
Re: [Patch, fortran] PR78737 - [OOP] linking error with deferred, undefined user-defined derived-type I/O
Hi Paul, > We got there! OK for trunk. thanks. Unfortunately there was a problem with my latest patch, so what I now committed as r243609 is basically your fixed version of my draft patch (with some very minor adjustments). Phew, we finally nailed it! (My dtio_13 fix had seemed to work at some point, but in the end apparently it didn't, so I discarded it.) > This was a demonstration of the corollary of the "bon mot" from Barack > Obama at the end of the message :-) :) Maybe we can even "make gfortran great again" (to paraphrase another specimen of that category) ;P > Many thanks for finding the right path. Thanks for your help on the way, and of course thanks to Damian for triggering this whole discussion in the first place. Cheers, Janus > On 13 December 2016 at 10:23, Janus Weil wrote: >> 2016-12-13 10:58 GMT+01:00 Janus Weil : >>> 2016-12-13 10:42 GMT+01:00 Janus Weil : > please find attached a version of your patch that runs all the dtio > testcases successfully. Great, thanks a lot. Your addition of gfc_find_and_cut_at_last_class_ref is just what I was looking for right now ... If you don't mind I'll write a ChangeLog, add the proper tests and commit to trunk. Or do you prefer to take care of it yourself? >>> >>> Btw, to continue the brainstorming, I think I found a slightly better >>> solution for the dtio_13 problem, which even removes the spurious >>> error on that test case via a small fix in resolve_transfer. The >>> attached patch is what I'd like to commit if you're ok with it. >> >> Finally, here is a complete patch, including testcase and ChangeLog >> entries. Ok for trunk? >> >> Cheers, >> Janus >> >> >> 2016-12-13 Janus Weil >> Paul Thomas >> >> PR fortran/78737 >> * gfortran.h (gfc_find_typebound_dtio_proc): New prototype. >> * interface.c (gfc_compare_interfaces): Whitespace fix. >> (gfc_find_typebound_dtio_proc): New function. >> (gfc_find_specific_dtio_proc): Use it. >> * resolve.c (resolve_transfer): Improve error recovery. >> * trans-io.c (get_dtio_proc): Implement polymorphic calls to DTIO >> procedures. >> >> 2016-12-13 Janus Weil >> Paul Thomas >> >> PR fortran/78737 >> * gfortran.dg/dtio_13.f90: Remove spurious error. >> * gfortran.dg/dtio_19.f90: New test case. > > > > -- > If you're walking down the right path and you're willing to keep > walking, eventually you'll make progress. > > Barack Obama
Re: RFC: Allow emergency EH pool size to be controlled (for stage 1)
On 13 December 2016 at 16:42, Richard Biener wrote: > Also maybe N should be the number of exception objects rather > than bytes? Otherwise a target independent N is hard to specify > for say, a distribution that wants either a different default or a > static buffer. But how to specify the number of exception objects that may have different sizes?
Re: [PATCH] omp-low.c split
On 12/13/2016 04:42 AM, Martin Jambor wrote: >> And this as well. But omp-grid.c is fine too. > > ...I prefer omp-grid.c because I plan to use gridification also for > GCN targets, though hopefully only as an optimization rather than a > hard requirement ...and in fact I still think it is a good > optimization of simple loops for execution on all CUDA-like > environments with block/thread grids because it removes conditions > which the run-time can handle better. Regarding gridification, is your cauldron talk from 2015 still current, or have there been some significant changes? When we first started with OpenACC we were using a lot of the existing lower_omp_for* infrastructure handle ACC LOOPs. But there was a couple of problems with that. First, the chunk partitioning caused a lot of overhead, and second because of OpenACC execution model it made more sense to write our own functions (lower_oacc_head_tail / lower_oacc_reductions). In fact, during lowering gcc only marks where the loops are. All of those markers get replaced and the loops get optimized during the oaccdevlow pass which runs on the target compiler. Right now one of the significant bottlenecks we're experiencing on nvptx targets is with I/O. First, prior to launching a PTX kernel, libgomp transfers each data mapping individually in a synchronous manner. I'm debating whether it makes sense to pass in all of those data mappings to the accelerator prior to the PTX kernel launch asynchronously, obviously with an explicit synchronization barrier just prior to launching the kernel. Another bottleneck involves firstprivate variables. Often, those variables are 'scalars' and consequently, they shouldn't need explicit data mappings. I noticed that Jakub introduced a special GOMP_MAP_FIRSTPRIVATE_INT, which omits data mappings for integral types with less than or equal precision to pointers. It would probably be beneficial to expand this to reals. The last observation is that OpenMP code in general passes a struct with all of the data mappings to the various OMP regions/offloaded code. That's fine, but for offloading targets, particularly nvptx, it would probably be slightly more efficient if those OMP regions took actual function arguments instead of a single struct. At least on nvptx targets, in order to pass that struct to the accelerator, the runtime must first allocate device memory for it, then copy all of the struct contents to the device each time prior to launching a PTX kernel. A lot of this could be bypassed because cuLaunchKernel accepts a variable number of kernel arguments. Obviously, those arguments need to be transferred to the accelerator one way or another, so I'm not sure yet how beneficial this optimization would end up being. To be clear, I'm not proposing any of these changes for gcc7. Any changes to the above will go to gomp-4_0-branch first, then we'll port them over to gcc8. What type of performance problems are you experiencing with HSA? Cesar
Re: RFC: Allow emergency EH pool size to be controlled (for stage 1)
On Tue, Dec 13, 2016 at 3:45 PM, Ville Voutilainen wrote: > On 13 December 2016 at 16:42, Richard Biener > wrote: >> Also maybe N should be the number of exception objects rather >> than bytes? Otherwise a target independent N is hard to specify >> for say, a distribution that wants either a different default or a >> static buffer. > > > But how to specify the number of exception objects that may have > different sizes? We have EMERGENCY_OBJ_SIZE for this (yeah, some guesstimate) which can be easily target configured in libstdc++ itself. We possibly should split it into a sizeof () of the exception header plus some payload guesstimate. Richard.
[PATCH] Fix PR78788
The following patch fixes PR78788. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2016-12-13 Richard Biener PR tree-optimization/78788 * tree-vrp.c (set_value_range): Allow [-INF(OVF), +INF(OVF)]. (set_and_canonicalize_value_range): Do not drop the above to VARYING. * gcc.dg/torture/pr78788.c: New testcase. Index: gcc/tree-vrp.c === --- gcc/tree-vrp.c (revision 243599) +++ gcc/tree-vrp.c (working copy) @@ -365,10 +365,6 @@ set_value_range (value_range *vr, enum v cmp = compare_values (min, max); gcc_assert (cmp == 0 || cmp == -1 || cmp == -2); - - if (needs_overflow_infinity (TREE_TYPE (min))) - gcc_assert (!is_overflow_infinity (min) - || !is_overflow_infinity (max)); } if (flag_checking @@ -506,14 +502,9 @@ set_and_canonicalize_value_range (value_ } } - /* Drop [-INF(OVF), +INF(OVF)] to varying. */ - if (needs_overflow_infinity (TREE_TYPE (min)) - && is_overflow_infinity (min) - && is_overflow_infinity (max)) -{ - set_value_range_to_varying (vr); - return; -} + /* Do not drop [-INF(OVF), +INF(OVF)] to varying. (OVF) has to be sticky + to make sure VRP iteration terminates, otherwise we can get into + oscillations. */ set_value_range (vr, t, min, max, equiv); } Index: gcc/testsuite/gcc.dg/torture/pr78788.c === --- gcc/testsuite/gcc.dg/torture/pr78788.c (revision 0) +++ gcc/testsuite/gcc.dg/torture/pr78788.c (working copy) @@ -0,0 +1,17 @@ +/* { dg-do compile } */ + +int a; +long b; +long c; +void d() +{ + int e = 0; + for (; b; b++) +if (c) + { + e++; + e++; + } + while (e) +a = e -= 2; +}
Re: RFC: Allow emergency EH pool size to be controlled (for stage 1)
On 13 December 2016 at 16:52, Richard Biener wrote: > On Tue, Dec 13, 2016 at 3:45 PM, Ville Voutilainen > wrote: >> On 13 December 2016 at 16:42, Richard Biener >> wrote: >>> Also maybe N should be the number of exception objects rather >>> than bytes? Otherwise a target independent N is hard to specify >>> for say, a distribution that wants either a different default or a >>> static buffer. >> >> >> But how to specify the number of exception objects that may have >> different sizes? > > We have EMERGENCY_OBJ_SIZE for this (yeah, some guesstimate) > which can be easily target configured in libstdc++ itself. We possibly > should split it into a sizeof () of the exception header plus some > payload guesstimate. Ah, yes, that's what the patch does with the default-size when there is nothing in the environment to customize the value. Makes sense. I don't know whether a separate byte-wise count would be useful, I must admit I have no field experience with such tunings.
[PATCH] Dump number of loops in a function plus fixup state
I've found this useful several times. Note that currently number_of_loops returns loops plus freed ones, the patch prepares to change that, adjusting number_of_loops callers to use vec_safe_length if they really want the maximum allocated loop number. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2016-12-13 Richard Biener * cfgloopmanip.c (place_new_loop): Use vec_safe_length to get at the maximum allocated loop number. * loop-init.c (fix_loop_structure): Likewise. * tree-cfg.c (fixup_loop_arrays_after_move): Use place_new_loop. (dump_function_to_file): Dump the number of loops in the function as well as whether they need fixup. Index: gcc/cfgloopmanip.c === --- gcc/cfgloopmanip.c (revision 243599) +++ gcc/cfgloopmanip.c (working copy) @@ -430,7 +430,7 @@ remove_path (edge e, bool *irred_invalid void place_new_loop (struct function *fn, struct loop *loop) { - loop->num = number_of_loops (fn); + loop->num = vec_safe_length (loops_for_fn (fn)->larray); vec_safe_push (loops_for_fn (fn)->larray, loop); } Index: gcc/loop-init.c === --- gcc/loop-init.c (revision 243599) +++ gcc/loop-init.c (working copy) @@ -253,7 +258,7 @@ fix_loop_structure (bitmap changed_bbs) /* Remember the number of loops so we can return how many new loops flow_loops_find discovered. */ - old_nloops = number_of_loops (cfun); + old_nloops = vec_safe_length (current_loops->larray); /* Re-compute loop structure in-place. */ flow_loops_find (current_loops); @@ -321,7 +327,7 @@ fix_loop_structure (bitmap changed_bbs) timevar_pop (TV_LOOP_INIT); - return number_of_loops (cfun) - old_nloops; + return vec_safe_length (current_loops->larray) - old_nloops; } /* The RTL loop superpass. The actual passes are subpasses. See passes.c for Index: gcc/tree-cfg.c === --- gcc/tree-cfg.c (revision 243597) +++ gcc/tree-cfg.c (working copy) @@ -7095,8 +7095,7 @@ fixup_loop_arrays_after_move (struct fun (*get_loops (fn1))[loop->num] = NULL; /* Place it in the new loop array, assigning it a new number. */ - loop->num = number_of_loops (fn2); - vec_safe_push (loops_for_fn (fn2)->larray, loop); + place_new_loop (fn2, loop); /* Recurse to children. */ for (loop = loop->inner; loop; loop = loop->next) @@ -7561,7 +7560,14 @@ dump_function_to_file (tree fndecl, FILE fprintf (file, " __GIMPLE ()\n%s (", function_name (fun)); } else -fprintf (file, "%s %s(", function_name (fun), tmclone ? "[tm-clone] " : ""); +{ + if (loops_for_fn (fun)) + fprintf (file, ";; %d loops found%s\n", number_of_loops (fun), +loops_for_fn (fun)->state & LOOPS_NEED_FIXUP +? " (fixup required)" : ""); + fprintf (file, "%s %s(", function_name (fun), + tmclone ? "[tm-clone] " : ""); +} arg = DECL_ARGUMENTS (fndecl); while (arg)
Re: [PATCH, fortran, pr77785, v3] [Coarray] ICE in gfc_get_caf_token_offset, at fortran/trans-expr.c:1990
Hi Andre, > thanks for the input on the missing testcase, Janus (btw, when you know where > to > get a new crystal ball, let me know; I am missing mine, too). The new version > of the patch adds a new testcase coarray_41.f90 to test that the compiler > compiles correctly and the test runs ok. > > Bootstrapped and regtested on x86_64-linux/f23. Ok for trunk? yes, good for trunk then. Cheers, Janus > On Tue, 13 Dec 2016 12:11:50 +0100 > Janus Weil wrote: > >> Hi Andre, >> >> > all the sanitizer issues I fixed occur during compiling the testsuite. So I >> > would say, that when with the patch these errors do not occur anymore while >> > processing the testsuite, then those are tested for, right? >> >> aah, so you're saying that hunk is not actually related to the PR in >> the subject line, but instead fixes a testsuite failure seen with a >> sanitized compiler? That wasn't mentioned anywhere and sadly I forgot >> to bring my crystal ball ... >> >> Cheers, >> Janus >> >> >> >> > On Mon, 12 Dec 2016 13:37:43 +0100 >> > Janus Weil wrote: >> > >> >> Hi Andre, >> >> >> >> > the attached patch corrects reporting of "Sorry, unimplemented yet" for >> >> > allocatable and pointer components in polymorphic objects (BT_CLASS) >> >> > thus >> >> > fixing two ICEs reported in the PR. >> >> > >> >> > The next chunk fixes an ICE when the declaration containing the token >> >> > information is of type POINTER or REFERENCE. >> >> > >> >> > Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk? >> >> >> >> the resolve.c hunk is certainly ok. The trans-array.c part looks >> >> reasonable as well, but I wonder if it is actually covered by any of >> >> your test cases? Since they are all compile-only, with errors being >> >> thrown at resolution stage, do they even get to the translation stage? >> >> >> >> Cheers, >> >> Janus >> > >> > >> > -- >> > Andre Vehreschild * Email: vehre ad gmx dot de > > > -- > Andre Vehreschild * Email: vehre ad gmx dot de
Re: [PATCH] Add ISA 3.0 PowerPC support for VEXTU{B,H,W}{L,R}X instructions
This patch should address the comments in the last patch. I have tested this patch with bootstrap builds and make check regression tests on a little endian Power8 64-bit system and a big endian Power7 32/64-bit system with no regressions. Can I check this into the trunk? [gcc] 2016-12-13 Michael Meissner * config/rs6000/rs6000.c (rs6000_split_vec_extract_var): On ISA 3.0/power9, add support to use the VEXTU{B,H,W}{L,R}X extract instructions. * config/rs6000/vsx.md (VSr2): Add IEEE 128-bit floating point type constraint registers. (VSr3): Likewise. (FL_CONV): New mode iterator for binary floating types that have a direct conversion from 64-bit integer to floating point. (vsx_extract__p9): Add support for the ISA 3.0/power9 VEXTU{B,H,W}{L,R}X extract instructions. (vsx_extract__p9 splitter): Add splitter to load up the extract byte position into the GPR if we are using the VEXTU{B,H,W}{L,R}X extract instructions. (vsx_extract__di_p9): Support extracts to GPRs. (vsx_extract__store_p9): Support extracting to GPRs so that we can use reg+offset address instructions. (vsx_extract__var): Support extracts to GPRs. (vsx_extract___var): New combiner insn to combine vector extracts with zero_extend. (vsx_ext__fl_): Optimize extracting a small integer vector element and converting it to a floating point type. (vsx_ext__ufl_): Likewise. [gcc/testsuite] 2016-12-13 Michael Meissner * gcc/testsuite/gcc.target/powerpc/vec-extract.h: If DO_TRACE is defined, add tracing of the various extracts to stderr. Add support for tests that convert the result to another type. * gcc/testsuite/gcc.target/powerpc/vec-extract-v2df.c: Likewise. * gcc/testsuite/gcc.target/powerpc/vec-extract-v4sf.c: Likewise. * gcc/testsuite/gcc.target/powerpc/vec-extract-v4si-df.c: Add new tests that do an extract and then convert the values double. * gcc/testsuite/gcc.target/powerpc/vec-extract-v4siu-df.c: Likewise. * gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-df.c: Likewise. * gcc/testsuite/gcc.target/powerpc/vec-extract-v16qi-df.c: Likewise. * gcc/testsuite/gcc.target/powerpc/vec-extract-v8hiu-df.c: Likewise. * gcc/testsuite/gcc.target/powerpc/vec-extract-v8hi-df.c: Likewise. * gcc.target/powerpc/p9-extract-1.c: Update test to check for VEXTU{B,H,W}{L,R}X instructions being generated by default instead of VEXTRACTU{B,H} and XXEXTRACTUW. * gcc.target/powerpc/p9-extract-3.c: New test for combination of vec_extract and convert to floating point. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797 Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 243590) +++ gcc/config/rs6000/rs6000.c (.../gcc/config/rs6000) (working copy) @@ -7519,6 +7519,52 @@ rs6000_split_vec_extract_var (rtx dest, { int bit_shift = byte_shift + 3; rtx element2; + int dest_regno = regno_or_subregno (dest); + int src_regno = regno_or_subregno (src); + int element_regno = regno_or_subregno (element); + + /* See if we want to generate VEXTU{B,H,W}{L,R}X if the destination is in +a general purpose register. */ + if (TARGET_P9_VECTOR + && (mode == V16QImode || mode == V8HImode || mode == V4SImode) + && INT_REGNO_P (dest_regno) + && ALTIVEC_REGNO_P (src_regno) + && INT_REGNO_P (element_regno)) + { + rtx dest_si = gen_rtx_REG (SImode, dest_regno); + rtx element_si = gen_rtx_REG (SImode, element_regno); + + if (mode == V16QImode) + emit_insn (VECTOR_ELT_ORDER_BIG + ? gen_vextublx (dest_si, element_si, src) + : gen_vextubrx (dest_si, element_si, src)); + + else if (mode == V8HImode) + { + rtx tmp_gpr_si = (GET_CODE (tmp_gpr) == SCRATCH + ? dest_si + : gen_rtx_REG (SImode, REGNO (tmp_gpr))); + emit_insn (gen_ashlsi3 (tmp_gpr_si, element_si, const1_rtx)); + emit_insn (VECTOR_ELT_ORDER_BIG +? gen_vextuhlx (dest_si, tmp_gpr_si, src) +: gen_vextuhrx (dest_si, tmp_gpr_si, src)); + } + + + else + { + rtx tmp_gpr_si = (GET_CODE (tmp_gpr) == SCRATCH + ? dest_si + : gen_rtx_REG (SImode, REGNO (tmp_gpr))); + emit_insn (gen_ashlsi3 (tmp_gpr_si, element_si, const1_rtx)); +
Re: [RFC][PATCH] Sanopt for use-after-scope ASAN_MARK internal functions
On 12/13/2016 01:29 PM, Jakub Jelinek wrote: > On Tue, Dec 13, 2016 at 01:12:34PM +0100, Martin Liška wrote: + gimple_stmt_iterator gsi; + bool finish = false; + for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi)) + { +gimple *stmt = gsi_stmt (gsi); +if (maybe_contains_asan_check (stmt)) + { +bitmap_set_bit (with_check, bb->index); +finish = true; >>> >>> Why the finish var and separate if (finish) break; ? Perhaps from the first >>> iteration when you used a switch? Just doing break; should be enough. >> >> Well, I verified that even the first iteration (of patch) needed if(finish) >> break; >> as I need to jump to next iteration of FOR_EACH_BB_FN (bb, cfun). > > I fail to see functional difference between > for (...) > { > ... > bool finish = false; > for (...) > { > ... > if (...) > { > ... > finish = true; > } > if (finish) > break; > } > } > and > for (...) > { > ... > for (...) > { > ... > if (...) > { > ... > break; > } > } > } > just the latter is not obfuscated. The break is in both cases in the same > for loop. Sorry for that, I really overlooked that. It was obfuscated. > >>> Don't you need also release_defs (stmt); here (and in the other gsi_remove >>> spot)? >> >> As I remove only internal function calls that do not define a SSA name then >> not: >> ASAN_MARK (UNPOISON, &my_char, 1); >> ASAN_MARK (POISON, &my_char, 1); > > If they have a vdef, then I believe they do define a SSA name (the vdef). > I think unlink_stmt_vdef does not release the vdef SSA_NAME if any. Yep, fixed in the patch. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Martin > > Jakub > >From e313f08405ef7bb6597df64773348a3d04b3ab4e Mon Sep 17 00:00:00 2001 From: marxin Date: Mon, 12 Dec 2016 15:22:05 +0100 Subject: [PATCH] Add sanopt for ASAN_MARK poison and unpoison. gcc/ChangeLog: 2016-12-12 Martin Liska * sanopt.c (sanopt_optimize_walker): Set contains_asan_mark. (sanopt_optimize): Add new argument. (sanitize_asan_mark_unpoison): New function. (maybe_contains_asan_check): Likewise. (sanitize_asan_mark_poison): Likewise. (pass_sanopt::execute): Call the new functions. --- gcc/sanopt.c | 206 +-- 1 file changed, 202 insertions(+), 4 deletions(-) diff --git a/gcc/sanopt.c b/gcc/sanopt.c index 320e14e9421..7fe8d32e39a 100644 --- a/gcc/sanopt.c +++ b/gcc/sanopt.c @@ -24,6 +24,7 @@ along with GCC; see the file COPYING3. If not see #include "backend.h" #include "tree.h" #include "gimple.h" +#include "ssa.h" #include "tree-pass.h" #include "tree-ssa-operands.h" #include "gimple-pretty-print.h" @@ -37,7 +38,6 @@ along with GCC; see the file COPYING3. If not see #include "tree-phinodes.h" #include "ssa-iterators.h" - /* This is used to carry information about basic blocks. It is attached to the AUX field of the standard CFG block. */ @@ -160,8 +160,10 @@ struct sanopt_ctx /* Number of IFN_ASAN_CHECK statements. */ int asan_num_accesses; -}; + /* True when the current functions constains an ASAN_MARK. */ + bool contains_asan_mark; +}; /* Return true if there might be any call to free/munmap operation on any path in between DOM (which should be imm(BB)) and BB. */ @@ -582,6 +584,9 @@ sanopt_optimize_walker (basic_block bb, struct sanopt_ctx *ctx) if (!remove) ctx->asan_num_accesses++; break; + case IFN_ASAN_MARK: + ctx->contains_asan_mark = true; + break; default: break; } @@ -620,10 +625,11 @@ sanopt_optimize_walker (basic_block bb, struct sanopt_ctx *ctx) /* Try to remove redundant sanitizer checks in function FUN. */ static int -sanopt_optimize (function *fun) +sanopt_optimize (function *fun, bool *contains_asan_mark) { struct sanopt_ctx ctx; ctx.asan_num_accesses = 0; + ctx.contains_asan_mark = false; /* Set up block info for each basic block. */ alloc_aux_for_blocks (sizeof (sanopt_info)); @@ -638,6 +644,7 @@ sanopt_optimize (function *fun) free_aux_for_blocks (); + *contains_asan_mark = ctx.contains_asan_mark; return ctx.asan_num_accesses; } @@ -671,18 +678,201 @@ public: }; // class pass_sanopt +/* Sanitize all ASAN_MARK unpoison calls that are not reachable by a BB + that contains an ASAN_MARK poison. All these ASAN_MARK unpoison call + can be removed as all variables are unpoisoned in a function prologue. */ + +static void +sanitize_asan_mark_unpoison (void) +{ + /* 1) Find all BBs that contain an ASAN_MARK poison call. */ + auto_sbitmap with_poison (last_basic_block_for_fn (cfun) + 1); + bitmap_clear (with_poison); + basic_block bb; + + F
Re: [RFC][PATCH] Sanopt for use-after-scope ASAN_MARK internal functions
On Tue, Dec 13, 2016 at 04:21:56PM +0100, Martin Liška wrote: > gcc/ChangeLog: > > 2016-12-12 Martin Liska > > * sanopt.c (sanopt_optimize_walker): Set contains_asan_mark. > (sanopt_optimize): Add new argument. > (sanitize_asan_mark_unpoison): New function. > (maybe_contains_asan_check): Likewise. > (sanitize_asan_mark_poison): Likewise. > (pass_sanopt::execute): Call the new functions. Ok, thanks. Jakub
Re: [PATCH, rs6000] Add support for vec_pack and vec_sld built-ins
On Thu, 2016-12-08 at 21:03 -0600, Segher Boessenkool wrote: > On Thu, Dec 08, 2016 at 08:28:22AM -0800, Carl E. Love wrote: > > The following patch adds two more built-ins that are missing. > > Specifically: > > > > vector floatvec_packvector doublevector double > > vector double vec_sld vector doublevector double > > > > The patch has been boot strapped and tested on > > powerpc64le-unknown-linux-gnu (Power 8) and on > > powerpc64-unknown-linux-gnu (Power 7) with no regressions. > > > 2016-12-08 Carl Love > > > >* config/rs6000/rs6000-c.c: Add built-in support for > >vector float vec_pack vector double vector double > >vector double vec_sld vector double vector double > > Add some parens and commas here? > > >* config/rs6000/rs6000.c: Add icode check for vsldoi_v2df to allow > >4-bit unsigned literal. > >* config/rs6000/rs6000-builtin.def: Add definition for VSLDOI_2DF > >* doc/extend.texi: Update the built-in documentation file for the > >new powerpc vec_pack and vec_sld built-ins. > > Okay for trunk with that. Thanks, Segher: The patch has been committed with the requested changes to the ChangeLog message. Revision 243612 - Directory Listing Modified Tue Dec 13 16:18:36 2016 UTC (77 seconds ago) by carll gcc/ChangeLog: 2016-12-13 Carl Love * config/rs6000/rs6000-c.c: Add built-in support for vector float vec_pack (vector double, vector double) vector double vec_sld (vector double, vector double) * config/rs6000/rs6000.c: Add icode check for vsldoi_v2df to allow 4-bit unsigned literal. * config/rs6000/rs6000-builtin.def: Add definition for VSLDOI_2DF * doc/extend.texi: Update the built-in documentation file for the new powerpc vec_pack and vec_sld built-ins. gcc/testsuite/ChangeLog: 2016-12-13 Carl Love * gcc.target/powerpc/builtins-3.c: Add new test of the test suite file. * gcc.target/powerpc/builtins-3-p8.c: Add new test of the test suite file. Carl Love
Re: [PATCH] avoid infinite recursion in maybe_warn_alloc_args_overflow (pr 78775)
On 12/13/2016 03:52 AM, Marek Polacek wrote: On Mon, Dec 12, 2016 at 06:36:16PM -0700, Martin Sebor wrote: +/* Return true if the type of OP is signed, looking through any casts + to an unsigned type. */ + +static bool +operand_signed_p (tree op) +{ + bitmap visited = NULL; + bool ret = operand_signed_p (op, &visited); + + if (visited) +BITMAP_FREE (visited); I think you can drop the if before BITMAP_FREE here. Correct. jeff
Re: [PATCH, fortran, pr77785, v3] [Coarray] ICE in gfc_get_caf_token_offset, at fortran/trans-expr.c:1990
Hi Janus, thanks for the review. Committed as r243614. - Andre On Tue, 13 Dec 2016 16:08:59 +0100 Janus Weil wrote: > Hi Andre, > > > thanks for the input on the missing testcase, Janus (btw, when you know > > where to get a new crystal ball, let me know; I am missing mine, too). The > > new version of the patch adds a new testcase coarray_41.f90 to test that > > the compiler compiles correctly and the test runs ok. > > > > Bootstrapped and regtested on x86_64-linux/f23. Ok for trunk? > > yes, good for trunk then. > > Cheers, > Janus > > > > > On Tue, 13 Dec 2016 12:11:50 +0100 > > Janus Weil wrote: > > > >> Hi Andre, > >> > >> > all the sanitizer issues I fixed occur during compiling the testsuite. > >> > So I would say, that when with the patch these errors do not occur > >> > anymore while processing the testsuite, then those are tested for, > >> > right? > >> > >> aah, so you're saying that hunk is not actually related to the PR in > >> the subject line, but instead fixes a testsuite failure seen with a > >> sanitized compiler? That wasn't mentioned anywhere and sadly I forgot > >> to bring my crystal ball ... > >> > >> Cheers, > >> Janus > >> > >> > >> > >> > On Mon, 12 Dec 2016 13:37:43 +0100 > >> > Janus Weil wrote: > >> > > >> >> Hi Andre, > >> >> > >> >> > the attached patch corrects reporting of "Sorry, unimplemented yet" > >> >> > for allocatable and pointer components in polymorphic objects > >> >> > (BT_CLASS) thus fixing two ICEs reported in the PR. > >> >> > > >> >> > The next chunk fixes an ICE when the declaration containing the token > >> >> > information is of type POINTER or REFERENCE. > >> >> > > >> >> > Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk? > >> >> > >> >> the resolve.c hunk is certainly ok. The trans-array.c part looks > >> >> reasonable as well, but I wonder if it is actually covered by any of > >> >> your test cases? Since they are all compile-only, with errors being > >> >> thrown at resolution stage, do they even get to the translation stage? > >> >> > >> >> Cheers, > >> >> Janus > >> > > >> > > >> > -- > >> > Andre Vehreschild * Email: vehre ad gmx dot de > > > > > > -- > > Andre Vehreschild * Email: vehre ad gmx dot de -- Andre Vehreschild * Email: vehre ad gmx dot de Index: gcc/fortran/ChangeLog === --- gcc/fortran/ChangeLog (Revision 243613) +++ gcc/fortran/ChangeLog (Arbeitskopie) @@ -1,3 +1,11 @@ +2016-12-13 Andre Vehreschild + + PR fortran/77785 + * resolve.c (resolve_symbol): Correct attr lookup to the _data + component. + * trans-array.c (gfc_alloc_allocatable_for_assignment): Indirect ref + pointers and references before retrieving the caf-token. + 2016-12-13 Janus Weil Paul Thomas Index: gcc/fortran/resolve.c === --- gcc/fortran/resolve.c (Revision 243613) +++ gcc/fortran/resolve.c (Arbeitskopie) @@ -14044,8 +14044,8 @@ if (flag_coarray == GFC_FCOARRAY_LIB && sym->ts.type == BT_CLASS && sym->ts.u.derived && CLASS_DATA (sym) && CLASS_DATA (sym)->attr.codimension - && (sym->ts.u.derived->attr.alloc_comp - || sym->ts.u.derived->attr.pointer_comp)) + && (CLASS_DATA (sym)->ts.u.derived->attr.alloc_comp + || CLASS_DATA (sym)->ts.u.derived->attr.pointer_comp)) { gfc_error ("Sorry, allocatable/pointer components in polymorphic (CLASS) " "type coarrays at %L are unsupported", &sym->declared_at); Index: gcc/fortran/trans-array.c === --- gcc/fortran/trans-array.c (Revision 243613) +++ gcc/fortran/trans-array.c (Arbeitskopie) @@ -9337,6 +9337,8 @@ if (token == NULL_TREE) { tmp = gfc_get_tree_for_caf_expr (expr1); + if (POINTER_TYPE_P (TREE_TYPE (tmp))) + tmp = build_fold_indirect_ref (tmp); gfc_get_caf_token_offset (&caf_se, &token, NULL, tmp, NULL_TREE, expr1); token = gfc_build_addr_expr (NULL_TREE, token); Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (Revision 243613) +++ gcc/testsuite/ChangeLog (Arbeitskopie) @@ -1,3 +1,10 @@ +2016-12-13 Andre Vehreschild + + PR fortran/77785 + * gfortran.dg/coarray_38.f90: Added expecting error message. + * gfortran.dg/coarray_41.f90: New test. + * gfortran.dg/coarray_class_2.f90: New test. + 2016-12-13 Carl Love * gcc.target/powerpc/builtins-3.c: Add new test of the test suite Index: gcc/testsuite/gfortran.dg/coarray_38.f90 === --- gcc/testsuite/gfortran.dg/coarray_38.f90 (Revision 243613) +++ gcc/testsuite/gfortran.dg/coarray_38.f90 (Arbeitskopie) @@ -92,7 +92,7 @@ type t2 class(t), allocatable :: caf2[:] end type t2 -class(t), save, allocatable :: caf[:] +class(t), save, allocatable :: caf[:] ! { dg-error "Sorry, allocatable/pointer
[PATCH, rs6000] Cost model adjustment
Hi, This patch makes a slight adjustment to the vectorization cost model that was previously overlooked. Bootstrapped and tested on powerpc64le-unknown-linux gnu with no regressions. Ok for trunk? Thanks, Bill 2016-12-13 Bill Schmidt * config/rs6000/rs600.c (rs6000_builtin_vectorization_cost): Adjust unaligned load cost. Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 243578) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -5358,6 +5358,9 @@ rs6000_builtin_vectorization_cost (enum vect_cost_ return 3; case unaligned_load: + if (TARGET_P9_VECTOR) + return 3; + if (TARGET_EFFICIENT_UNALIGNED_VSX) return 1;
[PATCH, i386]: Fix PR78794, ~9% regression in 32-bit mode for 462.libquntum on Avoton after r243202
Hello! Attached patch fixes STV cost function to better model gains of pandn insn on non-BMI targets. As explained in the PR, STV converts four scalar arithmetic insns (2 * not and 2 * and) to one (pandn). The patch increases gain for non-BMI targets for 2 * ix86_cost->add to a total of 3 * ix86_cost->add. 2016-12-13 Uros Bizjak PR target/78794 * config/i386/i386.c (dimode_scalar_chain::compute_convert_gain): Calculate additional gain for andnot for targets without BMI. testsuite/ChangeLog: 2016-12-13 Uros Bizjak PR target/78794 * gcc.target/i386/pr78794.c: New test. Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Committed to mainline SVN. Uros. Index: config/i386/i386.c === --- config/i386/i386.c (revision 243611) +++ config/i386/i386.c (working copy) @@ -3419,6 +3419,10 @@ dimode_scalar_chain::compute_convert_gain () || GET_CODE (src) == AND) { gain += ix86_cost->add; + /* Additional gain for andnot for targets without BMI. */ + if (GET_CODE (XEXP (src, 0)) == NOT + && !TARGET_BMI) + gain += 2 * ix86_cost->add; if (CONST_INT_P (XEXP (src, 0))) gain -= vector_const_cost (XEXP (src, 0)); if (CONST_INT_P (XEXP (src, 1))) Index: testsuite/gcc.target/i386/pr78794.c === --- testsuite/gcc.target/i386/pr78794.c (nonexistent) +++ testsuite/gcc.target/i386/pr78794.c (working copy) @@ -0,0 +1,32 @@ +/* PR target/pr78794 */ +/* { dg-do compile { target { ia32 } } } */ +/* { dg-options "-O2 -march=slm -mno-bmi -mno-stackrealign" } */ +/* { dg-final { scan-assembler "pandn" } } */ + +typedef unsigned long long ull; + +struct S1 +{ + float x; + ull y; +}; + + +struct S2 +{ + int a1; + struct S1 *node; + int *a2; +}; + +void +foo(int c1, int c2, int c3, struct S2 *reg) +{ + int i; + for(i=0; ia1; i++) +if(reg->node[i].y & ((ull) 1 << c1)) + { + if(reg->node[i].y & ((ull) 1 << c2)) + reg->node[i].y ^= ((ull) 1 << c3); + } +}
[arm-embedded] Add ARMv8-M Security Extensions support to Cortex-M23 and Cortex-M33
Hi, We have decided to apply the following patch to the embedded-6-branch to enable ARMv8-M Security Extensions to ARM Cortex-M23 and ARM Cortex-M33. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2016-12-13 Thomas Preud'homme * config/arm/arm-cores.def (cortex-m23): Add FL2_CMSE flag. (cortex-m33): Likewise. I've checked that gcc.target/arm/cmse/cmse-1.c compiles with -mcpu=cortex-m23 -mcmse and -mcpu=cortex-m33 and that it does not when trying with -mcpu=cortex-m0 -mcmse or -mcpu=cortex-m7 -mcmse. Best regards, Thomas diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def index e5b492897a7b39ade9286e6b8b15934277a6848e..add6b21ee5238aab5759d17f3de1aa5dff928d7f 100644 --- a/gcc/config/arm/arm-cores.def +++ b/gcc/config/arm/arm-cores.def @@ -166,9 +166,9 @@ ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7, 7A, ARM_FSET_MAKE_ ARM_CORE("cortex-a17.cortex-a7", cortexa17cortexa7, cortexa7, 7A, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV | FL_FOR_ARCH7A), cortex_a12) /* V8 Architecture Processors */ -ARM_CORE("cortex-m23", cortexm23, cortexm23, 8M_BASE, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_FOR_ARCH8M_BASE), v6m) +ARM_CORE("cortex-m23", cortexm23, cortexm23, 8M_BASE, ARM_FSET_MAKE (FL_LDSCHED | FL_FOR_ARCH8M_BASE, FL2_CMSE), v6m) ARM_CORE("cortex-a32", cortexa32, cortexa53, 8A, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a35) -ARM_CORE("cortex-m33", cortexm33, cortexm33, 8M_MAIN, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_ARCH7EM | FL_FOR_ARCH8M_MAIN), v7m) +ARM_CORE("cortex-m33", cortexm33, cortexm33, 8M_MAIN, ARM_FSET_MAKE (FL_LDSCHED | FL_ARCH7EM | FL_FOR_ARCH8M_MAIN, FL2_CMSE), v7m) ARM_CORE("cortex-a35", cortexa35, cortexa53, 8A, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a35) ARM_CORE("cortex-a53", cortexa53, cortexa53, 8A, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a53) ARM_CORE("cortex-a57", cortexa57, cortexa57, 8A, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a57)
Re: Ping #1: [patch,testsuite,avr]: Filter-out -mmcu= from options for tests that set -mmcu=
2016-12-13 13:36 GMT+04:00 Georg-Johann Lay : > Ping #1 > > As I explained below, the solution taken be arm (pruning output) > does not work here. Approved. Please apply your patch. > > Johann > > On 02.12.2016 11:21, Georg-Johann Lay wrote: >> >> On 01.12.2016 17:40, Mike Stump wrote: >>> >>> On Dec 1, 2016, at 3:54 AM, Georg-Johann Lay wrote: This patch moves the compile tests that have a hard coded -mmcu=MCU in their dg-options to a new folder. The exp driver filters out -mmcu= from the command line options that are provided by, say, board description files or --tool-opts. This is needed because otherwise conflicting -mmcu= will FAIL respective test cases because of "specified option '-mmcu' more than once" errors from avr-gcc. Ok for trunk? >>> >>> >>> So, it would be nice if different ports can use roughly similar >>> schemes to handle the same problems. I think arm is one of the more >>> complex ports at this point in this regard with a lot of people and a >>> lot of years time to contemplate and implement solutions to the >>> problem. They in particular don't have to move test cases around to >>> handle the difference like this, I think it would be best to avoid >>> that requirement if possible. >>> >>> Glancing around, two starting points for how the arm achieves what it >>> does: >>> >>> lappend dg_runtest_extra_prunes "warning: switch -m(cpu|arch)=.* >>> conflicts with -m(cpu|arch)=.* switch" >>> >>> in arm.exp, and they use something like: >>> >>> /* { dg-require-effective-target arm_crypto_ok } >>> */ >>> |crypto-vsha256hq_u32.c:2:/* { dg-require-effective-target >>> arm_crypto_ok } */ >>> /* { dg-add-options arm_crypto } >>> */ >>> |crypto-vsha256su0q_u32.c:2:/* { dg-require-effective-target >>> arm_crypto_ok } */ >>> >>> to validate the requirements of the test case, and to ensure that >>> optional things are selected. Nice, simple, extensible, handles >>> multilibs, dejagnu arguments and different cpu defaults as I recall. >>> >>> You won't need all the hair the arm folks have, but if you stub in >>> support in that direction, you then have simple, easy expansion room >>> to match all complexities that the arm folks have already hit and >>> solved. >> >> >> I tried this approach, but it does not work as expected. >> >> Notice that avr-gcc throws an error if conflicting -mmcu options are >> supplied. Pruning the output will make some tests "PASS", but the >> compiler didn't actually do anything but producing an error message... >> >> And one test FAILs because it should produce some specific diagnostic, >> but again the compiler just throws a different error, the output is >> pruned and the expected message is missing, hence fail. >> >> Also one test case is for ATmega8, but you won't run the whole test >> suite against ATmega8 because that device is too restricted to give >> reasonable results... Hence for some tests -mmcu=atmega8 is added by >> hand. >> >> Johann >> >> >> >> >> >
Re: [C++ PATCH] c++/78776 fix alias template ICE
On 12/12/2016 10:28 PM, Jason Merrill wrote: I suspect that most uses don't need to change. Indeed. This addresses the ones I could find. Both by grepping pt.c and fixing subsequent test fall out. took the opportunity to make the control flow in get_underlying_template somewhat clearer. I don't think there's anything unsurprising in this patch. ok? nathan -- Nathan Sidwell 2016-12-12 Nathan Sidwell PR c++/69481 * cp-tree.h (TYPE_TEMPLATE_INFO): Remove alias type checking. (TYPE_ALIAS_TEMPLATE_INFO): New. (TYPE_TEMPLATE_INFO_MAYBE_ALIAS): New. Use those macros. * error.c (dump_alias_template_specialization): Adjust. * pt.c (maybe_process_partial_specialization, iterative_has_template_arg, find_parameter_packs_r, alias_template_specialization_p, dependent_alias_template_spec_p, get_underlying_template, lookup_template_class_1, unify): Adjust template using decl access. PR c++/69481 * g++.dg/cpp0x/pr69481.C: New. Index: cp/cp-tree.h === --- cp/cp-tree.h (revision 243604) +++ cp/cp-tree.h (working copy) @@ -3038,23 +3038,30 @@ extern void decl_shadowed_for_var_insert ->template_info) /* Template information for an ENUMERAL_, RECORD_, UNION_TYPE, or - BOUND_TEMPLATE_TEMPLATE_PARM type. Note that if NODE is a - specialization of an alias template, this accessor returns the - template info for the alias template, not the one (if any) for the - template of the underlying type. */ + BOUND_TEMPLATE_TEMPLATE_PARM type. This ignores any alias + templateness of NODE. */ #define TYPE_TEMPLATE_INFO(NODE) \ - ((TYPE_ALIAS_P (NODE) && DECL_LANG_SPECIFIC (TYPE_NAME (NODE))) \ - ? (DECL_LANG_SPECIFIC (TYPE_NAME (NODE))\ - ? DECL_TEMPLATE_INFO (TYPE_NAME (NODE))\ - : NULL_TREE) \ - : ((TREE_CODE (NODE) == ENUMERAL_TYPE)\ - ? ENUM_TEMPLATE_INFO (NODE) \ - : ((TREE_CODE (NODE) == BOUND_TEMPLATE_TEMPLATE_PARM) \ - ? TEMPLATE_TEMPLATE_PARM_TEMPLATE_INFO (NODE) \ - : (CLASS_TYPE_P (NODE) \ - ? CLASSTYPE_TEMPLATE_INFO (NODE)\ - : NULL_TREE + (TREE_CODE (NODE) == ENUMERAL_TYPE \ + ? ENUM_TEMPLATE_INFO (NODE) \ + : (TREE_CODE (NODE) == BOUND_TEMPLATE_TEMPLATE_PARM \ + ? TEMPLATE_TEMPLATE_PARM_TEMPLATE_INFO (NODE) \ + : (CLASS_TYPE_P (NODE) \ + ? CLASSTYPE_TEMPLATE_INFO (NODE)\ + : NULL_TREE))) +/* Template information for an alias template type. */ +#define TYPE_ALIAS_TEMPLATE_INFO(NODE) \ + (DECL_LANG_SPECIFIC (TYPE_NAME (NODE))\ + ? DECL_TEMPLATE_INFO (TYPE_NAME (NODE))\ + : NULL_TREE) + +/* If NODE is a specialization of an alias template, this accessor + returns the template info for the alias template. Otherwise behave + as TYPE_TEMPLATE_INFO. */ +#define TYPE_TEMPLATE_INFO_MAYBE_ALIAS(NODE)\ + (TYPE_ALIAS_P (NODE) && DECL_LANG_SPECIFIC (TYPE_NAME (NODE)) \ + ? TYPE_ALIAS_TEMPLATE_INFO (NODE) \ + : TYPE_TEMPLATE_INFO (NODE)) /* Set the template information for an ENUMERAL_, RECORD_, or UNION_TYPE to VAL. */ Index: cp/error.c === --- cp/error.c (revision 243604) +++ cp/error.c (working copy) @@ -365,15 +365,13 @@ dump_template_bindings (cxx_pretty_print static void dump_alias_template_specialization (cxx_pretty_printer *pp, tree t, int flags) { - tree name; - gcc_assert (alias_template_specialization_p (t)); + tree decl = TYPE_NAME (t); if (!(flags & TFF_UNQUALIFIED_NAME)) -dump_scope (pp, CP_DECL_CONTEXT (TYPE_NAME (t)), flags); - name = TYPE_IDENTIFIER (t); - pp_cxx_tree_identifier (pp, name); - dump_template_parms (pp, TYPE_TEMPLATE_INFO (t), +dump_scope (pp, CP_DECL_CONTEXT (decl), flags); + pp_cxx_tree_identifier (pp, DECL_NAME (decl)); + dump_template_parms (pp, DECL_TEMPLATE_INFO (decl), /*primary=*/false, flags & ~TFF_TEMPLATE_HEADER); } Index: cp/pt.c === --- cp/pt.c (revision 243604) +++ cp/pt.c (working copy) @@ -940,10 +940,11 @@ maybe_process_partial_specialization (tr if (TYPE_ALIAS_P (type)) { - if (TYPE_TEMPLATE_INFO (type) - && DECL_ALIAS_TEMPLATE_P (TYPE_TI_TEMPLATE (type))) + tree tinfo = TYPE_ALIAS_TEMPLATE_INFO (type); + + if (tinfo && DECL_ALIAS_TEMPLATE_P (TI_TEMPLATE (tinfo))) error ("specialization of alias template %qD", - TYPE_TI_TEMPLATE (type)); + TI_TEMPLATE (tinfo)); else error ("explicit specialization of non-template %qT", type); return error_mark_node; @@ -1829,7 +1830,7 @@ iterative_hash_template_arg (tree arg, h // left alone, or untouched specializations because // coerce_template_parms returns the unconverted template // arguments if it sees incomplete argument packs. - tree ti = TYPE_TEMPLATE_INFO (arg); + tree ti = TYPE_ALIAS_TEMPLATE_INFO (arg); return hash_t
Re: libgo patch committed: Copy hash code from Go 1.7 runtime
On Fri, Dec 9, 2016 at 1:37 AM, Rainer Orth wrote: > > 2016-12-09 Rainer Orth > > * configure.ac: Call GCC_CHECK_LINKER_HWCAP. > * Makefile.am (AM_LDFLAGS): Initialize to HWCAP_LDFLAGS. > [USING_SPLIT_STACK]: Add to AM_LDFLAGS. > * aclocal.m4: Regenerate. > * configure: Regenerate. > * Makefile.in: Regenerate. > * testsuite/Makefile.in: Regenerate. Thanks. Committed to mainline as follows. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 243459) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -ac59bb383e1b446c68465af793722dd0e84abefb +556a546ba3c7bb14bd1b9b8469ee3b7a914909f6 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: libgo/Makefile.am === --- libgo/Makefile.am (revision 243445) +++ libgo/Makefile.am (working copy) @@ -46,8 +46,10 @@ AM_CFLAGS = -fexceptions -fnon-call-exce -I $(srcdir)/../libgcc -I $(srcdir)/../libbacktrace \ -I $(MULTIBUILDTOP)../../gcc/include +AM_LDFLAGS = $(HWCAP_LDFLAGS) + if USING_SPLIT_STACK -AM_LDFLAGS = -XCClinker $(SPLIT_STACK) +AM_LDFLAGS += -XCClinker $(SPLIT_STACK) endif # Multilib support. @@ -561,7 +563,7 @@ s-sigtab: $(srcdir)/mksigtab.sh gen-sysi runtime.inc: s-runtime-inc; @true s-runtime-inc: runtime.lo Makefile rm -f runtime.inc.tmp2 - grep -v "#define _" runtime.inc.tmp | grep -v "#define c[01] " > runtime.inc.tmp2 + grep -v "#define _" runtime.inc.tmp | grep -v "#define [cm][01234] " > runtime.inc.tmp2 for pattern in '_[GP][a-z]' _Max _Lock _Sig _Trace _MHeap _Num; do \ grep "#define $$pattern" runtime.inc.tmp >> runtime.inc.tmp2; \ done Index: libgo/configure.ac === --- libgo/configure.ac (revision 243445) +++ libgo/configure.ac (working copy) @@ -421,6 +421,9 @@ case "$target" in esac AC_SUBST(OSCFLAGS) +dnl Check linker hardware capability support. +GCC_CHECK_LINKER_HWCAP + dnl Use -fsplit-stack when compiling C code if available. AC_CACHE_CHECK([whether -fsplit-stack is supported], [libgo_cv_c_split_stack_supported],
[Patch, Fortran, cleanup] PR 78798: some int-valued functions should be bool
Hi all, here is a straightforward cleanup patch that makes a few functions return a bool instead of an int. Regtests cleanly on x86_64-linux-gnu. Ok for trunk? Cheers, Janus 2016-12-13 Janus Weil PR fortran/78798 * gfortran.h (gfc_is_constant_expr, gfc_is_formal_arg, gfc_is_compile_time_shape): Return bool instead of int. * array.c (gfc_is_compile_time_shape): Ditto. * expr.c (gfc_is_constant_expr): Ditto. * resolve.c (gfc_is_formal_arg): Ditto. Make formal_arg_flag bool. Index: gcc/fortran/array.c === --- gcc/fortran/array.c (revision 243609) +++ gcc/fortran/array.c (working copy) @@ -2581,18 +2581,16 @@ gfc_find_array_ref (gfc_expr *e) /* Find out if an array shape is known at compile time. */ -int +bool gfc_is_compile_time_shape (gfc_array_spec *as) { - int i; - if (as->type != AS_EXPLICIT) -return 0; +return false; - for (i = 0; i < as->rank; i++) + for (int i = 0; i < as->rank; i++) if (!gfc_is_constant_expr (as->lower[i]) || !gfc_is_constant_expr (as->upper[i])) - return 0; + return false; - return 1; + return true; } Index: gcc/fortran/expr.c === --- gcc/fortran/expr.c (revision 243609) +++ gcc/fortran/expr.c (working copy) @@ -882,10 +882,9 @@ done: /* Determine if an expression is constant in the sense of F08:7.1.12. - * This function expects that the expression has already been simplified. - * FIXME: Return a bool, not an int. */ + * This function expects that the expression has already been simplified. */ -int +bool gfc_is_constant_expr (gfc_expr *e) { gfc_constructor *c; @@ -892,7 +891,7 @@ gfc_is_constant_expr (gfc_expr *e) gfc_actual_arglist *arg; if (e == NULL) -return 1; +return true; switch (e->expr_type) { @@ -902,7 +901,7 @@ gfc_is_constant_expr (gfc_expr *e) || gfc_is_constant_expr (e->value.op.op2))); case EXPR_VARIABLE: - return 0; + return false; case EXPR_FUNCTION: case EXPR_PPC: @@ -915,7 +914,7 @@ gfc_is_constant_expr (gfc_expr *e) { for (arg = e->value.function.actual; arg; arg = arg->next) if (!gfc_is_constant_expr (arg->expr)) - return 0; + return false; } if (e->value.function.isym @@ -923,13 +922,13 @@ gfc_is_constant_expr (gfc_expr *e) || e->value.function.isym->pure || e->value.function.isym->inquiry || e->value.function.isym->transformational)) - return 1; + return true; - return 0; + return false; case EXPR_CONSTANT: case EXPR_NULL: - return 1; + return true; case EXPR_SUBSTRING: return e->ref == NULL || (gfc_is_constant_expr (e->ref->u.ss.start) @@ -943,14 +942,14 @@ gfc_is_constant_expr (gfc_expr *e) for (; c; c = gfc_constructor_next (c)) if (!gfc_is_constant_expr (c->expr)) - return 0; + return false; - return 1; + return true; default: gfc_internal_error ("gfc_is_constant_expr(): Unknown expression type"); - return 0; + return false; } } Index: gcc/fortran/gfortran.h === --- gcc/fortran/gfortran.h (revision 243609) +++ gcc/fortran/gfortran.h (working copy) @@ -3088,7 +3088,7 @@ bool gfc_check_init_expr (gfc_expr *); gfc_expr *gfc_build_conversion (gfc_expr *); void gfc_free_ref_list (gfc_ref *); void gfc_type_convert_binary (gfc_expr *, int); -int gfc_is_constant_expr (gfc_expr *); +bool gfc_is_constant_expr (gfc_expr *); bool gfc_simplify_expr (gfc_expr *, int); int gfc_has_vector_index (gfc_expr *); @@ -3180,7 +3180,7 @@ bool gfc_resolve_iterator (gfc_iterator *, bool, b bool find_forall_index (gfc_expr *, gfc_symbol *, int); bool gfc_resolve_index (gfc_expr *, int); bool gfc_resolve_dim_arg (gfc_expr *); -int gfc_is_formal_arg (void); +bool gfc_is_formal_arg (void); void gfc_resolve_substring_charlen (gfc_expr *); match gfc_iso_c_sub_interface(gfc_code *, gfc_symbol *); gfc_expr *gfc_expr_to_initialize (gfc_expr *); @@ -3218,7 +3218,7 @@ gfc_array_ref *gfc_find_array_ref (gfc_expr *); tree gfc_conv_array_initializer (tree type, gfc_expr *); bool spec_size (gfc_array_spec *, mpz_t *); bool spec_dimen_size (gfc_array_spec *, int, mpz_t *); -int gfc_is_compile_time_shape (gfc_array_spec *); +bool gfc_is_compile_time_shape (gfc_array_spec *); bool gfc_ref_dimen_size (gfc_array_ref *, int dimen, mpz_t *, mpz_t *); Index: gcc/fortran/resolve.c === --- gcc/fortran/resolve.c (revision 243609) +++ gcc/fortran/resolve.c (working copy) @@ -72,9 +72,9 @@ static bool first_actual_arg = false; static int omp_workshare_flag; -/* Nonzero if we are processing a for
[PATCH], Add PowerPC ISA 3.0 vec_vinsert4b and vec_vextract4b built-in functions
This patch adds support for the vec_vinsert4b and vec_vextract4b built-in functions that generate the ISA 3.0 XXINSERTW and XXEXTRACTUW/VEXTUW{L,R}X instructions. These functions are part of the PowerOpen 64-bit ELF V2 abi. In doing the work, I noticed the P9V built-in ternary functions incorrectly were declared to be binary. I have fixed these functions. The built-ins added are: long long vec_vextract4b (const vector signed char, const int); long long vec_vextract4b (const vector unsigned char, const int); vector signed char vec_insert4b (vector int, vector signed char, const int); vector unsigned char vec_insert4b (vector unsigned int, vector unsigned char, const int); vector signed char vec_insert4b (long long, vector signed char, const int); vector unsigned char vec_insert4b (long long, vector unsigned char, const int); Note, the ABI only adds the form of vec_insert4b that takes a vector int as the first argument. On little endian systems, you have to swap double words to get the desired element into the scalar position for the XXINSERTW instruction. I have added a GCC extension to alternatively take a long long (or long in 64-bit) for the value to be inserted, since IMHO, it makes the built-in much easier to use. I have done bootstrap builds on a 64-bit power8 little endian system and a 32/64-bit power7 big endian system. There were no regressions. Can I check this into the GCC trunk? [gcc] 2016-12-13 Michael Meissner * config/rs6000/predicates.md (const_0_to_11_operand): New predicate, match 0..11. * config/rs6000/rs6000-builtin.def (BU_P9V_VSX_3): Set built-in type to ternary, not binary. (BU_P9V_64BIT_VSX_3): Likewise. (P9V_BUILTIN_VEXTRACT4B): Add support for vec_vinsert4b and vec_extract4b non-overloaded built-in functions. (P9V_BUILTIN_VINSERT4B): Likewise. (P9V_BUILTIN_VINSERT4B_DI): Likewise. (P9V_BUILTIN_VEC_VEXTULX): Move to section that adds 2 operand ISA 3.0 built-in functions. (P9V_BUILTIN_VEC_VEXTURX): Likewise. (P9V_BUILTIN_VEC_VEXTRACT4B): Add support for overloaded vec_insert4b and vec_extract4 built-in functions. (P9V_BUILTIN_VEC_VINSERT4B): Likewise. * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add overloaded support for vec_vinsert4b and vec_extract4b. * config/rs6000/rs6000.c (altivec_expand_builtin): Add checks for the vec_insert4b and vec_extract4b byte number being a constant in the range 0..11. * config/rs6000/vsx.md (UNSPEC_XXEXTRACTUW): New unspec. (UNSPEC_XXINSERTW): Likewise. (vextract4b): Add support for the vec_vextract4b built-in function. (vextract4b_internal): Likewise. (vinsert4b): Add support for the vec_insert4b built-in function. Include both a version that inserts element 1 from a V4SI object and one that inserts a DI object. (vinsert4b_internal): Likewise. (vinsert4b_di): Likewise. (vinsert4b_di_internal): Likewise. * config/rs6000/altivec.h (vec_vinsert4b): Support vec_vinsert4b and vec_extract4b built-in functions. * doc/extend.doc (PowerPC VSX built-in functions): Document vec_insert4b and vec_extract4b. [gcc/testsuite] 2016-12-13 Michael Meissner * gcc.target/powerpc/p9-vinsert4b-1.c: New test. * gcc.target/powerpc/p9-vinsert4b-2.c: Likewise. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797 Index: gcc/config/rs6000/predicates.md === --- gcc/config/rs6000/predicates.md (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 243590) +++ gcc/config/rs6000/predicates.md (.../gcc/config/rs6000) (working copy) @@ -210,6 +210,11 @@ (define_predicate "const_0_to_7_operand" (and (match_code "const_int") (match_test "IN_RANGE (INTVAL (op), 0, 7)"))) +;; Match op = 0..11 +(define_predicate "const_0_to_11_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 0, 11)"))) + ;; Match op = 0..15 (define_predicate "const_0_to_15_operand" (and (match_code "const_int") Index: gcc/config/rs6000/rs6000-builtin.def === --- gcc/config/rs6000/rs6000-builtin.def (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 243590) +++ gcc/config/rs6000/rs6000-builtin.def(.../gcc/config/rs6000) (working copy) @@ -877,7 +877,16 @@ "__builtin_vsx_" NAME, /* NAME */ \ RS6000_BTM_P9_VECTOR, /* MASK */ \ (RS6000_BTC_ ## ATTR/* ATTR */ \ -| RS6000_
Re: [Patch, Fortran, cleanup] PR 78798: some int-valued functions should be bool
On Tue, Dec 13, 2016 at 8:13 PM, Janus Weil wrote: > Hi all, > > here is a straightforward cleanup patch that makes a few functions > return a bool instead of an int. Regtests cleanly on x86_64-linux-gnu. > Ok for trunk? Ok, thanks. -- Janne Blomqvist
Re: [PATCH, rs6000] Cost model adjustment
Hi Bill, On Tue, Dec 13, 2016 at 11:13:08AM -0600, Bill Schmidt wrote: > This patch makes a slight adjustment to the vectorization cost model that > was previously overlooked. > > Bootstrapped and tested on powerpc64le-unknown-linux gnu with no regressions. > Ok for trunk? Sure, thanks! > 2016-12-13 Bill Schmidt > > * config/rs6000/rs600.c (rs6000_builtin_vectorization_cost): > Adjust unaligned load cost.
[PATCH, Fortran, pr78780, v1] [7 Regression] [Coarray] ICE in conv_caf_send, at fortran/trans-intrinsic.c:1936
Hi all, attached patch fixes the issue by improving the check whether a call of the caf-runtime-routines needs to be generated instead of a regular assignment. Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk? - Andre -- Andre Vehreschild * Email: vehre ad gmx dot de gcc/testsuite/ChangeLog: 2016-12-13 Andre Vehreschild PR fortran/78780 * gfortran.dg/coarray/alloc_comp_5.f90: New test. gcc/fortran/ChangeLog: 2016-12-13 Andre Vehreschild PR fortran/78780 * trans-expr.c (gfc_trans_assignment_1): Improve check whether detour caf-runtime routines is needed. diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index 2f45d40..f908c25 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -9718,7 +9718,7 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag, bool scalar_to_array; tree string_length; int n; - bool maybe_workshare = false; + bool maybe_workshare = false, lhs_refs_comp = false, rhs_refs_comp = false; symbol_attribute lhs_caf_attr, rhs_caf_attr, lhs_attr; bool is_poly_assign; @@ -9758,8 +9758,8 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag, mode. */ if (flag_coarray == GFC_FCOARRAY_LIB) { - lhs_caf_attr = gfc_caf_attr (expr1); - rhs_caf_attr = gfc_caf_attr (expr2); + lhs_caf_attr = gfc_caf_attr (expr1, false, &lhs_refs_comp); + rhs_caf_attr = gfc_caf_attr (expr2, false, &rhs_refs_comp); } if (lss != gfc_ss_terminator) @@ -9959,10 +9959,19 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag, } else if (flag_coarray == GFC_FCOARRAY_LIB && lhs_caf_attr.codimension && rhs_caf_attr.codimension - && lhs_caf_attr.alloc_comp && rhs_caf_attr.alloc_comp) + && ((lhs_caf_attr.allocatable && lhs_refs_comp) + || (rhs_caf_attr.allocatable && rhs_refs_comp))) { + /* Only detour to caf_send[get][_by_ref] () when the lhs or rhs is an + allocatable component, because those need to be accessed via the + caf-runtime. No need to check for coindexes here, because resolve + has rewritten those already. */ gfc_code code; gfc_actual_arglist a1, a2; + /* Clear the structures to prevent accessing garbage. */ + memset (&code, '\0', sizeof (gfc_code)); + memset (&a1, '\0', sizeof (gfc_actual_arglist)); + memset (&a2, '\0', sizeof (gfc_actual_arglist)); a1.expr = expr1; a1.next = &a2; a2.expr = expr2; diff --git a/gcc/testsuite/gfortran.dg/coarray/alloc_comp_5.f90 b/gcc/testsuite/gfortran.dg/coarray/alloc_comp_5.f90 new file mode 100644 index 000..159f5a3 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/coarray/alloc_comp_5.f90 @@ -0,0 +1,16 @@ +! { dg-do run } + +program Jac + type Domain +integer :: n=64 +integer,allocatable :: endsi(:) + end type + type(Domain),allocatable :: D[:,:,:] + + allocate(D[2,2,*]) + allocate(D%endsi(2), source = 0) + D%endsi(2) = D%n + if (any(D%endsi /= [ 0, 64])) error stop + deallocate(D) +end program +
Re: [PATCH, Fortran, pr78780, v1] [7 Regression] [Coarray] ICE in conv_caf_send, at fortran/trans-intrinsic.c:1936
On Tue, Dec 13, 2016 at 07:32:33PM +0100, Andre Vehreschild wrote: > > attached patch fixes the issue by improving the check whether a call of the > caf-runtime-routines needs to be generated instead of a regular assignment. > > Bootstraps and regtests ok on x86_64-linux/f23. Ok for trunk? > > gcc/testsuite/ChangeLog: > > 2016-12-13 Andre Vehreschild > > PR fortran/78780 > * gfortran.dg/coarray/alloc_comp_5.f90: New test. > > gcc/fortran/ChangeLog: > > 2016-12-13 Andre Vehreschild > > PR fortran/78780 > * trans-expr.c (gfc_trans_assignment_1): Improve check whether detour > caf-runtime routines is needed. > OK. -- Steve
Re: [Patch, Fortran, cleanup] PR 78798: some int-valued functions should be bool
2016-12-13 19:19 GMT+01:00 Janne Blomqvist : > On Tue, Dec 13, 2016 at 8:13 PM, Janus Weil wrote: >> Hi all, >> >> here is a straightforward cleanup patch that makes a few functions >> return a bool instead of an int. Regtests cleanly on x86_64-linux-gnu. >> Ok for trunk? > > Ok, thanks. Thanks, Janne. Committed as r243621. Cheers, Janus
Re: [C++ PATCH] c++/78776 fix alias template ICE
On 12/13/2016 12:49 PM, Nathan Sidwell wrote: +/* Template information for an alias template type. */ +#define TYPE_ALIAS_TEMPLATE_INFO(NODE) \ + (DECL_LANG_SPECIFIC (TYPE_NAME (NODE)) \ + ? DECL_TEMPLATE_INFO (TYPE_NAME (NODE)) \ + : NULL_TREE) + +/* If NODE is a specialization of an alias template, this accessor + returns the template info for the alias template. Otherwise behave + as TYPE_TEMPLATE_INFO. */ +#define TYPE_TEMPLATE_INFO_MAYBE_ALIAS(NODE) \ + (TYPE_ALIAS_P (NODE) && DECL_LANG_SPECIFIC (TYPE_NAME (NODE)) \ + ? TYPE_ALIAS_TEMPLATE_INFO (NODE) \ + : TYPE_TEMPLATE_INFO (NODE)) Looks like this is checking DECL_LANG_SPECIFIC twice again; we should be able to drop the check from TYPE_TEMPLATE_INFO_MAYBE_ALIAS. /* An alias template name is never deduced. */ if (TYPE_ALIAS_P (arg)) arg = strip_typedefs (arg); - tree argvec = INNERMOST_TEMPLATE_ARGS (TYPE_TI_ARGS (arg)); + + tree tinfo = TYPE_TEMPLATE_INFO_MAYBE_ALIAS (arg); You don't need _MAYBE_ALIAS here. And without it, we should be able to drop the strip_typedefs. OK with those changes. Jason
Re: C++ PATCH to reject initializating flexible array members in constructors (PR c++/72775)
On 12/09/2016 07:46 PM, Martin Sebor wrote: On 12/09/2016 02:49 PM, Jason Merrill wrote: On Fri, Dec 9, 2016 at 11:33 AM, Martin Sebor wrote: For flexible array members, because they're not in C++, we get to make up the rules that make the most sense to us. IMO, they should fit in well with the rest of the language. I disagree; we should support C code, but flexible arrays don't really fit with the C++ object model, so I don't think trying to do anything clever with them in constructors is worthwhile. With the suggested approach the array becomes just an ordinary member. It's not a flexible array member anymore because its bound is deduced from the initializer (it just looks like one). The NSDMI char a[] = "foo" syntax is just a shorthand for char a[4] = "foo". That would mean that the size of the class varies with its initializer, which again gets into not fitting with the C++ object model. Jason
Re: [PATCH] PR 78534 Change character length from int to size_t
On Mon, Dec 12, 2016 at 9:39 PM, Janne Blomqvist wrote: > On Mon, Dec 12, 2016 at 7:39 PM, Andre Vehreschild wrote: >> Hi Janne, >> >> I found that you are favoring build_int_cst (size_type_node, 0) over >> size_zero_node. Is there a reason to this? > > Yes. AFAIU size_zero_node is a zero constant for sizetype which is not > the same as size_type_node. > > AFAIK the difference is that size_type_node is the C size_t type, > whereas sizetype is a GCC internal type used for address expressions. > On a "normal" target I understand that they are the same size, but > there are some slight differences in semantics, e.g. size_type_node > like C unsigned integer types is defined to wrap around on overflow > whereas sizetype is undefined on overflow. > > I don't know if GCC supports some strange targets with some kind of > segmented memory where the size of sizetype would be different from > size_type_node, but I guess it's possible in theory at least. > > That being said, now that you mention in I should be using > build_zero_cst (some_type_node) instead of > build_int_cst(some_type_node, 0). There's also build_one_cst that I > should use. > >> Furthermore did I have to patch this: >> >> diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c >> index 585f25d..f374558 100644 >> --- a/gcc/fortran/dump-parse-tree.c >> +++ b/gcc/fortran/dump-parse-tree.c >> @@ -465,7 +465,7 @@ show_expr (gfc_expr *p) >> break; >> >> case BT_HOLLERITH: >> - fprintf (dumpfile, "%dH", p->representation.length); >> + fprintf (dumpfile, "%zdH", p->representation.length); >> c = p->representation.string; >> for (i = 0; i < p->representation.length; i++, c++) >> { >> >> to bootstrap on x86_64-linux/f23. > > Ah, thanks for the catch. I'll fix it by using HOST_WIDE_INT_PRINT_DEC > since I'll have to change gfc_charlen_t to be a typedef form > HOST_WIDE_INT (see my answer to FX). > >> And I have this regression: >> >> FAIL: gfortran.dg/allocate_deferred_char_scalar_1.f03 -O1 (test for excess >> errors) >> >> allocate_deferred_char_scalar_1.f03:184:0: >> >> p = '12345679' >> >> Warning: '__builtin_memcpy' writing 8 bytes into a region of size 5 overflows >> the destination [-Wstringop-overflow=] >> allocate_deferred_char_scalar_1.f03:242:0: >> >> p = 4_'12345679' >> >> Warning: '__builtin_memcpy' writing 32 bytes into a region of size 20 >> overflows >> the destination [-Wstringop-overflow=] > > I'm seeing that too, but I assumed they would be fixed by Paul's > recent patch which I don't yet have in my tree yet due to the git > mirror being stuck.. > >> Btw, the patch for changing the ABI of the coarray-libs is already nearly >> done. >> I just need to figure that what the state of regressions is with and without >> my >> change. > > Thanks. > > I'll produce an updated patch with the changes discussed so far. > > > -- > Janne Blomqvist Hi, attached is the updated patch that applies on top of the original. I didn't do the charlen_zero_node etc, I just fixed the relatively few places in my previous patch rather than everywhere in the entire frontend. Now that the git mirror is working again, I see though those warnings with -O1 from gfortran.dg/allocate_deferred_char_scalar_1.f03 are still there, and Paul's patch didn't get rid of them. :(. I have looked at the tree dumps, however, and AFAICS it's a bogus warning. Ok for trunk? -- Janne Blomqvist diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c index 585f25d..7aafe54 100644 --- a/gcc/fortran/dump-parse-tree.c +++ b/gcc/fortran/dump-parse-tree.c @@ -348,12 +348,10 @@ show_constructor (gfc_constructor_base base) static void -show_char_const (const gfc_char_t *c, int length) +show_char_const (const gfc_char_t *c, gfc_charlen_t length) { - int i; - fputc ('\'', dumpfile); - for (i = 0; i < length; i++) + for (gfc_charlen_t i = 0; i < length; i++) { if (c[i] == '\'') fputs ("''", dumpfile); @@ -465,7 +463,8 @@ show_expr (gfc_expr *p) break; case BT_HOLLERITH: - fprintf (dumpfile, "%dH", p->representation.length); + fprintf (dumpfile, HOST_WIDE_INT_PRINT_DEC "H", + p->representation.length); c = p->representation.string; for (i = 0; i < p->representation.length; i++, c++) { diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 6a05e9e..e82c320 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -2067,10 +2067,12 @@ gfc_intrinsic_sym; typedef splay_tree gfc_constructor_base; -/* This should be a size_t. But occasionally the string length field - is used as a flag with values -1 and -2, see - e.g. gfc_add_assign_aux_vars. */ -typedef ptrdiff_t gfc_charlen_t; +/* This should be an unsigned variable of type size_t. But to handle + compiling to a 64-bit target from a 32-bit host, we need to use a + HOST_WIDE
Re: C++ PATCH to reject initializating flexible array members in constructors (PR c++/72775)
On 12/13/2016 12:06 PM, Jason Merrill wrote: On 12/09/2016 07:46 PM, Martin Sebor wrote: On 12/09/2016 02:49 PM, Jason Merrill wrote: On Fri, Dec 9, 2016 at 11:33 AM, Martin Sebor wrote: For flexible array members, because they're not in C++, we get to make up the rules that make the most sense to us. IMO, they should fit in well with the rest of the language. I disagree; we should support C code, but flexible arrays don't really fit with the C++ object model, so I don't think trying to do anything clever with them in constructors is worthwhile. With the suggested approach the array becomes just an ordinary member. It's not a flexible array member anymore because its bound is deduced from the initializer (it just looks like one). The NSDMI char a[] = "foo" syntax is just a shorthand for char a[4] = "foo". That would mean that the size of the class varies with its initializer, which again gets into not fitting with the C++ object model. Okay, I accept the challenge of finding a case where the size of a class depends on an initializer expression :) Here's one: constexpr long f (int) { return 1; } constexpr long f (long) { return LONG_MAX; }; struct A { // sizeof (A) == 4 enum E { i = f (0) } e; }; struct B { // sizeof (B) == 8 enum E { i = f (0L) } e; }; It's not exactly the same but I think it's close enough to argue that deducing the array size from its initializer wouldn't go against the established model. FWIW, I have a (vague) recollection of an incident where the enum size problem caused ABI breakage in a product (it was in C and involved simple macros). Because an enum is normally expected to be the same size as an int, having its size change based on the (hidden behind a macro) value of one of its enumerators is likely far more unexpected than having an array size depend on its its initializer. Martin
Re: [C++ PATCH] c++/78776 fix alias template ICE
On 12/13/2016 03:43 PM, Nathan Sidwell wrote: On 12/13/2016 02:03 PM, Jason Merrill wrote: OK with those changes. Done. +/* Template information for an alias template type. */ +#define TYPE_ALIAS_TEMPLATE_INFO(NODE) \ + (DECL_LANG_SPECIFIC (TYPE_NAME (NODE)) \ + ? DECL_TEMPLATE_INFO (TYPE_NAME (NODE)) \ + : NULL_TREE) + +/* If NODE is a specialization of an alias template, this accessor + returns the template info for the alias template. Otherwise behave + as TYPE_TEMPLATE_INFO. */ +#define TYPE_TEMPLATE_INFO_MAYBE_ALIAS(NODE) \ + (TYPE_ALIAS_P (NODE) && DECL_LANG_SPECIFIC (TYPE_NAME (NODE)) \ + ? DECL_TEMPLATE_INFO (TYPE_NAME (NODE)) \ + : TYPE_TEMPLATE_INFO (NODE)) I'm still seeing the redundant DECL_LANG_SPECIFIC check. Jason
[PATCH] Use the middle-end boolean_type_node
Use the boolean_type_node setup by the middle-end instead of redefining it. boolean_type_node is not used in GFortran for any ABI-visible stuff, only internally as the type of boolean expressions. There appears to be one exception to this, namely the caf_get* and caf_send* calls which have boolean_type_node arguments. However, on the library side they seem to use C _Bool, so I suspect this might be a case of a argument mismatch that hasn't affected anything so far. The practical effect of this is that the size of such variables will be the same as a C _Bool or C++ bool, that is, on most targets a single byte. Previously we redefined boolean_type_node to be a Fortran default logical kind sized variable, that is 4 or 8 bytes depending on compile options. This might enable slightly more compact code, in case the optimizer determines that the result of such a generated comparison expression needs to be stored in some temporary location rather than being used immediately. Regression tested on x86_64-pc-linux-gnu, Ok for trunk? 2016-12-13 Janne Blomqvist * trans-types.c (gfc_init_types): Don't redefine boolean type node. --- gcc/fortran/trans-types.c | 4 1 file changed, 4 deletions(-) diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c index 354308f..e8dafa0 100644 --- a/gcc/fortran/trans-types.c +++ b/gcc/fortran/trans-types.c @@ -961,10 +961,6 @@ gfc_init_types (void) wi::mask (n, UNSIGNED, TYPE_PRECISION (size_type_node))); - boolean_type_node = gfc_get_logical_type (gfc_default_logical_kind); - boolean_true_node = build_int_cst (boolean_type_node, 1); - boolean_false_node = build_int_cst (boolean_type_node, 0); - /* ??? Shouldn't this be based on gfc_index_integer_kind or so? */ gfc_charlen_int_kind = 4; gfc_charlen_type_node = gfc_get_int_type (gfc_charlen_int_kind); -- 2.7.4
Re: [patch, fortran] Fix ICEs with -fimplicit-none
Le 11/12/2016 à 23:21, Thomas Koenig a écrit : Hello world, the attached patch fixes a 5/6/7 regression with -fimplicit-none. Regression-tested. OK for all affected branches? OK. Thanks Mikael
Re: C++ PATCH to reject initializating flexible array members in constructors (PR c++/72775)
On 12/13/2016 03:25 PM, Martin Sebor wrote: On 12/13/2016 12:06 PM, Jason Merrill wrote: On 12/09/2016 07:46 PM, Martin Sebor wrote: On 12/09/2016 02:49 PM, Jason Merrill wrote: On Fri, Dec 9, 2016 at 11:33 AM, Martin Sebor wrote: For flexible array members, because they're not in C++, we get to make up the rules that make the most sense to us. IMO, they should fit in well with the rest of the language. I disagree; we should support C code, but flexible arrays don't really fit with the C++ object model, so I don't think trying to do anything clever with them in constructors is worthwhile. With the suggested approach the array becomes just an ordinary member. It's not a flexible array member anymore because its bound is deduced from the initializer (it just looks like one). The NSDMI char a[] = "foo" syntax is just a shorthand for char a[4] = "foo". That would mean that the size of the class varies with its initializer, which again gets into not fitting with the C++ object model. Okay, I accept the challenge of finding a case where the size of a class depends on an initializer expression :) Here's one: constexpr long f (int) { return 1; } constexpr long f (long) { return LONG_MAX; }; struct A { // sizeof (A) == 4 enum E { i = f (0) } e; }; struct B { // sizeof (B) == 8 enum E { i = f (0L) } e; }; It's not exactly the same but I think it's close enough to argue that deducing the array size from its initializer wouldn't go against the established model. I don't see the similarity; A and B both have constant size. Jason
Re: C++ PATCH to reject initializating flexible array members in constructors (PR c++/72775)
On 12/13/2016 02:32 PM, Jason Merrill wrote: On 12/13/2016 03:25 PM, Martin Sebor wrote: On 12/13/2016 12:06 PM, Jason Merrill wrote: On 12/09/2016 07:46 PM, Martin Sebor wrote: On 12/09/2016 02:49 PM, Jason Merrill wrote: On Fri, Dec 9, 2016 at 11:33 AM, Martin Sebor wrote: For flexible array members, because they're not in C++, we get to make up the rules that make the most sense to us. IMO, they should fit in well with the rest of the language. I disagree; we should support C code, but flexible arrays don't really fit with the C++ object model, so I don't think trying to do anything clever with them in constructors is worthwhile. With the suggested approach the array becomes just an ordinary member. It's not a flexible array member anymore because its bound is deduced from the initializer (it just looks like one). The NSDMI char a[] = "foo" syntax is just a shorthand for char a[4] = "foo". That would mean that the size of the class varies with its initializer, which again gets into not fitting with the C++ object model. Okay, I accept the challenge of finding a case where the size of a class depends on an initializer expression :) Here's one: constexpr long f (int) { return 1; } constexpr long f (long) { return LONG_MAX; }; struct A { // sizeof (A) == 4 enum E { i = f (0) } e; }; struct B { // sizeof (B) == 8 enum E { i = f (0L) } e; }; It's not exactly the same but I think it's close enough to argue that deducing the array size from its initializer wouldn't go against the established model. I don't see the similarity; A and B both have constant size. Sure, just as C and D below do today struct C { // sizeof (C) == 4 char a[4] = "abc"; }; struct D { // sizeof (D) == 8 char a[8] = "abcdefg"; }; and as would E and F below with what I suggest: struct E { // sizeof (E) == 4 char a[] = "abc"; }; struct F { // sizeof (F) == 8 char a[] = "abcdefg"; }; Where is the disconnect? I'm not suggesting that the size of an object of a class change depending on how the member array is initialized in each ctor initializer list (if that's how it came across). Martin
_Rb_tree regression
Hi I have been reported privately by Christophe in copy a regression resulting from my recent changes to _Rb_tree. I removed a constructor still necessary in C++03 mode or before. Tests would have shown it if I had run them in C++03. * include/bits/stl_tree.h (_Rb_tree_impl(const _Key_compare&, const _Node_allocator&): Restore before C++11 mode. I prefer to restore it only before C++11 while it used to be available in any mode. It is only call with rvalue references. Don't hesitate to commit it yourself if you prefer to fix it quickly otherwise I'll do it after validation in about 24 hours. François diff --git a/libstdc++-v3/include/bits/stl_tree.h b/libstdc++-v3/include/bits/stl_tree.h index 1bfbfa7..cb2c116 100644 --- a/libstdc++-v3/include/bits/stl_tree.h +++ b/libstdc++-v3/include/bits/stl_tree.h @@ -692,7 +692,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION , _Base_key_compare(__x._M_key_compare) { } -#if __cplusplus >= 201103L +#if __cplusplus < 201103L + _Rb_tree_impl(const _Key_compare& __comp, const _Node_allocator& __a) + : _Node_allocator(__a), _Base_key_compare(__comp) + { } +#else _Rb_tree_impl(_Rb_tree_impl&&) = default; _Rb_tree_impl(const _Key_compare& __comp, _Node_allocator&& __a) : _Node_allocator(std::move(__a)), _Base_key_compare(__comp)
[LRA] Fix ICE for paradoxical subregs on strict-alignment platforms
Hi, the Ada runtime library doesn't build on SPARC 32-bit with LRA because of an ICE on a couple of files: eric@polaris:~/build/gcc/sparc-sun-solaris2.10> gcc/gnat1 -I gcc/ada/rts - quiet -dumpbase g-debpoo.adb -auxbase g-debpoo -O2 -gnatpg -fPIC -mlra - mcpu=v9 g-debpoo.adb -o g-debpoo.s +===GNAT BUG DETECTED==+ | 7.0.0 20161213 (experimental) [trunk revision 243595] (sparc-sun- solaris2.10) GCC error:| | in lra_set_insn_recog_data, at lra.c:965 | | Error detected around g-debpoo.adb:1896:8 The problem is that curr_insn_transform attempts to reload subregs before reloading addresses: if (! check_only_p) /* Make equivalence substitution and memory subreg elimination before address processing because an address legitimacy can depend on memory mode. */ Now, on strict-alignment platforms, simplify_operand_subreg has a special provision for paradoxical subregs of memory references: /* If we change the address for a paradoxical subreg of memory, the address might violate the necessary alignment or the access might be slow. So take this into consideration. We need not worry about accesses beyond allocated memory for paradoxical memory subregs as we don't substitute such equiv memory (see processing equivalences in function lra_constraints) and because for spilled pseudos we allocate stack memory enough for the biggest corresponding paradoxical subreg. */ if (!(MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (mode) && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (reg))) || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode) && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg return true; That is to say, if the adjusted memory reference is not sufficient aligned, the routine will reload the original memory reference instead of the adjusted one on strict-alignment platforms. But there is a hitch: if the address of this original memory reference is invalid, the routine will nevertheless reload the entire memory reference, leading to the aforementioned ICE. The proposed fix is to reload the address, if it is invalid, just before reloading the memory reference in simplify_operand_subreg; this yields the correct sequence of reloads for the case I investigated. The patch also contains formatting fixes and adds a 'return' for the sake of clarity. Tested on x86-64/Linux and SPARC/Solaris with LRA, OK for the mainline? 2016-12-13 Eric Botcazou * lra-constraints.c (process_address): Add forward declaration. (simplify_operand_subreg): In the MEM case, if the adjusted memory reference is not sufficient aligned and the address was invalid, reload the address before reloading the original memory reference. Fix long lines and add a final return for the sake of clarity. -- Eric BotcazouIndex: lra-constraints.c === --- lra-constraints.c (revision 243595) +++ lra-constraints.c (working copy) @@ -1454,6 +1454,7 @@ insert_move_for_subreg (rtx_insn **befor } static int valid_address_p (machine_mode mode, rtx addr, addr_space_t as); +static bool process_address (int, bool, rtx_insn **, rtx_insn **); /* Make reloads for subreg in operand NOP with internal subreg mode REG_MODE, add new reloads for further processing. Return true if @@ -1480,13 +1481,13 @@ simplify_operand_subreg (int nop, machin type = curr_static_id->operand[nop].type; if (MEM_P (reg)) { - rtx subst; - + const bool addr_was_valid + = valid_address_p (innermode, XEXP (reg, 0), MEM_ADDR_SPACE (reg)); alter_subreg (curr_id->operand_loc[nop], false); - subst = *curr_id->operand_loc[nop]; + rtx subst = *curr_id->operand_loc[nop]; lra_assert (MEM_P (subst)); - if (! valid_address_p (innermode, XEXP (reg, 0), - MEM_ADDR_SPACE (reg)) + + if (!addr_was_valid || valid_address_p (GET_MODE (subst), XEXP (subst, 0), MEM_ADDR_SPACE (subst)) || ((get_constraint_type (lookup_constraint @@ -1503,10 +1504,10 @@ simplify_operand_subreg (int nop, machin ADDRESS, SCRATCH)][0]], MEM_ADDR_SPACE (subst { - /* If we change address for paradoxical subreg of memory, the + /* If we change the address for a paradoxical subreg of memory, the address might violate the necessary alignment or the access might - be slow. So take this into consideration. We should not worry - about access beyond allocated memory for paradoxical memory + be slow. So take this into consideration. We need not worry + about accesses beyond allocated memory for paradoxical memory subregs as we don't substitute suc
Re: [PATCH v3,rs6000] Add built-in function support for Power9 byte instructions
Hi Kelvin, On Mon, Dec 12, 2016 at 05:40:05PM -0700, Kelvin Nilsen wrote: > The patch has been bootstrapped and tested on > powerpc64le-unknown-linux and powerpc-unknown-linux (big-endian, with > both -m32 and -m64 target options) with no regressions. > > Is this ok for the trunk? Yes it is, much better, thanks! Two comments below, please fix the testcase one before commit if it is indeed a problem: > +;; Though the instructions to which this expansion maps operate on > +;; 64-bit registers, the current implementation only operates on > +;; SI-mode operands as the high-order bits provide no information > +;; that is not already available in the low-order bits. To avoid the > +;; costs of data widening operations, a future enhancement might add > +;; support for DI-mode operands. And operands[1] could be QImode. > +(define_expand "cmprb" > + [(set (match_dup 3) > + (unspec:CC [(match_operand:SI 1 "gpc_reg_operand" "r") > + (match_operand:SI 2 "gpc_reg_operand" "r")] > + UNSPEC_CMPRB)) > --- gcc/testsuite/gcc.target/powerpc/byte-in-set-1.c (revision 0) > +++ gcc/testsuite/gcc.target/powerpc/byte-in-set-1.c (working copy) Did you forget the scan-assembler here and in the next one, or do you only want to test it does indeed compile? Segher
Re: [PATCH] Add ISA 3.0 PowerPC support for VEXTU{B,H,W}{L,R}X instructions
On Tue, Dec 13, 2016 at 10:15:02AM -0500, Michael Meissner wrote: > This patch should address the comments in the last patch. > > I have tested this patch with bootstrap builds and make check regression tests > on a little endian Power8 64-bit system and a big endian Power7 32/64-bit > system with no regressions. Can I check this into the trunk? > + else if (mode == V8HImode) > + { > + rtx tmp_gpr_si = (GET_CODE (tmp_gpr) == SCRATCH > + ? dest_si > + : gen_rtx_REG (SImode, REGNO (tmp_gpr))); I think you have these the wrong way around? You didn't address the reload_completed on all the splitters yet; is there a reason for it? Okay with these fixed. Segher
Re: [PATCH], Add PowerPC ISA 3.0 vec_vinsert4b and vec_vextract4b built-in functions
Hi Mike, On Tue, Dec 13, 2016 at 01:16:56PM -0500, Michael Meissner wrote: > I have done bootstrap builds on a 64-bit power8 little endian system and a > 32/64-bit power7 big endian system. There were no regressions. Can I check > this into the GCC trunk? Yes, please apply. One remark below. > 2016-12-13 Michael Meissner > > * config/rs6000/predicates.md (const_0_to_11_operand): New > predicate, match 0..11. > * config/rs6000/rs6000-builtin.def (BU_P9V_VSX_3): Set built-in > type to ternary, not binary. > (BU_P9V_64BIT_VSX_3): Likewise. > (P9V_BUILTIN_VEXTRACT4B): Add support for vec_vinsert4b and > vec_extract4b non-overloaded built-in functions. > (P9V_BUILTIN_VINSERT4B): Likewise. > (P9V_BUILTIN_VINSERT4B_DI): Likewise. > (P9V_BUILTIN_VEC_VEXTULX): Move to section that adds 2 operand ISA > 3.0 built-in functions. > (P9V_BUILTIN_VEC_VEXTURX): Likewise. > (P9V_BUILTIN_VEC_VEXTRACT4B): Add support for overloaded > vec_insert4b and vec_extract4 built-in functions. > (P9V_BUILTIN_VEC_VINSERT4B): Likewise. > * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add > overloaded support for vec_vinsert4b and vec_extract4b. > * config/rs6000/rs6000.c (altivec_expand_builtin): Add checks for > the vec_insert4b and vec_extract4b byte number being a constant in > the range 0..11. > * config/rs6000/vsx.md (UNSPEC_XXEXTRACTUW): New unspec. > (UNSPEC_XXINSERTW): Likewise. > (vextract4b): Add support for the vec_vextract4b built-in > function. > (vextract4b_internal): Likewise. > (vinsert4b): Add support for the vec_insert4b built-in function. > Include both a version that inserts element 1 from a V4SI object > and one that inserts a DI object. > (vinsert4b_internal): Likewise. > (vinsert4b_di): Likewise. > (vinsert4b_di_internal): Likewise. > * config/rs6000/altivec.h (vec_vinsert4b): Support vec_vinsert4b > and vec_extract4b built-in functions. > * doc/extend.doc (PowerPC VSX built-in functions): Document > vec_insert4b and vec_extract4b. > > [gcc/testsuite] > 2016-12-13 Michael Meissner > > * gcc.target/powerpc/p9-vinsert4b-1.c: New test. > * gcc.target/powerpc/p9-vinsert4b-2.c: Likewise. > @@ -15766,6 +15766,40 @@ altivec_expand_builtin (tree exp, rtx ta > case VSX_BUILTIN_VEC_EXT_V1TI: >return altivec_expand_vec_ext_builtin (exp, target); > > +case P9V_BUILTIN_VEXTRACT4B: > +case P9V_BUILTIN_VEC_VEXTRACT4B: > + arg1 = CALL_EXPR_ARG (exp, 1); > + STRIP_NOPS (arg1); > + > + /* Generate a normal call if it is invalid. */ > + /* If we got invalid arguments bail out before generating bad rtl. */ If what is invalid? Just remove the first comment? Thanks, Segher
Re: [PATCH] Add ISA 3.0 PowerPC support for VEXTU{B,H,W}{L,R}X instructions
On Tue, Dec 13, 2016 at 04:29:36PM -0600, Segher Boessenkool wrote: > On Tue, Dec 13, 2016 at 10:15:02AM -0500, Michael Meissner wrote: > > This patch should address the comments in the last patch. > > > > I have tested this patch with bootstrap builds and make check regression > > tests > > on a little endian Power8 64-bit system and a big endian Power7 32/64-bit > > system with no regressions. Can I check this into the trunk? > > > + else if (mode == V8HImode) > > + { > > + rtx tmp_gpr_si = (GET_CODE (tmp_gpr) == SCRATCH > > + ? dest_si > > + : gen_rtx_REG (SImode, REGNO (tmp_gpr))); > > I think you have these the wrong way around? The rs6000_split_vec_extract_var function is called from several places in vsx.md to do a variable vector extract. In looking at each of the cases, there is a GPR tmp register for each of the calls, so I could modify it to move: gcc_assert (REG_P (tmp_gpr)); before the support for VEXTU{B,H,W}{L,R}X instructions, and leave the gcc_assert (REG_P (tmp_altivec)); and remove the test for SCRATCH. In the original version of the code, the non-variable case also called rs6000_split_vec_extract_var, and it did not have a scratch register. > You didn't address the reload_completed on all the splitters yet; is there > a reason for it? Because there are 4 different cases that generate wildy different code, based on what register class or memory is used: 1) For DImode, DFmode, and SFmode we could be extracting to a vector register and we would not use VEXTU{B,H,W}{L,R}X but instead do a VLSO and other operations; 2) Even if the result is in a GPR, DImode and DFmode have to do VLSO because there is no VEXTUD{L,R}X instruction that deposits the value in a GPR. Similarly, on ISA 2.07, we don't have those instructions, so we have to generate the more complex instructions; 3) The variable extract code also handles variable extracts that are stores; 4) Finally there is the new case where we are extracting to a GPR when we have ISA 3.0 instructions. Basically, until we know all of the details (i.e. after register allocator), we can't do the split, because the code is different. There is also the practical case that due to the funky way the scalar parts are not in the bottom part of the register, that SUBREG's really don't work between 64-bit and 128-bit items that go in vector registers. After register allocation, we can do gen_rtx_REG (, ) to change the types, but that really doesn't work before register allocator. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: [PATCH], Add PowerPC ISA 3.0 vec_vinsert4b and vec_vextract4b built-in functions
On Tue, Dec 13, 2016 at 04:40:20PM -0600, Segher Boessenkool wrote: > > + /* If we got invalid arguments bail out before generating bad rtl. > > */ > > If what is invalid? Just remove the first comment? I had copied the code from DST before, and it returns (const_int 0). However, normal built-ins can't return (const_int 0) because the rest of the compiler generates all sorts of errors. Instead after raising the error, you have to generate a normal function call. I'll remove the comment. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: [PATCH] fix integer overflow bugs in gimple-ssa-sprintf.c (PR 78608)
Ping: https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00262.html (I would have almost forgotten about this if it weren't for bug 78786. While working on a fix for it I keep thinking that some of the changes I'm making look like they should have already been made.) Thanks Martin On 12/02/2016 05:36 PM, Martin Sebor wrote: Bug 78608 - gimple-ssa-sprintf.c:570:17: runtime error: negation of -9223372036854775808 cannot be represented in type 'long int' points out an integer overflow bug in the pass caught by ubsan. The bug was due to negating a number without checking for equality to INT_MIN. In addition, my recent change to fix 78521 introduced a call to abs() that broke the Solaris bootstrap: https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00161.html While fixing these two problems I noticed that the rest of the pass wasn't handling the corner case of a width with the value of INT_MIN specified via an argument to the asterisk, such as in: int n = snprintf(0, 0, "%*i", INT_MIN, 0); This is undefined behavior because negative width is supposed to be treated as the left justification flag followed by a positive width (thus resulting in INT_MAX + 1 bytes). This problem affected all integer and floating point directives. Finally, while there, I decided to include in information messages a bit of detail about ranges of floating point values that was missing. I did this to help answer questions like those raised earlier this week by Gerald here ("where does the 317 come from?): https://gcc.gnu.org/ml/gcc/2016-11/msg00102.html The attached patch adjusts the pass to handle these problems. Martin