Re: [PATCH] 69780 - [4.9/5/6 Regression] ICE on __builtin_alloca_with_align, with small alignment
On Wed, Feb 17, 2016 at 09:20:01PM -0700, Martin Sebor wrote: > >The reason why MAX_STACK_ALIGNMENT is wrong is that on most targets > >it is terribly small number (a couple of bytes usually), only i?86/x86_64 is > >an exception, because it is the only target that supports dynamic stack > >realignment. > > I see. Thank you for the explanation. I've confirmed it in > an arm-eabi cross compiler where MAX_STACK_ALIGNMENT is 64. > > What I still don't understand is why a user-specified alignment > is being tested for inequality to MAX_STACK_ALIGNMENT in > check_cxx_fundamental_alignment_constraints (the code whose > example I followed): > > 7765 #undef MAX_TARGET_FIELD_ALIGNMENT > 7766 /* For stack variables, the target supports at most > 7767 MAX_STACK_ALIGNMENT. */ > 7768 else if (decl_function_context (node) != NULL > 7769 && requested_alignment > (max_align = MAX_STACK_ALIGNMENT)) > 7770 alignment_too_large_p = true; > > That would then seem also wrong, although I haven't been able to > trigger that code with a simple test case because the call to > decl_function_context() always returns null, so maybe the code > is never used. Bet that code predates PR33721 2010 Richard's changes. So indeed, it looks wrong now. > I introduced the check for the upper bound because larger alignment > values (1L << 32 and greater) also cause an ICE. Sure, I've said that the alignment in bits is passed in unsigned int argument, which is why I've suggested the (unsigned int) align != align check. > --- gcc/c-family/c-common.c (revision 233476) > +++ gcc/c-family/c-common.c (working copy) > @@ -9818,6 +9818,33 @@ check_builtin_function_arguments (tree f > >switch (DECL_FUNCTION_CODE (fndecl)) > { > +case BUILT_IN_ALLOCA_WITH_ALIGN: > + { > + /* Get the requested alignment (in bits) if it's a constant > +integer expression. */ > + unsigned HOST_WIDE_INT align = TREE_CODE (args[1]) == INTEGER_CST > + && tree_fits_uhwi_p (args[1]) ? tree_to_uhwi (args[1]) : 0; This has still wrong formatting and unnecessary INTEGER_CST check (tree_fits_uhwi_p checks that the argument is non-NULL INTEGER_CST and only then checks the value). The formatting is wrong, because the && would need to go below TREE_CODE in this case. > + /* Determine if the requested alignment is a power of 2. */ > + if ((align & (align - 1))) > + align = 0; > + > + /* The maximum alignment in bits corresponding to the same > +maximum in bytes enforced in check_user_alignment(). */ > + unsigned maxalign = (UINT_MAX >> 1) + 1; > + > + /* Reject invalid alignments. */ > + if (align < BITS_PER_UNIT || maxalign < align) > + { > + error_at (EXPR_LOC_OR_LOC (args [1], input_location), No space before [. > + "second argument to function %qE must be a constant " > + "integer power of 2 between %qi and %qu bits", > + fndecl, BITS_PER_UNIT, maxalign); > + return false; > + } > + return true; > + } This looks ok to me. The non-doc/ part of the changes are ok for trunk with the above mentioned changes, the doc/ part I'll leave to Joseph or others, really not sure what exactly we want to document and in what form. Jakub
Re: [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes
On Wed, Feb 17, 2016 at 7:51 PM, Andreas Krebbel wrote: > I'm having this patchset in my local tree for quite a while now. > Posting it was so far prevented by some internal process hurdles. I'm > aware it isn't stage 4 material. I nevertheless would like to commit > this since: > > * It is z13 only and z13 support was new in GCC 6 anyway. The risk to > cause regressions for other cpu levels is small (hopefully). > > * It is required to get rid of some nasty performance regressions > which can be observed with -march=z13 otherwise. > > Any objections? THe bugfixes are obviously fine, the rest is up to the s390x maintainers. Richard. > Bye, > > -Andreas- > > Andreas Krebbel (9): > S/390: Add IBM z13 pipeline description > S/390: z13 lcbb fix address operand. > S/390: z13 inline stpcpy implementation. > S/390: Adjust movstr-1.c testcase to work with the z13 stpcpy > implementation. > S/390: z13 fix mode in vcond expansion > S/390: Add vec_sub_u128 to vecintrin.h > S/390: z13 Change predicates of 128 bit add sub. > S/390: Add single element vector types to iterators. > S/390: z13 Add missing commutative operand markers. > > gcc/config/s390/2827.md| 9 +- > gcc/config/s390/2964.md| 64 > gcc/config/s390/s390-protos.h | 1 + > gcc/config/s390/s390.c | 381 > + > gcc/config/s390/s390.md| 19 +- > gcc/config/s390/vecintrin.h| 1 + > gcc/config/s390/vector.md | 60 ++-- > gcc/config/s390/vx-builtins.md | 56 +-- > gcc/testsuite/gcc.target/s390/md/movstr-1.c| 2 +- > gcc/testsuite/gcc.target/s390/md/movstr-2.c| 98 ++ > gcc/testsuite/gcc.target/s390/vector/int128-1.c| 47 +++ > gcc/testsuite/gcc.target/s390/vector/vec-vcond-1.c | 23 ++ > 12 files changed, 628 insertions(+), 133 deletions(-) > create mode 100644 gcc/config/s390/2964.md > create mode 100644 gcc/testsuite/gcc.target/s390/md/movstr-2.c > create mode 100644 gcc/testsuite/gcc.target/s390/vector/int128-1.c > create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-vcond-1.c > > -- > 1.9.1 >
Re: RFC: [Patch, PR Bug 60818] - ICE in validate_condition_mode on powerpc*-linux-gnu* ]
On Thu, Feb 18, 2016 at 09:41:49AM +1030, Alan Modra wrote: > On Wed, Feb 17, 2016 at 06:31:45AM -0600, Segher Boessenkool wrote: > > > Corresponding content of "op" which causes the ICE: > > > gdb) p debug_rtx (op) > > > (gtu:SI (reg:CC 166) -- (operator and mode doesn't > > > match) > > > (const_int 0 [0])) > > > > That is invalid RTL for this target (should be CCUNS). Invalid RTL > > should not be passed to recog. > > Really?? combine does that all the time, when it asks "is this > instruction valid"! No, it asks "is this valid RTL a valid insn for the target". Some extreme examples... (plus:SI (reg:SI) (nil)) blows up. (set:SI (reg:SI) (reg:SI)) is silently accepted. > > > (gdb) p debug_rtx (other_insn) > > > (insn 11 10 16 2 (set (reg:SI 165 [ D.2339+-3 ]) > > > (if_then_else:SI (ne (reg:CC 166) > > > (const_int 0 [0])) > > > (reg:SI 168) > > > (reg:SI 167))) test.c:7 317 {isel_unsigned_si} > > > (expr_list:REG_DEAD (reg:SI 168) > > > (expr_list:REG_DEAD (reg:SI 167) > > > (expr_list:REG_DEAD (reg:CC 166) > > > (expr_list:REG_EQUAL (gtu:SI (reg:CC 166) > > > (const_int 0 [0])) > > > (nil)) > > > > The REG_EQUAL there is bad already. Where does that come from? > > Rohit explain that quite well already, I thought. It's there due to > combine transforming a GTU to NE in another insn, which means the reg > mode changes to CCmode via rs6000.h:SELECT_CC_MODE. I meant where in the code exactly, and of course what is the problem there, fix *that*, because *that* is the problem. I have a patch btw. > You might argue that combine shouldn't create such a note, but whether > the note is valid or not depends on the target, doesn't it? The note was valid. Then the mode of the psuedo was changed, and now the note is no longer valid (or on some targets it still may be, targets that allow multiple CC_MODEs for one and the same comparison+modes). > And the > usual way for combine to check validity of rtl is to form up an > instruction and pass that to recog. Which is exactly what happens > later when combine tries to use the note and runs into the rs6000 > backend assert. No, that is not what happens. The bad note is formed at an *earlier* combination (19->8 with current trunk and the first testcase). That passes recog just fine -- and correctly so -- but recog doesn't even get to see the notes at all. There is nothing you can do in rs6000 code to stop this bug. You can only hide the resulting crash. Either combine should delete the note (my current patch), or it can change the note to a copy of the new expression. My big worry is, can there be a later insn that does not mention the reg in the pattern, but *does* have it in its own REG_EQ*? I don't see how it could be generated, but it's not really invalid either? > It seems quite plain to me that this is primarily an rs6000 backend > problem, solved by the blindingly obvious patch I posted. Whether you > want to do something in combine as well is a secondary problem. The > rs6000 backend shouldn't assert on this rtl. The rs6000 backend *should* assert on it: it is invcalid RTL, it will not work, it is meaningless. And in fact that assert *did* catch the bug here! Segher
Re: [RFC] [P2] [PR tree-optimization/33562] Lowering more complex assignments.
On Wed, Feb 17, 2016 at 5:10 PM, Jeff Law wrote: > On 02/17/2016 07:13 AM, Richard Biener wrote: >>> >>> - /* Continue walking until we reach a kill. */ >>> - while (!stmt_kills_ref_p (temp, ref)); >>> + /* Continue walking until we reach a full kill as a single statement >>> + or there are no more live bytes. */ >>> + while (!stmt_kills_ref_p (temp, ref) >>> +&& !(live_bytes && bitmap_empty_p (live_bytes))); >> >> >> Just a short quick comment - the above means you only handle partial >> stores >> with no interveaning uses. You don't handle, say >> >> struct S { struct R { int x; int y; } r; int z; } s; >> >> s = { {1, 2}, 3 }; >> s.r.x = 1; >> s.r.y = 2; >> struct R r = s.r; >> s.z = 3; >> >> where s = { {1, 2}, 3} is still dead. > > Right. But handling that has never been part of DSE's design goals. Once > there's a use, DSE has always given up. Yeah, which is why I in the end said we need a "better" DSE ... > Having said that... > >> >> That is, you don't make use of the live_bytes in the >> ref_maybe_used_by_stmt_p >> check (you can skip uses of only dead bytes). >> >> Not sure if it makes a difference in practice (compared to the cost it >> would take). > > Not sure either. It doesn't appear that it would be hard to experiment with > that to see if it's worth the effort. My gut feeling is we're not going to > see this often, if at all, in practice. Yeah, I think the case we're after and that happens most is sth like a = { aggregate init }; a.a = ...; a.b = ...; ... and what you add is the ability to remove the aggregate init completely. What would be nice to have is to remove it partly as well, as for struct { int i; int j; int k; } a = {}; a.i = 1; a.k = 3; we'd like to remove the whole-a zeroing but we need to keep zeroing of a.j. I believe that SRA already has most of the analysis part, what it is lacking is that SRA works not flow-sensitive (it just gathers function-wide data) and that it doesn't consider base objects that have their address taken or are pointer-based. >> >> Rather than building ao_refs in clear_bytes_written_by just use >> get_ref_base_and_extent >> directly. > > Easy enough to do, but ISTM if we use get_ref_base_and_extent in > clear_bytes_written-by, then the other blob of similar code in tree-ssa-dse > should be handled in the same way. ie, the code you see in > clear_bytes_written_by is almost a direct copy of code already existing in > tree-ssa-dse.c (hence my feeling that there's some refactoring of that code > that we want to do). > > > >> >> You don't handle stuff like >> >> s[i] = { 1, 2 }; >> s[i].x = 1; >> s[i].y = 1; >> >> either btw. > > Correct I believe. > > IIRC (I think I looked at this during debugging at some point), the > ao_ref->max_size field will cover the entire array for this kind of > situation because we don't know which element in the array we're hitting (or > -1 if we don't know the array's size). I don't see a reasonable way to > handle it with an ao_ref style interface unless the variable parts of the > address computation are all rolled into the ao_ref->base field. > > I did look for cases where the initial store was to a varying location and > thus max_size covered the entire array with killing stores that eventually > covered the entire array (but with each individual killing store having size > == max_size) -- the situation never came up in the codes I looked at (gcc & > its runtime libraries of course). I think the only way to handle this case is to "strip" a common base with a varying address and replace it for the sake of get_ref_base_and_extent (that is, tell get_ref_base_and_extent to stop at s[i] in this case). So you split the piece-alias-test into a same-base comparison plus offset/size ontop of that. That said, your patch addresses a very specific case nothing else in the compiler handles right now, but it's also quite limited. So evaluating the compile-time impact is important here as well as trying to cover more cases of "partial inits after full inits" optimization. I don't believe we need to rush this into GCC 6. Richard. > Jeff
[PATCH] Fix PR37448 (and dups?)
This fixes the IPA inline analysis quadraticness that arises when we inline functions into the same function many times via inline_to_all_callers as we do in the testcase for PR37448. In that case updating the overall summary of the caller is done for each inlined call instead of just once. The solution is to keep track of which nodes we inlined to and delay the summary update. Honza, I believe this is safe as the callers summary should only matter for further inline decisions and not this one (inlining into all callers). Double-checking is appreciated - there should be no code changes by this patch (I verified this on the PR37448 testcase). This brings down compile-time of ipa inlining heuristics from ipa inlining heuristics : 260.14 (84%) usr 0.50 (23%) sys 260.91 (84%) wall 102217 kB (10%) ggc (that's an optimized non-checking compiler) to ipa inlining heuristics : 3.52 ( 3%) usr 0.55 (12%) sys 4.21 ( 3%) wall 102216 kB (12%) ggc (that's an unoptimized, checking enabled compiler, "before" on that unoptimized compiler is ipa inlining heuristics : 935.06 (89%)). We still do a lot of inlining here so we see integration : 21.22 (17%) usr 0.45 (10%) sys 21.68 (17%) wall 142979 kB (17%) ggc which is supposedly expected (same unoptimized, checking enabled compiler). Bootstrap & regtest is running on x86_64-unknown-linux-gnu, ok for trunk and branch(es)? Thanks, Richard. 2016-02-18 Richard Biener PR ipa/37448 * ipa-inline.c (inline_to_all_callers_1): Add to the callers hash-set, do not update overall summaries here. Renamed from ... (inline_to_all_callers): ... this which is now wrapping the above and performing delayed overall summary update. Index: gcc/ipa-inline.c === *** gcc/ipa-inline.c(revision 233498) --- gcc/ipa-inline.c(working copy) *** flatten_function (struct cgraph_node *no *** 2163,2169 recursion. */ static bool ! inline_to_all_callers (struct cgraph_node *node, void *data) { int *num_calls = (int *)data; bool callee_removed = false; --- 2163,2170 recursion. */ static bool ! inline_to_all_callers_1 (struct cgraph_node *node, void *data, !hash_set *callers) { int *num_calls = (int *)data; bool callee_removed = false; *** inline_to_all_callers (struct cgraph_nod *** 2193,2199 inline_summaries->get (node->callers->caller)->size); } ! inline_call (node->callers, true, NULL, NULL, true, &callee_removed); if (dump_file) fprintf (dump_file, " Inlined into %s which now has %i size\n", --- 2194,2203 inline_summaries->get (node->callers->caller)->size); } ! /* Remember which callers we inlined to, delaying updating the !overall summary. */ ! callers->add (node->callers->caller); ! inline_call (node->callers, true, NULL, NULL, false, &callee_removed); if (dump_file) fprintf (dump_file, " Inlined into %s which now has %i size\n", *** inline_to_all_callers (struct cgraph_nod *** 2211,2216 --- 2215,2237 return false; } + /* Wrapper around inline_to_all_callers_1 doing delayed overall summary +update. */ + + static bool + inline_to_all_callers (struct cgraph_node *node, void *data) + { + hash_set callers; + bool res = inline_to_all_callers_1 (node, data, &callers); + /* Perform the delayed update of the overall summary of all callers + processed. This avoids quadratic behavior in the cases where + we have a lot of calls to the same function. */ + for (hash_set::iterator i = callers.begin (); +i != callers.end (); ++i) + inline_update_overall_summary (*i); + return res; + } + /* Output overall time estimate. */ static void dump_overall_stats (void)
Re: [PATCH, ARM] stop changing signedness in PROMOTE_MODE
On 17 February 2016 at 11:22, Kyrill Tkachov wrote: > > On 17/02/16 10:20, Christophe Lyon wrote: >> >> On 17 February 2016 at 11:05, Kyrill Tkachov >> wrote: >>> >>> On 17/02/16 10:03, Christophe Lyon wrote: On 15 February 2016 at 12:32, Kyrill Tkachov wrote: > > On 04/02/16 08:58, Ramana Radhakrishnan wrote: >> >> On Tue, Jun 30, 2015 at 2:15 AM, Jim Wilson >> wrote: >>> >>> This is my suggested fix for PR 65932, which is a linux kernel >>> miscompile with gcc-5.1. >>> >>> The problem here is caused by a chain of events. The first is that >>> the relatively new eipa_sra pass creates fake parameters that behave >>> slightly differently than normal parameters. The second is that the >>> optimizer creates phi nodes that copy local variables to fake >>> parameters and/or vice versa. The third is that the ouf-of-ssa pass >>> assumes that it can emit simple move instructions for these phi >>> nodes. >>> And the fourth is that the ARM port has a PROMOTE_MODE macro that >>> forces QImode and HImode to unsigned, but a >>> TARGET_PROMOTE_FUNCTION_MODE hook that does not. So signed char and >>> short parameters have different in register representations than >>> local >>> variables, and require a conversion when copying between them, a >>> conversion that the out-of-ssa pass can't easily emit. >>> >>> Ultimately, I think this is a problem in the arm backend. It should >>> not have a PROMOTE_MODE macro that is changing the sign of char and >>> short local variables. I also think that we should merge the >>> PROMOTE_MODE macro with the TARGET_PROMOTE_FUNCTION_MODE hook to >>> prevent this from happening again. >>> >>> I see four general problems with the current ARM PROMOTE_MODE >>> definition. >>> 1) Unsigned char is only faster for armv5 and earlier, before the >>> sxtb >>> instruction was added. It is a lose for armv6 and later. >>> 2) Unsigned short was only faster for targets that don't support >>> unaligned accesses. Support for these targets was removed a while >>> ago, and this PROMODE_MODE hunk should have been removed at the same >>> time. It was accidentally left behind. >>> 3) TARGET_PROMOTE_FUNCTION_MODE used to be a boolean hook, when it >>> was >>> converted to a function, the PROMOTE_MODE code was copied without the >>> UNSIGNEDP changes. Thus it is only an accident that >>> TARGET_PROMOTE_FUNCTION_MODE and PROMOTE_MODE disagree. Changing >>> TARGET_PROMOTE_FUNCTION_MODE is an ABI change, so only PROMOTE_MODE >>> changes to resolve the difference are safe. >>> 4) There is a general principle that you should only change >>> signedness >>> in PROMOTE_MODE if the hardware forces it, as otherwise this results >>> in extra conversion instructions that make code slower. The mips64 >>> hardware for instance requires that 32-bit values be sign-extended >>> regardless of type, and instructions may trap if this is not true. >>> However, it has a set of 32-bit instructions that operate on these >>> values, and hence no conversions are required. There is no similar >>> case on ARM. Thus the conversions are unnecessary and unwise. This >>> can be seen in the testcases where gcc emits both a zero-extend and a >>> sign-extend inside a loop, as the sign-extend is required for a >>> compare, and the zero-extend is required by PROMOTE_MODE. >> >> Given Kyrill's testing with the patch and the reasonably detailed >> check of the effects of code generation changes - The arm.h hunk is ok >> - I do think we should make this explicit in the documentation that >> TARGET_PROMOTE_MODE and TARGET_PROMOTE_FUNCTION_MODE should agree and >> better still maybe put in a checking assert for the same in the >> mid-end but that could be the subject of a follow-up patch. >> >> Ok to apply just the arm.h hunk as I think Kyrill has taken care of >> the testsuite fallout separately. > > Hi all, > > I'd like to backport the arm.h from this ( r233130) to the GCC 5 > branch. As the CSE patch from my series had some fallout on x86_64 > due to a deficiency in the AVX patterns that is too invasive to fix > at this stage (and presumably backport), I'd like to just backport > this arm.h fix and adjust the tests to XFAIL the fallout that comes > with not applying the CSE patch. The attached patch does that. > > The code quality fallout on code outside the testsuite is not > that gread. The SPEC benchmarks are not affected by not applying > the CSE change, and only a single sequence in a popular embedded > benchmark > shows some degradation for -mtune=cortex-a9 in the same way as the > wmul-1.c and wmul-2.c tests. > > I think that's a fair tradeoff for fixing the wrong code bug on
Re: [wwwdocs] Changes for LTO and IPA, ver 2
On 18/02/16 05:19 +0100, Jan Hubicka wrote: Hi, it seems I have updated the patch for comments received but did not send updated version to the ML. Here it is. Honza Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v retrieving revision 1.58 diff -c -3 -p -r1.58 changes.html *** changes.html15 Feb 2016 11:32:56 - 1.58 --- changes.html18 Feb 2016 04:16:50 - *** For more information, see the *** 50,55 --- 50,124 of array bounds. In particular, it enables -fsanitize=bounds as well as instrumentation of flexible array member-like arrays. + Type-based alias analysis now disambiguates accesses to different + pointers. This improves precision of the alias oracle by about 20-30% + on higher-level C++ programs. Programs doing invalid type punning + of pointer types may now need -fno-strict-aliasing + to work correctly. + Alias analysis now correctly supports weakref and + alias attributes. This makes it possible to access + both a variable and its alias in one translation unit which is common + with link-time optimization. + Value range propagation now assumes that this pointer s/assumes that/assumes that the/ + of C++ member functions is non-NULL. This eliminates + common NULL pointer checks I would just use "null" not "NULL". "NULL" is just a macro, but "null pointer" and "non-null" are the formal terms used in the C++ standard so I'd use them not NULL. Otherwise it looks good to me.
Re: RFC: Fix ARMv3 support
On 16 February 2016 at 18:19, Nick Clifton wrote: > Hi Richard, Hi Ramana, > > The ARM backend has some problems compiling for the old ARMv3 > architecture. See: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62254 > > for an example of this. v3 is very old now, and I am not sure how > much interest there is in continuing to support it, but I am trying to > help reduce the gcc priority bug list, so here goes... > > The attached patch fixes the problem, albeit not in a very subtle way. > The problem is that arm_reload_[out|in]_hi is being called for a > register->register move because the v3 architecture does not support > 16-bit register moves. Rather than trace the problem back to the real > source and fix it, I chose to just allow the reload functions to > generate an SImode register move instead. Probably not the best > solution, but it appears to work. > > The attached patch also includes the test cases derived from PR 62254 > and PR 69610 (which is a duplicate of PR 62254). Including all three > tests might be overkill, but it seemed like a good idea to be a little > bit paranoid, just in case. > > Whilst testing the patch I also discovered that interworking is > enabled by default, which is a problem for v3 code generation, so I > added a fix to (silently) disable interworking if the target > architecture does not support Thumb instructions. > > Any comments or criticisms before I apply the patch ? Hi Nick, Could you modify your new testcases, such that they call check_effective_target_arm_arm_ok ? Since you force -marm, you need to make sure that the target cpu actually supports it. I'm just realizing that we currently generate arm_arch_vX_ok for X >=4 only. Maybe you should also add v3? Thanks Christophe. > > Cheers > Nick > > gcc/ChangeLog > 2016-02-16 Nick Clifton > > PR target/62554 > PR target/69610 > * config/arm/arm.c (arm_option_override_internal): Disable > interworking if the target does not support thumb instructions. > (arm_reload_in_hi): Handle the case where a register to register > move needs reloading because there is no simple pattern to handle > it. > (arm_reload_out_hi): Likewise. > > gcc/testsuite/ChangeLog > 2016-02-16 Nick Clifton > > PR target/62554 > PR target/69610 > * gcc.target/arm/pr62554.c: New test. > * gcc.target/arm/pr69610-1.c: New test. > * gcc.target/arm/pr69610-2.c: New test. >
Re: [PATCH][AArch64][v2] Skip gcc.target/aarch64/assembler_arch_1.c if assembler does not support it
On 17 February 2016 at 17:06, Kyrill Tkachov wrote: > Hi all, > > I've thought about this check a bit more and I think we can compactly > auto-generate checks > for any aarch64 architecture extension support in the assembler. > This is done in a similar way we autogenerate the arm_arch_*_ok checks for > arm. > > So in this revision we autogenerate aarch64_asm__ok checks for every > architecture extension > using some of the expect machinery. This should make this approach a bit > more general to handle > checks for any .arch_extension argument without much extra cost. > > This still assumes that the assembler supports the .arch_extension > pseudo-op, the effective > target check will fail if it doesn't. This is what we want for this > testcase. > > Is this patch ok instead of > https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01052.html ? > Nice indeed. Regarding the doc, it's not accurate to say that the values of ext are defined in aarch64-option-extensions.def, since that file is not actually parsed by DJ. I mean there is no guarantee the two lists will be kept in sync. In the new test itself, I think that return [check_no_compiler_messages aarch64_lse_assembler object should be: return [check_no_compiler_messages aarch64_FUNC_assembler object for consistency although your patch is functional as-is. Thanks Christophe. > Thanks, > Kyrill > > 2016-02-17 Kyrylo Tkachov > > * lib/target-supports.exp: Define aarch64_asm_FUNC_ok checks > for fp, simd, crypto, crc, lse. > * doc/sourcebuild.texi (AArch64-specific attributes): Document the > above. > * gcc.target/aarch64/assembler_arch_1.c: Add aarch64_asm_lse_ok > effective target check.
Re: [PATCH] Fix driver handling of multiple -ftree-parallelize-loops= options (PR driver/69805)
On Wed, Feb 17, 2016 at 09:07:42AM -0700, Sandra Loosemore wrote: > On 02/17/2016 12:14 AM, Tom de Vries wrote: > > > >Here's the documentation entry for the gt spec function (I forgot to add > >it when introducing the function), using the new semantics. > > > >Copy-pasting from the resulting .info viewed in emacs for a > >human-readable version: > >... > > 'gt' > > The 'gt' (greater than) function takes one or more arguments. > > It returns either NULL or the empty string. If it has one > > argument, it returns NULL. If it has two arguments, it > > compares them: it returns the empty string if the first > > argument is greater than the second argument, otherwise it > > returns NULL. If it has more than two arguments, it behaves > > as if only the last two arguments were passed. It can be used > > f.i. as 'S' in a spec directive %{'S':'X'}: if 'S' is NULL, > > the empty string is substituted, and if 'S' is the empty > > string, 'X' is substituted. > > > >%:gt(%{fsome-option-value=*:%*} 1) > >... > > > >OK for stage4 trunk? > > I'm not an expert on spec strings but from a user perspective, what is > the difference between "NULL" and "the empty string"? The other spec > escapes are documented in terms of pattern substitutions at the point where > the escape appears in the spec string. NULL vs. "" is the internal (C level representation) of %:fn() function returned false vs. true for the purpose of e.g. %{%:fn():...}. Jakub
Re: [PATCH] Fix driver handling of multiple -ftree-parallelize-loops= options (PR driver/69805)
On 17/02/16 17:07, Sandra Loosemore wrote: On 02/17/2016 12:14 AM, Tom de Vries wrote: Here's the documentation entry for the gt spec function (I forgot to add it when introducing the function), using the new semantics. Copy-pasting from the resulting .info viewed in emacs for a human-readable version: ... 'gt' The 'gt' (greater than) function takes one or more arguments. It returns either NULL or the empty string. If it has one argument, it returns NULL. If it has two arguments, it compares them: it returns the empty string if the first argument is greater than the second argument, otherwise it returns NULL. If it has more than two arguments, it behaves as if only the last two arguments were passed. It can be used f.i. as 'S' in a spec directive %{'S':'X'}: if 'S' is NULL, the empty string is substituted, and if 'S' is the empty string, 'X' is substituted. %:gt(%{fsome-option-value=*:%*} 1) ... OK for stage4 trunk? I'm not an expert on spec strings but from a user perspective, what is the difference between "NULL" and "the empty string"? The other spec escapes are documented in terms of pattern substitutions at the point where the escape appears in the spec string. Hi Sandra, yes, that's a good point. I. The first time we used this type of "NULL"/"empty string" distinction was for the sanitize spec function, introduced here ( https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=0016189f560b1dc86734d36826be891562da1c46;hp=e9d78f5ab2b5a1cc50cd20ca2da199f83258b153 ) in the ubsan branch, which was later merged to trunk at r202113 (patch submitted here: https://gcc.gnu.org/ml/gcc-patches/2013-07/msg00510.html ). The commit adds this text to 'The Specs Language' bit in gcc.c: ... %{%:function(args):X} Call function named FUNCTION with args ARGS. If the function returns non-NULL, then X is substituted, if it returns NULL, it isn't substituted. ... I suppose we should document this behavior first in invoke.texi, and we could choose user-oriented names for the NULL/non-NULL distinction. II. Furthermore, the implementation seems to be inconsistent with the specification. If I modify greater_than_spec_func to return non-NULL, but not the empty string: ... @@ -9800,7 +9803,7 @@ greater_than_spec_func gcc_assert (converted != argv[2]); if (arg > lim) -return ""; +return "true"; return NULL; } ... I get: ... $ gcc hello.c -O2 -ftree-parallelize-loops=2 gcc: fatal error: switch ‘true’ does not start with ‘-’ compilation terminated. ... This (otherwise untested) patch fixes that: ... @@ -6068,10 +6068,13 @@ handle_spec_function /* p now points to just past the end of the spec function expression. */ funcval = eval_spec_function (func, args); - if (funcval != NULL && do_spec_1 (funcval, 0, NULL) < 0) -p = NULL; if (retval_nonnull) *retval_nonnull = funcval != NULL; + else +{ + if (funcval != NULL && do_spec_1 (funcval, 0, NULL) < 0) + p = NULL; +} free (func); free (args); ... III. So, a spec function can return: a: NULL, b: empty string, and c: non-empty string. In the context of evaluating a spec function as a condition, NULL designates false, empty string designates true and if we fix the problem mentioned at II then also non-empty string designates true, so: - false (a) - true (b, c) In the context of evaluating a spec function for the resulting string, NULL means an empty string (AFAICT), just like an empty string, so we have: - empty (a, b) - non-empty (c) I think that's going to be hard to explain to the user. It seems originally ( https://gcc.gnu.org/ml/gcc-patches/2013-06/msg01100.html ) the empty/non-empty string distinction was proposed for the semantics of '%{%:function(args):X}'. Perhaps we should consider a rewrite using that distinction. Thanks, - Tom
Re: Fix c/69522, memory management issue in c-parser
On 16 February 2016 at 16:37, Jeff Law wrote: > On 02/16/2016 07:21 AM, Bernd Schmidt wrote: >> >> On 02/08/2016 05:30 PM, Jeff Law wrote: >>> >>> On 01/29/2016 04:40 AM, Bernd Schmidt wrote: c/ PR c/69522 * c-parser.c (c_parser_braced_init): New arg outer_obstack. All callers changed. If nested_p is true, use it to call finish_implicit_inits. * c-tree.h (finish_implicit_inits): Declare. * c-typeck.c (finish_implicit_inits): New function. Move code from ... (push_init_level): ... here. (set_designator, process_init_element): Call finish_implicit_inits. testsuite/ PR c/69522 gcc.dg/pr69522.c: New test. >>> >>> OK. Thanks for tracking this down. >> >> >> Ok for branches too? > Hi, FWIW, I'm seeing the new test failing on the 4.9 branch: gcc/testsuite/gcc.dg/pr69522.c:2:8: error: struct has no members [-Wpedantic] gcc/testsuite/gcc.dg/pr69522.c:9:8: error: ISO C forbids empty initializer braces [-Wpedantic] > Yes. Definitely given it's a memory management issue. > > jeff
Re: [PATCH] Fix driver handling of multiple -ftree-parallelize-loops= options (PR driver/69805)
On 16/02/16 16:24, Jakub Jelinek wrote: Passing the - and ftree-parallelize-loops= stuff looks weird, and we have %* that substitutes just the variable part of the option, so in addition to fixing the case of multiple options I've also changed %:gt() behaviour, so that it now gets just the numbers and compares the last two of them. FTR, version-compare uses yet another variant: It does not expect to be called with an %{S*} or %{S*:%*} expansion, but passes S as an argument: ... %:version-compare(>= 10.3 mmacosx-version-min= -lmx) ... and the spec function itself takes care of finding the last value. Thanks, - Tom
Re: [wwwdocs] Changes for LTO and IPA, ver 2
Hi Honza, sorry for not being able to review the earlier version you had sent in a timely manner. This looks good (= please commit) once you made the two changes Jonathan suggested and the ones below. And impressive list of contributions!! On Thu, 18 Feb 2016, Jan Hubicka wrote: > + Linker plugin was extended to pass information about type of "The linker plugin..." > + binary produced to GCC back-end (that can be also manually controlled "to the GCC back end" (add an article, and omit the dash) > + properly confiugre code generator and support incremental "configure" "the code generator" > + Linking by ld -r will result in object file "in an object file" > + This delays the actual linking to the time final binary is produced "the final binary" > + and thus permits whole program optimization. Linking such object > + file is however slower. "such an object" > + Linking by gcc -r will lead to link time > optimization > + and produce final binary into the object file. Linking such object > + file is fast but avoids any benefirts from whole program > optimization. I believe it's "linking with" (also in other cases). And I'm not sure, "produce final binary into the object file" feels a little odd. "such an object file" or "such object files"? "benefits" Gerald
Re: Fix c/69522, memory management issue in c-parser
On Thu, Feb 18, 2016 at 11:53:04AM +0100, Christophe Lyon wrote: > FWIW, I'm seeing the new test failing on the 4.9 branch: > gcc/testsuite/gcc.dg/pr69522.c:2:8: error: struct has no members [-Wpedantic] > gcc/testsuite/gcc.dg/pr69522.c:9:8: error: ISO C forbids empty > initializer braces [-Wpedantic] 4.9 branch is missing https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00867.html I'll fix. Marek
Re: [RFC, PR68580] Handle pthread_create error in tsan testsuite
On 15/02/16 12:29, Bernd Edlinger wrote: Here is a patch that puts each value on it's own 8-byte aligned memory location. From my experience with tsan tests, sharing shadow memory slots between v and q or o is the most likely explanation for the occasional inability to spot the race condition on v, thus the test case fails, because the return code is 0, and the expected output is not found. Boot-strapped/regression tested on x86_64-linux-gnu. OK for trunk? Hi, Could you add 'PR testsuite/68580' to the log entry when committing? Thanks, - Tom patch-pr68580.diff 2016-02-15 Bernd Edlinger * c-c++-common/tsan/pr65400-1.c (v, q, o): Make 8-byte aligned. --- gcc/testsuite/c-c++-common/tsan/pr65400-1.c.jj 2015-03-19 08:53:38.0 +0100 +++ gcc/testsuite/c-c++-common/tsan/pr65400-1.c 2016-02-15 11:09:18.852320827 +0100 @@ -7,9 +7,9 @@ #include "tsan_barrier.h" static pthread_barrier_t barrier; -int v; -int q; -int o; +int v __attribute__((aligned(8))); +int q __attribute__((aligned(8))); +int o __attribute__((aligned(8))); extern void baz4 (int *); __attribute__((noinline, noclone)) int
[PATCH] Do not emit red stack zones for a fn with no_sanitize_address (PR sanitizer/69863)
Hi. Following patch was suggested by Jakub (and suggested to be installed in this stage4). I've been thinking about a test-case (which would require an assembler scan of red zone emission). Should I come up with a ?86 test-case that will scan that? Bootstrap and regression tests have been running. Ready after it successfully finishes? Thanks, Martin gcc/ChangeLog: 2016-02-18 Jakub Jelinek Martin Liska PR sanitizer/69863 * cfgexpand.c (asan_sanitize_stack_p): New function. (partition_stack_vars): Use the function. (expand_stack_vars): Likewise. (defer_stack_allocation): Likewise. (expand_used_vars): Likewise. --- gcc/cfgexpand.c | 20 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index 4ac8421..d7cb896 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -868,6 +868,18 @@ union_stack_vars (size_t a, size_t b) } } +/* Return true if a current function should be annotated for ASAN stack + protection. */ + +static inline bool +asan_sanitize_stack_p (void) +{ + return (flag_sanitize & SANITIZE_ADDRESS) +&& ASAN_STACK +&& !lookup_attribute ("no_sanitize_address", + DECL_ATTRIBUTES (current_function_decl)); +} + /* A subroutine of expand_used_vars. Binpack the variables into partitions constrained by the interference graph. The overall algorithm used is as follows: @@ -929,7 +941,7 @@ partition_stack_vars (void) sizes, as the shorter vars wouldn't be adequately protected. Don't do that for "large" (unsupported) alignment objects, those aren't protected anyway. */ - if ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK && isize != jsize + if (asan_sanitize_stack_p () && isize != jsize && ialign * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT) break; @@ -1120,7 +1132,7 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data) if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT) { base = virtual_stack_vars_rtx; - if ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK && pred) + if (asan_sanitize_stack_p () && pred) { HOST_WIDE_INT prev_offset = align_base (frame_offset, @@ -1491,7 +1503,7 @@ defer_stack_allocation (tree var, bool toplevel) /* If stack protection is enabled, *all* stack variables must be deferred, so that we can re-order the strings to the top of the frame. Similarly for Address Sanitizer. */ - if (flag_stack_protect || ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK)) + if (flag_stack_protect || asan_sanitize_stack_p ()) return true; unsigned int align = TREE_CODE (var) == SSA_NAME @@ -2191,7 +2203,7 @@ expand_used_vars (void) expand_stack_vars (stack_protect_decl_phase_2, &data); } - if ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK) + if (asan_sanitize_stack_p ()) /* Phase 3, any partitions that need asan protection in addition to phase 1 and 2. */ expand_stack_vars (asan_decl_phase_3, &data); -- 2.7.0
Re: Fix c/69522, memory management issue in c-parser
On 02/18/2016 12:22 PM, Marek Polacek wrote: On Thu, Feb 18, 2016 at 11:53:04AM +0100, Christophe Lyon wrote: FWIW, I'm seeing the new test failing on the 4.9 branch: gcc/testsuite/gcc.dg/pr69522.c:2:8: error: struct has no members [-Wpedantic] gcc/testsuite/gcc.dg/pr69522.c:9:8: error: ISO C forbids empty initializer braces [-Wpedantic] 4.9 branch is missing https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00867.html I'll fix. I suck. Bernd
Re: [PING] genattrab.c generate switch
On 01/19/2016 12:47 PM, Jesper Broge Jørgensen wrote: Here is the reformatted patch: This will probably have to wait until stage1. + const int code = GET_CODE (op2); + if (code != IOR) +{ + if (code == EQ_ATTR) All the formatting still looks completely mangled. This was probably done by your mailer. Please try attaching the diff as text/plain. Bernd
Re: [PATCH] Do not emit red stack zones for a fn with no_sanitize_address (PR sanitizer/69863)
On Thu, Feb 18, 2016 at 01:02:05PM +0100, Martin Liška wrote: > gcc/ChangeLog: > > 2016-02-18 Jakub Jelinek > Martin Liska > > PR sanitizer/69863 > * cfgexpand.c (asan_sanitize_stack_p): New function. > (partition_stack_vars): Use the function. > (expand_stack_vars): Likewise. > (defer_stack_allocation): Likewise. > (expand_used_vars): Likewise. > --- > gcc/cfgexpand.c | 20 > 1 file changed, 16 insertions(+), 4 deletions(-) > > diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c > index 4ac8421..d7cb896 100644 > --- a/gcc/cfgexpand.c > +++ b/gcc/cfgexpand.c > @@ -868,6 +868,18 @@ union_stack_vars (size_t a, size_t b) > } > } > > +/* Return true if a current function should be annotated for ASAN stack a/a/the/ s/annotated/instrumented/, perhaps better /* Return true if the current function should have its stack frame protected by address sanitizer. */ > + protection. */ > + > +static inline bool > +asan_sanitize_stack_p (void) > +{ > + return (flag_sanitize & SANITIZE_ADDRESS) > +&& ASAN_STACK > +&& !lookup_attribute ("no_sanitize_address", > + DECL_ATTRIBUTES (current_function_decl)); > +} Please fix up formatting here, the && should be aligned below flag_sanitize, like: return ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK && !lookup_attribute ("no_sanitize_address", DECL_ATTRIBUTES (current_function_decl))); Ok for trunk with those changes. Jakub
Re: RFC: Fix ARMv3 support
Hi Christophe, > Could you modify your new testcases, such that they call > check_effective_target_arm_arm_ok ? Good idea - done. > I'm just realizing that we currently generate arm_arch_vX_ok > for X >=4 only. Maybe you should also add v3? Possibly. I am not at all sure how important v3 support is any more (or support for any ARM architecture prior to v4t). Do you know of anyone who still needs it ? Cheers Nick
Re: RFC: Fix ARMv3 support
On 18 February 2016 at 14:20, Nick Clifton wrote: > Hi Christophe, > >> Could you modify your new testcases, such that they call >> check_effective_target_arm_arm_ok ? > > Good idea - done. > >> I'm just realizing that we currently generate arm_arch_vX_ok >> for X >=4 only. Maybe you should also add v3? > > Possibly. I am not at all sure how important v3 support is any more > (or support for any ARM architecture prior to v4t). Do you know of > anyone who still needs it ? > I am not aware of any such user, but the ARM maintainers would know better for sure. FWIW, the original bug report came from the kernel team, when building all the existing configurations. So it seems the Linux kernel still wants to support v3. > Cheers > Nick
Re: RFC: Fix ARMv3 support
On 18/02/16 13:28, Christophe Lyon wrote: On 18 February 2016 at 14:20, Nick Clifton wrote: Hi Christophe, Could you modify your new testcases, such that they call check_effective_target_arm_arm_ok ? Good idea - done. I'm just realizing that we currently generate arm_arch_vX_ok for X >=4 only. Maybe you should also add v3? Possibly. I am not at all sure how important v3 support is any more (or support for any ARM architecture prior to v4t). Do you know of anyone who still needs it ? I am not aware of any such user, but the ARM maintainers would know better for sure. FWIW, the original bug report came from the kernel team, when building all the existing configurations. So it seems the Linux kernel still wants to support v3. Judging by the Linaro issue linked to the BZ: https://bugs.launchpad.net/gcc-linaro/+bug/1307197 the armv3 build was part of testing with random build configurations, so I don't think that's an indication that real kernels care about this. Kyrill Cheers Nick
Re: [PING] genattrab.c generate switch
On 18/02/16 13:22, Bernd Schmidt wrote: On 01/19/2016 12:47 PM, Jesper Broge Jørgensen wrote: Here is the reformatted patch: This will probably have to wait until stage1. + const int code = GET_CODE (op2); + if (code != IOR) +{ + if (code == EQ_ATTR) All the formatting still looks completely mangled. This was probably done by your mailer. Please try attaching the diff as text/plain. Bernd Here it is as an attatchment. I set deliveryformat to "Plain Text Only" in thunderbird. Please let me know if anything else needs to be changed. gcc/ChangeLog: 2016-01-19 Jesper Broge Jørgensen * genattrtab.c (check_attr_set_switch): New function (write_attr_set): Write a switch instead of if condition if possible diff --git a/gcc/genattrtab.c b/gcc/genattrtab.c index 2caf8f6..8e7f9e6 100644 --- a/gcc/genattrtab.c +++ b/gcc/genattrtab.c @@ -4113,6 +4113,103 @@ eliminate_known_true (rtx known_true, rtx exp, int insn_code, int insn_index) return exp; } +/* Check if exp contains a series of IOR conditions on the same attr_name. + If it does it can be turned into a switch statement and returns true. + If write_cases is true it will write the cases of the switch to outf. */ + +static int +check_attr_set_switch (FILE *outf, rtx exp, unsigned int attrs_cached, + int write_cases, int indent) +{ + if (GET_CODE (exp) != IOR) +return 0; + if (GET_CODE (XEXP (exp, 0)) != EQ_ATTR) +return 0; + + rtx next = exp; + int ior_depth = 0; + int is_first = 1; + + const char *attr_name_cmp = XSTR (XEXP (exp, 0), 0); + + while (1) +{ + rtx op1 = XEXP (next, 0); + rtx op2 = XEXP (next, 1); + + if (GET_CODE (op1) != EQ_ATTR) + return 0; + + const char *attr_name = XSTR (op1, 0); + const char *cmp_val = XSTR (op1, 1); + + /* pointer compare is enough. */ + if (attr_name_cmp != attr_name) + return 0; + + if (write_cases) + { + struct attr_desc *attr = find_attr (&attr_name, 0); + gcc_assert (attr); + if (is_first) + { + fprintf (outf, "("); + is_first = 0; + int i; + for (i = 0; i < cached_attr_count; i++) + if (attr->name == cached_attrs[i]) + break; + + if (i < cached_attr_count && (attrs_cached & (1U << i)) != 0) + fprintf (outf, "cached_%s", attr->name); + else if (i < cached_attr_count && + (attrs_to_cache & (1U << i)) != 0) + fprintf (outf, "(cached_%s = get_attr_%s (insn))", attr->name, +attr->name); + else + fprintf (outf, "get_attr_%s (insn)", attr->name); + fprintf (outf, ")\n"); + write_indent (outf, indent); + fprintf (outf, "{\n"); + } + write_indent (outf, indent); + fprintf (outf, "case "); + write_attr_valueq (outf, attr, cmp_val); + fprintf (outf, ":\n"); + } + + const int code = GET_CODE (op2); + if (code != IOR) + { + if (code == EQ_ATTR) + { + const char *attr_name = XSTR (op2, 0); + const char *cmp_val = XSTR (op2, 1); + + if (attr_name == alternative_name) + return 0; + + struct attr_desc *attr = find_attr (&attr_name, 0); + gcc_assert (attr); + + if (attr->is_const) + return 0; + else if (write_cases) + { + write_indent (outf, indent); + fprintf (outf, "case "); + write_attr_valueq (outf, attr, cmp_val); + fprintf (outf, ":\n"); + } + } + break; + } + next = op2; + ior_depth++; +} + return ior_depth > 2; +} + /* Write out a series of tests and assignment statements to perform tests and sets of an attribute value. We are passed an indentation amount and prefix and suffix strings to write around each attribute value (e.g., "return" @@ -4123,6 +4220,7 @@ write_attr_set (FILE *outf, struct attr_desc *attr, int indent, rtx value, const char *prefix, const char *suffix, rtx known_true, int insn_code, int insn_index, unsigned int attrs_cached) { + int n_switches = 0; if (GET_CODE (value) == COND) { /* Assume the default value will be the default of the COND unless we @@ -4132,6 +4230,7 @@ write_attr_set (FILE *outf, struct attr_desc *attr, int indent, rtx value, rtx newexp; int first_if = 1; int i; + int is_switch = 0; if (cached_attr_count) { @@ -4176,40 +4275,68 @@ write_attr_set (FILE *outf, struct attr_desc *attr, int indent, rtx value, if (inner_true == false_rtx) continue; + is_switch = check_attr_set_switch (outf, testexp, attrs_cached, 0,
Re: [PATCH] Add debug_function_graph_to_file
On Wed, Feb 17, 2016 at 03:00:00PM +0100, Richard Biener wrote: > On Wed, Feb 17, 2016 at 2:51 PM, Marek Polacek wrote: > > On Wed, Feb 17, 2016 at 02:45:36PM +0100, Richard Biener wrote: > >> OTOH I have in my local trees a more convenient form (attached). > >> > >> (gdb) call debug_dot_cfg (cfun, 1<<6) > >> > >> and a X window with the dotted graph opens. > > > > Is there any chance we could get this into the mainline? I'd love to use > > this without having to patch my tree locally. If I remember correctly > > there were some issues with unportable popen()? > > I think I never posted it for inclusion. The ??? would also need > investigation, > probably using a temporary file is easier (though would require manual > cleaning). > > What does it take to write it in python instead? It should be fairly easy. I played with python in gdb more than two years ago and forgot most of it, but the following script still works and still can use dot to generate and feh to immediately display call graph from within gdb, the newly installed gdb command for that is called symtab_visualize: https://github.com/jamborm/gcc-util/blob/master/python/gdbsymtab.py And I also think that using python is the way to go. The only downside is that you only find out that gcc and the script are out of sync when debugging no longer works and fails in mysterious ways and fixing the python script can be a daunting task if you are not its author or have long forgotten how it worked (or how the whole gdb-python thing worked). (And yes, I know it is bad I have my own gdb python scripts and don't use gcc/gdbhooks.py. I forgot why that happened.) Martin
Re: [PATCH][AArch64][v2] Skip gcc.target/aarch64/assembler_arch_1.c if assembler does not support it
On Thu, Feb 18, 2016 at 11:31:02AM +0100, Christophe Lyon wrote: > On 17 February 2016 at 17:06, Kyrill Tkachov > wrote: > > Hi all, > > > > I've thought about this check a bit more and I think we can compactly > > auto-generate checks > > for any aarch64 architecture extension support in the assembler. > > This is done in a similar way we autogenerate the arm_arch_*_ok checks for > > arm. > > > > So in this revision we autogenerate aarch64_asm__ok checks for every > > architecture extension > > using some of the expect machinery. This should make this approach a bit > > more general to handle > > checks for any .arch_extension argument without much extra cost. > > > > This still assumes that the assembler supports the .arch_extension > > pseudo-op, the effective > > target check will fail if it doesn't. This is what we want for this > > testcase. > > > > Is this patch ok instead of > > https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01052.html ? > > > Nice indeed. > > Regarding the doc, it's not accurate to say that the values of ext > are defined in aarch64-option-extensions.def, since that file is not > actually parsed by DJ. I mean there is no guarantee the two lists > will be kept in sync. > > In the new test itself, I think that > return [check_no_compiler_messages aarch64_lse_assembler object > should be: > return [check_no_compiler_messages aarch64_FUNC_assembler object > > for consistency although your patch is functional as-is. Agreed. OK with that change. Thanks, James > > 2016-02-17 Kyrylo Tkachov > > > > * lib/target-supports.exp: Define aarch64_asm_FUNC_ok checks > > for fp, simd, crypto, crc, lse. > > * doc/sourcebuild.texi (AArch64-specific attributes): Document the > > above. > > * gcc.target/aarch64/assembler_arch_1.c: Add aarch64_asm_lse_ok > > effective target check. >
Re: [PATCH] Add debug_function_to_file
On 17/02/16 14:42, Richard Biener wrote: On Wed, Feb 17, 2016 at 1:41 PM, Tom de Vries wrote: Hi, once in a while I'm in a gdb debug session debugging cc1, and want to print the current function to file. There's a debug function debug_function that prints a function to stderr, and there are methods to redirect output of a command to a file ( https://sourceware.org/gdb/onlinedocs/gdb/Logging-Output.html ). And there's a function dump_function_to_file that takes a FILE* parameter, which could be combined with open/close calls in gdb. But I think a short-hand is easier. This patch adds a function debug_function_to_file. It can f.i. be called as: ... (gdb) call debug_function_to_file (cfun.decl, "foo.1.txt", 0) ... Hmm, now I wonder if the order 'cfun.decl, 0, "foo.1.txt"' would make more sense (first two parameters the same as in debug_function). OK for stage1 trunk if bootstrap and reg-test succeeds? Bonus for making this a helper in gdbhooks.py instead, using fopen/fclose and the existing inferior calls. Using attached demonstrator patch, we can use command debug-function-to-file in gdb: ... (gdb) python import sys; sys.path.append('/gcc'); import gdbhooks Successfully loaded GDB hooks for GCC (gdb) b tail_merge_optimize Breakpoint 1 at 0x11be1cc: file gcc/tree-ssa-tail-merge.c, line 1633. (gdb) r Starting program: cc1 -quiet hello.c -O2 -o hello.s Breakpoint 1, tail_merge_optimize (todo=0) at gcc/tree-ssa-tail-merge.c:1633 1633 int nr_bbs_removed_total = 0; (gdb) debug-function-to-file "bla.txt" $1 = 47180688 $2 = $3 = 0 ... and we find that bla.txt contains the result of dump_function_to_file: ... $ cat bla.txt main () { : __builtin_puts (&"hello"[0]); return 0; } ... I would be nice if we could avoid the ${1,2,3} printouts and value history assignments, but I'm not sure how to do that. And given that we have a command with a variable amount of arguments, we should try to handle different number of arguments, say these three giving the same result: - debug-function-to-file "bla.txt" - debug-function-to-file "bla.txt" cfun->decl - debug-function-to-file "bla.txt" cfun->decl dump_flags Or perhaps 0 for flags by default, as in pcfun. Thanks, - Tom Add debug-function-to-file to gdbhooks.py --- gcc/gdbhooks.py | 18 ++ 1 file changed, 18 insertions(+) diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py index 0d5cc97..9c230f9 100644 --- a/gcc/gdbhooks.py +++ b/gcc/gdbhooks.py @@ -589,4 +589,22 @@ class BreakOnPass(gdb.Command): BreakOnPass() +class debug_function_to_file(gdb.Command): +""" +""" +def __init__(self): +gdb.Command.__init__(self, 'debug-function-to-file', gdb.COMMAND_USER) + +def invoke(self, arg, from_tty): +gdb.execute("call fopen (%s, \"w\")" % arg) +fp = gdb.history (0) +gdb.execute("p cfun->decl") +fn = gdb.history (0) +flags = 0 +dumpargs = "(tree)%u, (FILE *)%u, %u" % (fn, fp, flags) +gdb.execute("call dump_function_to_file (%s)" % dumpargs) +gdb.execute("call fclose ((FILE *)%u)" % fp) + +debug_function_to_file() + print('Successfully loaded GDB hooks for GCC')
[PATCH][GCC7] Remove scaling of COMPONENT_REF/ARRAY_REF ops 2/3
The following experiment resulted from looking at making array_ref_low_bound and array_ref_element_size non-mutating. Again I wondered why we do this strange scaling by offset/element alignment. This exposes sth to GIMPLE (the division) that is in the end not necessary while costing extra tree building/folding each time array_ref_low_bound or array_ref_element_size are called (for variable size cases, of course). And those are called a lot via get_inner_reference and get_ref_base_and_extent. So - the following patch gets rid of that scaling. For a "simple" C testcase void bar (void *); void foo (int n) { struct S { struct R { int b[n]; } a[2]; int k; } s; s.k = 1; s.a[1].b[7] = 3; bar (&s); } the difference in .gimple (and also pre expand at -O1) is: @@ -95,10 +93,8 @@ D.1782 = D.1770 * 8; D.1788 = D.1782 + 4; s.1 = __builtin_alloca_with_align (D.1788, 32); - D.1790 = D.1782 /[ex] 4; - s.1->k{off: D.1790 * 4} = 1; - D.1791 = D.1773 /[ex] 4; - s.1->a[1]{lb: 0 sz: D.1791 * 4}.b[7] = 3; + s.1->k{off: D.1782} = 1; + s.1->a[1]{lb: 0 sz: D.1773}.b[7] = 3; s.2 = s.1; bar (s.2); and the generated initial RTL is much smaller: -;; s.1_10->k{off: _11 * 4} = 1; +;; s.1_10->k{off: _7} = 1; -(insn 21 20 22 (parallel [ -(set (reg:DI 108) -(lshiftrt:DI (reg:DI 90 [ _7 ]) -(const_int 2 [0x2]))) -(clobber (reg:CC 17 flags)) -]) t.c:5 -1 - (expr_list:REG_EQUAL (udiv:DI (reg:DI 90 [ _7 ]) -(const_int 4 [0x4])) -(nil))) - -(insn 22 21 0 (set (mem/c:SI (plus:DI (mult:DI (reg:DI 108) -(const_int 4 [0x4])) -(reg/f:DI 92 [ s.1 ])) [3 s.1_10->k{off: _11 * 4}+0 S4 A32]) +(insn 20 19 0 (set (mem/c:SI (plus:DI (reg/f:DI 91 [ s.1 ]) +(reg:DI 89 [ _7 ])) [3 s.1_10->k{off: _7}+0 S4 A32]) (const_int 1 [0x1])) t.c:5 -1 (nil)) ... -;; s.1_10->a[1]{lb: 0 sz: _13 * 4}.b[7] = 3; +Applying pattern match.pd:83, generic-match.c:9034 -(insn 23 22 24 (parallel [ -(set (reg:DI 109) -(ashift:DI (reg:DI 88 [ _5 ]) -(const_int 2 [0x2]))) -(clobber (reg:CC 17 flags)) -]) t.c:4 -1 - (nil)) - -(insn 24 23 25 (parallel [ -(set (reg:DI 110) -(lshiftrt:DI (reg:DI 109) -(const_int 2 [0x2]))) -(clobber (reg:CC 17 flags)) -]) t.c:6 -1 - (expr_list:REG_EQUAL (udiv:DI (reg:DI 109) -(const_int 4 [0x4])) -(nil))) +;; s.1_10->a[1]{lb: 0 sz: _6}.b[7] = 3; -(insn 25 24 26 (parallel [ -(set (reg:DI 111) -(plus:DI (reg:DI 110) -(const_int 7 [0x7]))) -(clobber (reg:CC 17 flags)) -]) t.c:6 -1 - (nil)) - -(insn 26 25 0 (set (mem:SI (plus:DI (mult:DI (reg:DI 111) -(const_int 4 [0x4])) -(reg/f:DI 92 [ s.1 ])) [3 s.1_10->a[1]{lb: 0 sz: _13 * 4}.b+28 S4 A32]) +(insn 21 20 0 (set (mem:SI (plus:DI (plus:DI (mult:DI (reg:DI 87 [ _5 ]) +(const_int 4 [0x4])) +(reg/f:DI 91 [ s.1 ])) +(const_int 28 [0x1c])) [3 s.1_10->a[1]{lb: 0 sz: _6}.b+28 S4 A32]) (const_int 3 [0x3])) t.c:6 -1 (nil)) assembler difference is @@ -11,12 +11,11 @@ movq%rsp, %rbp .cfi_def_cfa_register 6 movslq %edi, %rax - leaq0(,%rax,8), %rcx - leaq22(%rcx), %rdx + leaq22(,%rax,8), %rdx andq$-16, %rdx subq%rdx, %rsp movq%rsp, %rdi - movl$1, (%rsp,%rcx) + movl$1, (%rsp,%rax,8) movl$3, 28(%rsp,%rax,4) callbar leave which may be a mixed bag - I suppose the testcase is too simple to evaluate code changes. I tried to go back in time and finding relevant discussions about this but didn't find anything as the whole (very large) discussion ended up about increasing the size of the COMPONENT_REF / ARRAY_REF trees So - I hope somebody from Adacore can evaluate this patch code-generation wise. Bootstrapped and tested on x86_64-unknown-linux-gnu. Queued for GCC 7 if I hear nothing negative from the Ada folks. Back to making array_ref_element_size/array_ref_low_bound non-mutating (for the PR69553 fix). Bah, which doesn't work - substituting placeholders during gimplification only - as PLACEHOLDER_EXPRs leak into the IL via folding (calling get_inner_reference for example) before gimplification. Even the ADA FE itself does this. Thanks, Richard. 2016-02-18 Richard Biener * tree.def (COMPONENT_REF): Adjust docs for operand 2. (ARRAY_REF): Adjust docs for operand 3. * gimplify.c (gimplify_compound_lval): Remove scaling of COMPONENT_REF and ARRAY_REF operands. * tree.c (array_ref_element_size): Remove scaling of operand 3. (compone
[PATCH, rs6000] PR 66337: Improve Compliance with Power ABI
This patch has bootstrapped and tested on powerpc64le-unknown-linux-gnu and powerpc64be-unknown-linux-gnu (both 32-bit and 64-bit) and powerpc64-unknown-freebsd11.0 (big endian) with no regressions. Is it ok to fix this on the trunk? The problem described in PR66337 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D66337) is that compiling for PowerPC targets with the -malign-power command-line option results in invalid field offsets for certain structure members. As identified in the problem report, this results from a macro definition present in both config/rs6000/{freebsd64,linux64}.h, which in both cases is introduced by the comment: /* PowerPC64 Linux word-aligns FP doubles when -malign-power is given. */ I have consulted various ABI documents, including "64-bit PowerPC ELF Application Binary Interface Supplement 1.9" (http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.html), "Power Architecture 64-Bit ELF V2 ABI Specification" (https://members.openpowerfoundation.org/document/dl/576), and "Power Architecture(R) 32-bit Application Binary Interface Supplement 1.0 - Linux(R) & Embedded" (https://www.power.org/documentation/power-architecture-32-bit-abi-supplement-1-0-embeddedlinuxunified/). I have not been able to find any basis for this comment and thus am concluding that the comment and existing implementation are incorrect. The implemented patch removes the comment and changes the macro definition so that field alignment calculations on 64-bit architectures ignore the -malign-power command-line option. With this fix, the test case identified in the PR behaves as was expected by the submitter. gcc/ChangeLog: 2016-02-17 Kelvin Nilsen PR target/66337 * config/rs6000/freebsd64.h: Remove erroneous comment and correct definition of ADJUST_FIELD_ALIGN macro. * config/rs6000/linux64.h: Remove erroneous comment and correct definition of ADJUST_FIELD_ALIGN macro. gcc/testsuite/ChangeLog: 2016-02-17 Kelvin Nilsen PR target/66337 * g++.dg/pr66337-1.C: New test. Index: gcc/testsuite/g++.dg/pr66337-1.C === --- gcc/testsuite/g++.dg/pr66337-1.C(revision 0) +++ gcc/testsuite/g++.dg/pr66337-1.C(revision 233507) @@ -0,0 +1,77 @@ +/* { dg-do run { target { powerpc*-*-* } } } */ +/* { dg-options "-std=c++11 -malign-power -O2" } */ + +/* Power ABI for 32-bit and 64-bit compilers place the same alignment + restrictions on long longs and doubles. */ + +typedef double _Tp; + +struct _Tp2 { + char b; + int i; + char c; + long long l; + _Tp _M_t; +}; + +extern _Tp2 tp2e; + +struct _ST1 { + char a; +}; + +struct _ST2 { + int b; +}; + +struct _ST3 { + float f; +}; + +struct _ST4 { + double d; +}; + +struct _ST5 { + char a; + double d; +}; + +struct _ST6 { + double d; + char a; +}; + +int main () +{ + int a = alignof (_Tp2); + int b = __alignof__(_Tp2::_M_t); + int c = alignof(_Tp); + int d = __alignof__(tp2e._M_t); + int e = alignof(_Tp2::_M_t); + + int f = __alignof__(_Tp2::l); + int g = alignof (long long); + int h = __alignof__(tp2e.l); + int i = alignof(_Tp2::l); + + if ((a != 8) || (b != 8) || (c != 8) || (d != 8) || (e != 8) + || (f != 8) || (g != 8) || (h != 8) || (i != 8)) +return -1; + + a = sizeof (_ST1); + b = sizeof (_ST2); + c = sizeof (_ST3); + d = sizeof (_ST4); + e = sizeof (_ST5); + f = sizeof (_ST6); + g = sizeof (_Tp2); + + if ((a != 1) || (b != 4) || (c != 4) || (d != 8) + || (e != 16) || (f != 16) || (g != 32)) +return -2; + + /* success */ + return 0; +} + Index: gcc/config/rs6000/freebsd64.h === --- gcc/config/rs6000/freebsd64.h (revision 233308) +++ gcc/config/rs6000/freebsd64.h (working copy) @@ -363,16 +363,10 @@ extern int dot_symbols; /* Use standard DWARF numbering for DWARF debugging information. */ #define RS6000_USE_DWARF_NUMBERING -/* PowerPC64 Linux word-aligns FP doubles when -malign-power is given. */ #undef ADJUST_FIELD_ALIGN #define ADJUST_FIELD_ALIGN(FIELD, COMPUTED) \ (rs6000_special_adjust_field_align_p ((FIELD), (COMPUTED)) \ - ? 128\ - : (TARGET_64BIT \ - && TARGET_ALIGN_NATURAL == 0 \ - && TYPE_MODE (strip_array_types (TREE_TYPE (FIELD))) == DFmode) \ - ? MIN ((COMPUTED), 32) \ - : (COMPUTED)) + ? 128: (COMPUTED)) #undef TOC_SECTION_ASM_OP #define TOC_SECTION_ASM_OP \ Index: gcc/config/rs6000/linux64.h === --- gcc/config/rs6000/linux64.h (revision 233308) +++ gcc/config/rs6000/linux64.h (working copy) @@ -290,16 +290,10 @@ extern int dot_symbols; #define PROFILE_HOOK(LABEL)
Re: [PATCH] Add debug_function_to_file
On Thu, Feb 18, 2016 at 3:29 PM, Tom de Vries wrote: > On 17/02/16 14:42, Richard Biener wrote: >> >> On Wed, Feb 17, 2016 at 1:41 PM, Tom de Vries >> wrote: >>> >>> Hi, >>> >>> once in a while I'm in a gdb debug session debugging cc1, and want to >>> print >>> the current function to file. >>> >>> There's a debug function debug_function that prints a function to stderr, >>> and there are methods to redirect output of a command to a file ( >>> https://sourceware.org/gdb/onlinedocs/gdb/Logging-Output.html ). >>> >>> And there's a function dump_function_to_file that takes a FILE* >>> parameter, >>> which could be combined with open/close calls in gdb. >>> >>> But I think a short-hand is easier. >>> >>> This patch adds a function debug_function_to_file. It can f.i. be called >>> as: >>> ... >>> (gdb) call debug_function_to_file (cfun.decl, "foo.1.txt", 0) >>> ... >>> >>> Hmm, now I wonder if the order 'cfun.decl, 0, "foo.1.txt"' would make >>> more >>> sense (first two parameters the same as in debug_function). >>> >>> OK for stage1 trunk if bootstrap and reg-test succeeds? >> >> >> Bonus for making this a helper in gdbhooks.py instead, using >> fopen/fclose and the existing inferior calls. >> > > Using attached demonstrator patch, we can use command debug-function-to-file > in gdb: > ... > (gdb) python import sys; sys.path.append('/gcc'); import gdbhooks > Successfully loaded GDB hooks for GCC > (gdb) b tail_merge_optimize > Breakpoint 1 at 0x11be1cc: file gcc/tree-ssa-tail-merge.c, line 1633. > (gdb) r > Starting program: cc1 -quiet hello.c -O2 -o hello.s > Breakpoint 1, tail_merge_optimize (todo=0) at gcc/tree-ssa-tail-merge.c:1633 > 1633 int nr_bbs_removed_total = 0; > (gdb) debug-function-to-file "bla.txt" > $1 = 47180688 > $2 = > $3 = 0 > ... > > and we find that bla.txt contains the result of dump_function_to_file: > ... > $ cat bla.txt > main () > { > : > __builtin_puts (&"hello"[0]); > return 0; > > } > ... > > I would be nice if we could avoid the ${1,2,3} printouts and value history > assignments, but I'm not sure how to do that. > > And given that we have a command with a variable amount of arguments, we > should try to handle different number of arguments, say these three giving > the same result: > - debug-function-to-file "bla.txt" > - debug-function-to-file "bla.txt" cfun->decl > - debug-function-to-file "bla.txt" cfun->decl dump_flags > Or perhaps 0 for flags by default, as in pcfun. Thanks for the example. Still trying to figure out how to handle multi vs. one arg case. Looks like the argument is passed as string ... Anyway, below is a dot-fn command which will only require refactoring of print_graph_cfg in graph.c. It still lacks a way to specify dump flags w/o blowing up though ... Richard. class dot_fn(gdb.Command): """ """ def __init__(self): gdb.Command.__init__(self, 'dot-fn', gdb.COMMAND_USER) def invoke(self, arg, from_tty): gdb.execute("call popen(\"dot -Tx11\", \"w\")") fp = gdb.history (0) gdb.execute("call start_graph_dump ((FILE *)%u, \"\")" % fp) gdb.execute("call print_graph_cfg_1 ((FILE *)%u, %s)" % (fp, arg)) gdb.execute("call end_graph_dump ((FILE *)%u)" % fp) gdb.execute("call fflush ((FILE *)%u)" % fp) gdb.execute("call close (fileno ((FILE *)%u))" % fp) gdb.execute("call pclose ((FILE *)%u)" % fp) dot_fn() > Thanks, > - Tom
Re: [PATCH] Fix PR37448 (and dups?)
> > This fixes the IPA inline analysis quadraticness that arises when we > inline functions into the same function many times via > inline_to_all_callers as we do in the testcase for PR37448. In that > case updating the overall summary of the caller is done for each > inlined call instead of just once. > > The solution is to keep track of which nodes we inlined to and > delay the summary update. > > Honza, I believe this is safe as the callers summary should only > matter for further inline decisions and not this one (inlining > into all callers). Double-checking is appreciated - there should > be no code changes by this patch (I verified this on the PR37448 > testcase). I think it is not the case if one function contains multiple calls and eventually hits large function growth. In that case we inline just some callers and not the others. For next stage1 I plan to finish the overahaul to sreals that will let me to make update_overall_summary incremental (i.e. O(size_of_inlined_function) and not O(size_of_inlined_function+size_of_function_inlined_to)) that will also solve these issues. One can probably do a pesimistic estimate on code size of the function (i.e. add the size of inlined function) and only if that hits the large function growth do the exact calculation. Or we can just decide that it is OK to ignore large function growht in this partiuclar case. Honza > > This brings down compile-time of ipa inlining heuristics from > > ipa inlining heuristics : 260.14 (84%) usr 0.50 (23%) sys 260.91 (84%) > wall 102217 kB (10%) ggc > > (that's an optimized non-checking compiler) to > > ipa inlining heuristics : 3.52 ( 3%) usr 0.55 (12%) sys 4.21 ( 3%) > wall 102216 kB (12%) ggc > > (that's an unoptimized, checking enabled compiler, "before" on that > unoptimized compiler is ipa inlining heuristics : 935.06 (89%)). > > We still do a lot of inlining here so we see > > integration : 21.22 (17%) usr 0.45 (10%) sys 21.68 (17%) > wall 142979 kB (17%) ggc > > which is supposedly expected (same unoptimized, checking enabled > compiler). > > Bootstrap & regtest is running on x86_64-unknown-linux-gnu, ok for trunk > and branch(es)? > > Thanks, > Richard. > > 2016-02-18 Richard Biener > > PR ipa/37448 > * ipa-inline.c (inline_to_all_callers_1): Add to the callers > hash-set, do not update overall summaries here. Renamed from ... > (inline_to_all_callers): ... this which is now wrapping the > above and performing delayed overall summary update. > > Index: gcc/ipa-inline.c > === > *** gcc/ipa-inline.c (revision 233498) > --- gcc/ipa-inline.c (working copy) > *** flatten_function (struct cgraph_node *no > *** 2163,2169 > recursion. */ > > static bool > ! inline_to_all_callers (struct cgraph_node *node, void *data) > { > int *num_calls = (int *)data; > bool callee_removed = false; > --- 2163,2170 > recursion. */ > > static bool > ! inline_to_all_callers_1 (struct cgraph_node *node, void *data, > ! hash_set *callers) > { > int *num_calls = (int *)data; > bool callee_removed = false; > *** inline_to_all_callers (struct cgraph_nod > *** 2193,2199 > inline_summaries->get (node->callers->caller)->size); > } > > ! inline_call (node->callers, true, NULL, NULL, true, &callee_removed); > if (dump_file) > fprintf (dump_file, >" Inlined into %s which now has %i size\n", > --- 2194,2203 > inline_summaries->get (node->callers->caller)->size); > } > > ! /* Remember which callers we inlined to, delaying updating the > ! overall summary. */ > ! callers->add (node->callers->caller); > ! inline_call (node->callers, true, NULL, NULL, false, &callee_removed); > if (dump_file) > fprintf (dump_file, >" Inlined into %s which now has %i size\n", > *** inline_to_all_callers (struct cgraph_nod > *** 2211,2216 > --- 2215,2237 > return false; > } > > + /* Wrapper around inline_to_all_callers_1 doing delayed overall summary > +update. */ > + > + static bool > + inline_to_all_callers (struct cgraph_node *node, void *data) > + { > + hash_set callers; > + bool res = inline_to_all_callers_1 (node, data, &callers); > + /* Perform the delayed update of the overall summary of all callers > + processed. This avoids quadratic behavior in the cases where > + we have a lot of calls to the same function. */ > + for (hash_set::iterator i = callers.begin (); > +i != callers.end (); ++i) > + inline_update_overall_summary (*i); > + return res; > + } > + > /* Output overall time estimate. */ > static void > dump_overall_stats (void)
Re: [PATCH] Add debug_function_to_file
On 18/02/16 16:10, Richard Biener wrote: On Thu, Feb 18, 2016 at 3:29 PM, Tom de Vries wrote: >On 17/02/16 14:42, Richard Biener wrote: >> >>On Wed, Feb 17, 2016 at 1:41 PM, Tom de Vries >>wrote: >>> >>>Hi, >>> >>>once in a while I'm in a gdb debug session debugging cc1, and want to >>>print >>>the current function to file. >>> >>>There's a debug function debug_function that prints a function to stderr, >>>and there are methods to redirect output of a command to a file ( >>>https://sourceware.org/gdb/onlinedocs/gdb/Logging-Output.html ). >>> >>>And there's a function dump_function_to_file that takes a FILE* >>>parameter, >>>which could be combined with open/close calls in gdb. >>> >>>But I think a short-hand is easier. >>> >>>This patch adds a function debug_function_to_file. It can f.i. be called >>>as: >>>... >>>(gdb) call debug_function_to_file (cfun.decl, "foo.1.txt", 0) >>>... >>> >>>Hmm, now I wonder if the order 'cfun.decl, 0, "foo.1.txt"' would make >>>more >>>sense (first two parameters the same as in debug_function). >>> >>>OK for stage1 trunk if bootstrap and reg-test succeeds? >> >> >>Bonus for making this a helper in gdbhooks.py instead, using >>fopen/fclose and the existing inferior calls. >> > >Using attached demonstrator patch, we can use command debug-function-to-file >in gdb: >... >(gdb) python import sys; sys.path.append('/gcc'); import gdbhooks >Successfully loaded GDB hooks for GCC >(gdb) b tail_merge_optimize >Breakpoint 1 at 0x11be1cc: file gcc/tree-ssa-tail-merge.c, line 1633. >(gdb) r >Starting program: cc1 -quiet hello.c -O2 -o hello.s >Breakpoint 1, tail_merge_optimize (todo=0) at gcc/tree-ssa-tail-merge.c:1633 >1633 int nr_bbs_removed_total = 0; >(gdb) debug-function-to-file "bla.txt" >$1 = 47180688 >$2 = >$3 = 0 >... > >and we find that bla.txt contains the result of dump_function_to_file: >... >$ cat bla.txt >main () >{ > : > __builtin_puts (&"hello"[0]); > return 0; > >} >... > >I would be nice if we could avoid the ${1,2,3} printouts and value history >assignments, but I'm not sure how to do that. > Using gdb.parse_and_eval does the trick. >And given that we have a command with a variable amount of arguments, we >should try to handle different number of arguments, say these three giving >the same result: >- debug-function-to-file "bla.txt" >- debug-function-to-file "bla.txt" cfun->decl >- debug-function-to-file "bla.txt" cfun->decl dump_flags >Or perhaps 0 for flags by default, as in pcfun. Thanks for the example. Still trying to figure out how to handle multi vs. one arg case. Looks like the argument is passed as string ... https://sourceware.org/gdb/current/onlinedocs/gdb/Commands-In-Python.html#Commands-In-Python : ... — Function: Command.invoke (argument, from_tty) ... To break argument up into an argv-like string use gdb.string_to_argv. This function behaves identically to gdb's internal argument lexer buildargv. It is recommended to use this for consistency. Arguments are separated by spaces and may be quoted. Example: print gdb.string_to_argv ("1 2\ \\\"3 '4 \"5' \"6 '7\"") ['1', '2 "3', '4 "5', "6 '7"] ... Thanks, - Tom
Re: [PATCH] Fix PR37448 (and dups?)
On Thu, 18 Feb 2016, Jan Hubicka wrote: > > > > This fixes the IPA inline analysis quadraticness that arises when we > > inline functions into the same function many times via > > inline_to_all_callers as we do in the testcase for PR37448. In that > > case updating the overall summary of the caller is done for each > > inlined call instead of just once. > > > > The solution is to keep track of which nodes we inlined to and > > delay the summary update. > > > > Honza, I believe this is safe as the callers summary should only > > matter for further inline decisions and not this one (inlining > > into all callers). Double-checking is appreciated - there should > > be no code changes by this patch (I verified this on the PR37448 > > testcase). > > I think it is not the case if one function contains multiple calls > and eventually hits large function growth. In that case we inline > just some callers and not the others. Are you sure? I thought we compute that up-front ... at least we make sure we can_inline_edge_p all calls before even trying to inline all calls - that looks somewhat redundant then if it can happen that we give up anyway. Ah, so can_inline_edge_p has /* Check if caller growth allows the inlining. */ else if (!DECL_DISREGARD_INLINE_LIMITS (callee->decl) && !disregard_limits && !lookup_attribute ("flatten", DECL_ATTRIBUTES (caller->decl)) && !caller_growth_limits (e)) inlinable = false; and we check that from when we do the inlining and upfront from want_inline_function_to_all_callers_p ... we also give up if we inline a recursive function it seems. > For next stage1 I plan to finish the overahaul to sreals that will let > me to make update_overall_summary incremental (i.e. > O(size_of_inlined_function) > and not O(size_of_inlined_function+size_of_function_inlined_to)) that will > also solve these issues. Hopefully. > One can probably do a pesimistic estimate on code size of the function > (i.e. add the size of inlined function) and only if that hits the large > function growth do the exact calculation. Or we can just decide that it > is OK to ignore large function growht in this partiuclar case. Ignoring it is probably not a good idea and will just lead to a different degenerate case. As we update the summary afterwards it's probably ok to do incremental (pessimistic) updates on the size anyway? In inline_merge_summary I mean? Or should I open-code that into inline_call if ! update_overall_summary? Richard. > Honza > > > > This brings down compile-time of ipa inlining heuristics from > > > > ipa inlining heuristics : 260.14 (84%) usr 0.50 (23%) sys 260.91 (84%) > > wall 102217 kB (10%) ggc > > > > (that's an optimized non-checking compiler) to > > > > ipa inlining heuristics : 3.52 ( 3%) usr 0.55 (12%) sys 4.21 ( 3%) > > wall 102216 kB (12%) ggc > > > > (that's an unoptimized, checking enabled compiler, "before" on that > > unoptimized compiler is ipa inlining heuristics : 935.06 (89%)). > > > > We still do a lot of inlining here so we see > > > > integration : 21.22 (17%) usr 0.45 (10%) sys 21.68 (17%) > > wall 142979 kB (17%) ggc > > > > which is supposedly expected (same unoptimized, checking enabled > > compiler). > > > > Bootstrap & regtest is running on x86_64-unknown-linux-gnu, ok for trunk > > and branch(es)? > > > > Thanks, > > Richard. > > > > 2016-02-18 Richard Biener > > > > PR ipa/37448 > > * ipa-inline.c (inline_to_all_callers_1): Add to the callers > > hash-set, do not update overall summaries here. Renamed from ... > > (inline_to_all_callers): ... this which is now wrapping the > > above and performing delayed overall summary update. > > > > Index: gcc/ipa-inline.c > > === > > *** gcc/ipa-inline.c(revision 233498) > > --- gcc/ipa-inline.c(working copy) > > *** flatten_function (struct cgraph_node *no > > *** 2163,2169 > > recursion. */ > > > > static bool > > ! inline_to_all_callers (struct cgraph_node *node, void *data) > > { > > int *num_calls = (int *)data; > > bool callee_removed = false; > > --- 2163,2170 > > recursion. */ > > > > static bool > > ! inline_to_all_callers_1 (struct cgraph_node *node, void *data, > > !hash_set *callers) > > { > > int *num_calls = (int *)data; > > bool callee_removed = false; > > *** inline_to_all_callers (struct cgraph_nod > > *** 2193,2199 > >inline_summaries->get (node->callers->caller)->size); > > } > > > > ! inline_call (node->callers, true, NULL, NULL, true, > > &callee_removed); > > if (dump_file) > > fprintf (dump_file, > > " Inlined into %s which now has %i size\n", > > --- 2194,2203 > >inline_summaries->get (node-
Re: [PATCH] Add debug_function_to_file
On Thu, Feb 18, 2016 at 4:18 PM, Tom de Vries wrote: > On 18/02/16 16:10, Richard Biener wrote: >> >> On Thu, Feb 18, 2016 at 3:29 PM, Tom de Vries >> wrote: >>> >>> >On 17/02/16 14:42, Richard Biener wrote: >> >>On Wed, Feb 17, 2016 at 1:41 PM, Tom de Vries >>wrote: > > >>> > >>>Hi, > >>> > >>>once in a while I'm in a gdb debug session debugging cc1, and want > >>> to > >>>print > >>>the current function to file. > >>> > >>>There's a debug function debug_function that prints a function to > >>> stderr, > >>>and there are methods to redirect output of a command to a file ( > >>>https://sourceware.org/gdb/onlinedocs/gdb/Logging-Output.html ). > >>> > >>>And there's a function dump_function_to_file that takes a FILE* > >>>parameter, > >>>which could be combined with open/close calls in gdb. > >>> > >>>But I think a short-hand is easier. > >>> > >>>This patch adds a function debug_function_to_file. It can f.i. be > >>> called > >>>as: > >>>... > >>>(gdb) call debug_function_to_file (cfun.decl, "foo.1.txt", 0) > >>>... > >>> > >>>Hmm, now I wonder if the order 'cfun.decl, 0, "foo.1.txt"' would > >>> make > >>>more > >>>sense (first two parameters the same as in debug_function). > >>> > >>>OK for stage1 trunk if bootstrap and reg-test succeeds? >> >> >>Bonus for making this a helper in gdbhooks.py instead, using >>fopen/fclose and the existing inferior calls. >> >>> >>> > >>> >Using attached demonstrator patch, we can use command >>> > debug-function-to-file >>> >in gdb: >>> >... >>> >(gdb) python import sys; sys.path.append('/gcc'); import gdbhooks >>> >Successfully loaded GDB hooks for GCC >>> >(gdb) b tail_merge_optimize >>> >Breakpoint 1 at 0x11be1cc: file gcc/tree-ssa-tail-merge.c, line 1633. >>> >(gdb) r >>> >Starting program: cc1 -quiet hello.c -O2 -o hello.s >>> >Breakpoint 1, tail_merge_optimize (todo=0) at >>> > gcc/tree-ssa-tail-merge.c:1633 >>> >1633 int nr_bbs_removed_total = 0; >>> >(gdb) debug-function-to-file "bla.txt" >>> >$1 = 47180688 >>> >$2 = >>> >$3 = 0 >>> >... >>> > >>> >and we find that bla.txt contains the result of dump_function_to_file: >>> >... >>> >$ cat bla.txt >>> >main () >>> >{ >>> > : >>> > __builtin_puts (&"hello"[0]); >>> > return 0; >>> > >>> >} >>> >... >>> > >>> >I would be nice if we could avoid the ${1,2,3} printouts and value >>> > history >>> >assignments, but I'm not sure how to do that. >>> > > > > Using gdb.parse_and_eval does the trick. > >>> >And given that we have a command with a variable amount of arguments, we >>> >should try to handle different number of arguments, say these three >>> > giving >>> >the same result: >>> >- debug-function-to-file "bla.txt" >>> >- debug-function-to-file "bla.txt" cfun->decl >>> >- debug-function-to-file "bla.txt" cfun->decl dump_flags >>> >Or perhaps 0 for flags by default, as in pcfun. >> >> Thanks for the example. Still trying to figure out how to handle >> multi vs. one arg case. >> Looks like the argument is passed as string ... >> > > https://sourceware.org/gdb/current/onlinedocs/gdb/Commands-In-Python.html#Commands-In-Python > : > ... > — Function: Command.invoke (argument, from_tty) > > ... > > To break argument up into an argv-like string use gdb.string_to_argv. > This function behaves identically to gdb's internal argument lexer > buildargv. It is recommended to use this for consistency. Arguments are > separated by spaces and may be quoted. Example: > > print gdb.string_to_argv ("1 2\ \\\"3 '4 \"5' \"6 '7\"") > ['1', '2 "3', '4 "5', "6 '7"] > ... Thanks. Attached is what I have for now, it works if you call it like (gdb) dot-fn cfun (gdb) dot-fn cfun, 1<<6 w/o that arg parsing ;) I'll play with it some more tomorrow. Richard. 2016-02-18 Richard Biener * graph.c: Include dumpfile.h. (print_graph_cfg): Split into three overloads. * gdbhooks.py (dot-fn): New command. > Thanks, > - Tom p Description: Binary data
Re: [PATCH] Add debug_function_to_file
On 18/02/16 16:27, Richard Biener wrote: I would be nice if we could avoid the ${1,2,3} printouts and value >>> >history >>> >assignments, but I'm not sure how to do that. >>> > Using gdb.parse_and_eval does the trick. This updated version uses gdb.parse_and_eval, and adds error handling. Thanks, - Tom Add debug-function-to-file to gdbhooks.py --- gcc/gdbhooks.py | 32 1 file changed, 32 insertions(+) diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py index 0d5cc97..c19e823 100644 --- a/gcc/gdbhooks.py +++ b/gcc/gdbhooks.py @@ -589,4 +589,36 @@ class BreakOnPass(gdb.Command): BreakOnPass() +class debug_function_to_file(gdb.Command): +def __init__(self): +gdb.Command.__init__(self, 'debug-function-to-file', gdb.COMMAND_USER) + +def invoke(self, arg, from_tty): +# Find function +fn = gdb.parse_and_eval("cfun ? cfun->decl : current_function_decl") +if fn == 0: +print ("Could not find current function") +return + +# Open file +file = arg +fp = gdb.parse_and_eval("fopen (%s, \"w\")" % file) +if fp == 0: +print ("Could not open file: %s" % file) +return + +# Set flags +flags = 0 + +# Dump function to file +dumpargs = "(tree)%u, (FILE *)%u, %u" % (fn, fp, flags) +_ = gdb.parse_and_eval("dump_function_to_file (%s)" % dumpargs) + +# Close file +ret = gdb.parse_and_eval("fclose ((FILE *)%u)" % fp) +if ret != 0: +print ("Could not close file: %s" % file) + +debug_function_to_file() + print('Successfully loaded GDB hooks for GCC')
[PING] Re: [PATCH] [wwwdocs] Add a "Plugin issues" section to the GCC 6 porting guide
Ping: https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00765.html On Thu, 2016-02-11 at 10:12 -0500, David Malcolm wrote: > I've (mostly) ported gcc-python-plugin to gcc 6. The attached patch > for the gcc website starts a new "Plugin issues" section, and covers > the biggest issue I ran into (FWIW the suggested compatibility > typedef > is the one I committed to gcc-python-plugin). > > Validates. > > OK to commit?
Re: [PATCH] Add debug_function_to_file
On 18/02/16 16:27, Richard Biener wrote: Attached is what I have for now, it works if you call it like (gdb) dot-fn cfun (gdb) dot-fn cfun, 1<<6 w/o that arg parsing ;) I'll play with it some more tomorrow. Richard. 2016-02-18 Richard Biener + /* Overload with additional flag argument. */ + + void DEBUG_FUNCTION + print_graph_cfg (FILE *fp, struct function *fun, int flags) + { + int saved_dump_flags = dump_flags; + dump_flags = flags; + print_graph_cfg (fp, fun); + dump_flags = saved_dump_flags; + } + + FTR, this is what I wrote for adding the flags argument to dump_bb_for_graph. I ran out of steam at rtl_dump_bb_for_graph, and resorted to the same save-restore dump_flags trick, although it felt a bit hacky to me. Thanks, - Tom Use flags parameter to debug_function_graph_to_file 2016-02-17 Tom de Vries * cfghooks.c (dump_bb_for_graph): Add and handle flags parameter. * cfghooks.h (struct cfg_hooks): Add int parameter in dump_bb_for_graph declaration. * gimple-pretty-print.c (gimple_dump_bb_for_graph): Add and handle flags parameter. * gimple-pretty-print.h (gimple_dump_bb_for_graph): Add int parameter. * graph.c (draw_cfg_node, draw_cfg_nodes_no_loops) (draw_cfg_nodes_for_loop, draw_cfg_nodes, print_graph_cfg_fp) (print_graph_cfg, debug_function_graph_to_file): Add and handle flags parameter. * graph.h (debug_function_graph_to_file): Add int parameter. * print-rtl.c (rtl_dump_bb_for_graph): Add and handle flags parameter. * print-rtl.h (rtl_dump_bb_for_graph): Add int parameter. --- gcc/cfghooks.c| 4 ++-- gcc/cfghooks.h| 4 ++-- gcc/gimple-pretty-print.c | 8 gcc/gimple-pretty-print.h | 2 +- gcc/graph.c | 33 + gcc/graph.h | 2 +- gcc/print-rtl.c | 11 --- gcc/print-rtl.h | 2 +- 8 files changed, 36 insertions(+), 30 deletions(-) diff --git a/gcc/cfghooks.c b/gcc/cfghooks.c index bbb1017..9993de1 100644 --- a/gcc/cfghooks.c +++ b/gcc/cfghooks.c @@ -304,7 +304,7 @@ debug (basic_block_def *ptr) escaped, using pp_write_text_as_dot_label_to_stream(). */ void -dump_bb_for_graph (pretty_printer *pp, basic_block bb) +dump_bb_for_graph (pretty_printer *pp, basic_block bb, int flags) { if (!cfg_hooks->dump_bb_for_graph) internal_error ("%s does not support dump_bb_for_graph", @@ -314,7 +314,7 @@ dump_bb_for_graph (pretty_printer *pp, basic_block bb) pp_printf (pp, " FREQ:%i |", bb->frequency); pp_write_text_to_stream (pp); if (!(dump_flags & TDF_SLIM)) -cfg_hooks->dump_bb_for_graph (pp, bb); +cfg_hooks->dump_bb_for_graph (pp, bb, flags); } /* Dump the complete CFG to FILE. FLAGS are the TDF_* flags in dumpfile.h. */ diff --git a/gcc/cfghooks.h b/gcc/cfghooks.h index b7912a1..3eaee1b 100644 --- a/gcc/cfghooks.h +++ b/gcc/cfghooks.h @@ -63,7 +63,7 @@ struct cfg_hooks /* Debugging. */ int (*verify_flow_info) (void); void (*dump_bb) (FILE *, basic_block, int, int); - void (*dump_bb_for_graph) (pretty_printer *, basic_block); + void (*dump_bb_for_graph) (pretty_printer *, basic_block, int); /* Basic CFG manipulation. */ @@ -199,7 +199,7 @@ checking_verify_flow_info (void) } extern void dump_bb (FILE *, basic_block, int, int); -extern void dump_bb_for_graph (pretty_printer *, basic_block); +extern void dump_bb_for_graph (pretty_printer *, basic_block, int); extern void dump_flow_info (FILE *, int); extern edge redirect_edge_and_branch (edge, basic_block); diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c index e27214f..b0de032 100644 --- a/gcc/gimple-pretty-print.c +++ b/gcc/gimple-pretty-print.c @@ -2577,13 +2577,13 @@ gimple_dump_bb (FILE *file, basic_block bb, int indent, int flags) dump_gimple_bb_footer (file, bb, indent, flags); } -/* Dumps basic block BB to pretty-printer PP with default dump flags and +/* Dumps basic block BB to pretty-printer PP with dump flags FLAGS and no indentation, for use as a label of a DOT graph record-node. ??? Should just use gimple_dump_bb_buff here, except that value profiling histogram dumping doesn't know about pretty-printers. */ void -gimple_dump_bb_for_graph (pretty_printer *pp, basic_block bb) +gimple_dump_bb_for_graph (pretty_printer *pp, basic_block bb, int flags) { pp_printf (pp, ":\n", bb->index); pp_write_text_as_dot_label_to_stream (pp, /*for_record=*/true); @@ -2598,7 +2598,7 @@ gimple_dump_bb_for_graph (pretty_printer *pp, basic_block bb) pp_bar (pp); pp_write_text_to_stream (pp); pp_string (pp, "# "); - pp_gimple_stmt_1 (pp, phi, 0, dump_flags); + pp_gimple_stmt_1 (pp, phi, 0, flags); pp_newline (pp); pp_write_text_as_dot_label_to_stream (pp, /*for_record=*/true); } @@ -2610,7 +2610,7 @@ gimple_dump_bb_for_graph (pretty_printer *pp, basic_block bb) gimple *stmt = gsi_stmt (gsi); pp_bar (pp); pp_write_text_to_stream (pp); - pp_gimp
Re: [PATCH], PR 68404 patch #4 (fix earlyclobber problem on power8 fusion)
This patch to rs6000.md (which is essentially the same as #3) fixes the problem by removing the early clobber. The patches to predicates.md, and the fusion tests revert my changes on February 9th that originally 'solved' the problem by not allowing fusion of ADDI values. We have tested the fix using a combine profiled and LTO bootstrap build and it does not cause any regressions. Because machine independent changes can mask the bug at times, we also did a profiled/LTO build on the subversion revision that showed up the problem. Is this ok to install in the trunk? Since gcc 5 contains the exact same early clobber, I would like to also back port the change to GCC 5. [gcc] 2016-02-18 Michael Meissner PR target/68404 * config/rs6000/predicates.md (fusion_gpr_addis): Revert 2016-02-09 change. * config/rs6000/rs6000.md (fusion_gpr_load_): Remove earlyclobber from target. Use wF constraint for fused memory address. (fusion_gpr___load): Likewise. [gcc/testsuites] 2016-02-18 Michael Meissner PR target/68404 * gcc.target/powerpc/fusion.c: Revert the 2016-02-09 change. * gcc.target/powerpc/fusion3.c: Likewise. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797 Index: gcc/config/rs6000/predicates.md === --- gcc/config/rs6000/predicates.md (revision 233351) +++ gcc/config/rs6000/predicates.md (working copy) @@ -1708,14 +1708,23 @@ (define_predicate "fusion_gpr_addis" (match_code "const_int,high,plus") { HOST_WIDE_INT value; + rtx int_const; if (GET_CODE (op) == HIGH) return 1; - if (!CONST_INT_P (op)) + if (CONST_INT_P (op)) +int_const = op; + + else if (GET_CODE (op) == PLUS + && base_reg_operand (XEXP (op, 0), Pmode) + && CONST_INT_P (XEXP (op, 1))) +int_const = XEXP (op, 1); + + else return 0; - value = INTVAL (op); + value = INTVAL (int_const); if ((value & (HOST_WIDE_INT)0x) != 0) return 0; Index: gcc/config/rs6000/rs6000.md === --- gcc/config/rs6000/rs6000.md (revision 233351) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -12915,8 +12915,8 @@ (define_peephole2 ;; reload) (define_insn "fusion_gpr_load_" - [(set (match_operand:INT1 0 "base_reg_operand" "=&b") - (unspec:INT1 [(match_operand:INT1 1 "fusion_addis_mem_combo_load" "")] + [(set (match_operand:INT1 0 "base_reg_operand" "=b") + (unspec:INT1 [(match_operand:INT1 1 "fusion_addis_mem_combo_load" "wF")] UNSPEC_FUSION_GPR))] "TARGET_P8_FUSION" { @@ -12987,7 +12987,7 @@ (define_insn "fusion_gpr__
Re: [PATCH] Do not emit red stack zones for a fn with no_sanitize_address (PR sanitizer/69863)
On 02/18/2016 01:59 PM, Jakub Jelinek wrote: > Ok for trunk with those changes. > > Jakub Thank you for review, installed as r233524. Martin
[C++ PATCH] Fix handling of C++11 attributes (PR c++/67767)
Hi! While this likely isn't a regression, it looks severe enough that IMHO we should fix this even in stage4 - apparently cp_parser_std_attribute_spec_seq can drop some of the attributes on the floor, by blindly assuming that there is just a single attribute inside of [[ ]], but there could be many, and the list is already in the right order (chained and nreversed at the end). So, we need to chainon the sequences, but either that would be quadratic, or we'd need to nreverse each chain again, then chainon and then nreverse again. So IMHO remembering instead also the last chain element and chaining that way is cleanest and cheapest. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-02-18 Jakub Jelinek PR c++/67767 * parser.c (cp_parser_std_attribute_spec_seq): Don't assume attr_spec is always single element chain, chain all the attributes properly together in the right order. * g++.dg/cpp0x/pr67767.C: New test. --- gcc/cp/parser.c.jj 2016-02-17 23:26:00.0 +0100 +++ gcc/cp/parser.c 2016-02-18 09:48:52.896766040 +0100 @@ -24104,7 +24104,8 @@ cp_parser_std_attribute_spec (cp_parser static tree cp_parser_std_attribute_spec_seq (cp_parser *parser) { - tree attr_specs = NULL; + tree attr_specs = NULL_TREE; + tree attr_last = NULL_TREE; while (true) { @@ -24114,11 +24115,13 @@ cp_parser_std_attribute_spec_seq (cp_par if (attr_spec == error_mark_node) return error_mark_node; - TREE_CHAIN (attr_spec) = attr_specs; - attr_specs = attr_spec; + if (attr_last) + TREE_CHAIN (attr_last) = attr_spec; + else + attr_specs = attr_last = attr_spec; + attr_last = tree_last (attr_last); } - attr_specs = nreverse (attr_specs); return attr_specs; } --- gcc/testsuite/g++.dg/cpp0x/pr67767.C.jj 2016-02-18 09:53:12.808187281 +0100 +++ gcc/testsuite/g++.dg/cpp0x/pr67767.C2016-02-18 09:53:39.311822349 +0100 @@ -0,0 +1,10 @@ +// PR c++/67767 +// { dg-do compile { target c++11 } } +// { dg-options "-Wsuggest-attribute=noreturn" } + +void foo [[gnu::cold, gnu::noreturn]] (); + +void foo ()// { dg-bogus "function might be candidate for attribute" } +{ + throw 1; +} Jakub
[C++ PATCH] Some further -Wnonnull-compare fixes (PR c++/69850)
Hi! As the testcases show, delete[] this; can also introduce artificial comparison that -Wnonnull-compare incorrectly warns on, and the folding can also drop TREE_NO_WARNING if optimizing the operands of the comparison. There is another case in rtti, but I'll need to still test a fix for that, so will post it incrementally. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-02-18 Jakub Jelinek PR c++/69850 * init.c (build_vec_delete_1): Set TREE_NO_WARNING on the NE_EXPR condition. * cp-gimplify.c (cp_fold): Propagate TREE_NO_WARNING from binary operators if folding preserved the binop, just with different arguments. * g++.dg/warn/Wnonnull-compare-2.C: New test. * g++.dg/warn/Wnonnull-compare-3.C: New test. --- gcc/cp/init.c.jj2016-02-17 23:26:29.0 +0100 +++ gcc/cp/init.c 2016-02-18 16:08:35.793958624 +0100 @@ -3678,12 +3678,15 @@ build_vec_delete_1 (tree base, tree maxi body = integer_zero_node; /* Outermost wrapper: If pointer is null, punt. */ + tree cond += fold_build2_loc (input_location, NE_EXPR, boolean_type_node, base, + fold_convert (TREE_TYPE (base), nullptr_node)); + /* This is a compiler generated comparison, don't emit + e.g. -Wnonnull-compare warning for it. */ + if (TREE_CODE (cond) == NE_EXPR) +TREE_NO_WARNING (cond) = 1; body = fold_build3_loc (input_location, COND_EXPR, void_type_node, - fold_build2_loc (input_location, - NE_EXPR, boolean_type_node, base, - fold_convert (TREE_TYPE (base), -nullptr_node)), - body, integer_zero_node); + cond, body, integer_zero_node); body = build1 (NOP_EXPR, void_type_node, body); if (controller) --- gcc/cp/cp-gimplify.c.jj 2016-02-08 18:39:18.0 +0100 +++ gcc/cp/cp-gimplify.c2016-02-18 11:20:08.491365580 +0100 @@ -2068,6 +2068,9 @@ cp_fold (tree x) else x = fold (x); + if (TREE_NO_WARNING (org_x) + && TREE_CODE (x) == TREE_CODE (org_x)) + TREE_NO_WARNING (x) = 1; break; case VEC_COND_EXPR: --- gcc/testsuite/g++.dg/warn/Wnonnull-compare-2.C.jj 2016-02-18 14:07:34.575846172 +0100 +++ gcc/testsuite/g++.dg/warn/Wnonnull-compare-2.C 2016-02-18 14:07:27.526942306 +0100 @@ -0,0 +1,27 @@ +// PR c++/69850 +// { dg-do compile } +// { dg-options "-Wnonnull-compare" } + +struct D { + virtual ~D (); + void foo () const { delete this; } // { dg-bogus "nonnull argument" } + template friend struct A; +}; +template struct A { + static void bar (T *x) { x->foo (); } +}; +template struct B { + T b; + void baz () { A::bar (&b); } +}; +class C { + class E : public D { ~E (); }; + void baz (); + B c; +}; + +void +C::baz () +{ + c.baz (); +} --- gcc/testsuite/g++.dg/warn/Wnonnull-compare-3.C.jj 2016-02-18 16:14:32.846084103 +0100 +++ gcc/testsuite/g++.dg/warn/Wnonnull-compare-3.C 2016-02-18 16:14:43.679936198 +0100 @@ -0,0 +1,28 @@ +// PR c++/69850 +// { dg-do compile } +// { dg-options "-Wnonnull-compare" } + +template +struct A { + static void foo (T *x) { x->bar (); } +}; +template +struct B { + T b; + void operator= (B) { A::foo (&b); } +}; +struct C { + void bar () { delete[] this; } // { dg-bogus "nonnull argument" } +}; +struct D { B d; }; +struct G { + D g[6]; + void baz (); +}; +int a; + +void +G::baz () +{ + g[a] = g[1]; +} Jakub
Re: [PATCH] Add debug_function_to_file
On 18/02/16 16:43, Tom de Vries wrote: On 18/02/16 16:27, Richard Biener wrote: I would be nice if we could avoid the ${1,2,3} printouts and value >>> >history >>> >assignments, but I'm not sure how to do that. >>> > Using gdb.parse_and_eval does the trick. This updated version uses gdb.parse_and_eval, and adds error handling. And this updated version adds handling different number of arguments, and a help text. I think this could be ready for committing. Is a bootstrap/regtest useful/necessary? OK for stage4/stage1? Thanks, - Tom 2016-02-18 Tom de Vries * gdbhooks.py (class debug_function_to_file): Add and instantiate. Add debug-function-to-file to gdbhooks.py --- gcc/gdbhooks.py | 64 + 1 file changed, 64 insertions(+) diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py index 0d5cc97..cffac2a 100644 --- a/gcc/gdbhooks.py +++ b/gcc/gdbhooks.py @@ -589,4 +589,68 @@ class BreakOnPass(gdb.Command): BreakOnPass() +class debug_function_to_file(gdb.Command): +""" +A custom command to dump a gimple/rtl function to file. By default, it +dumps the current function using 0 as dump_flags, but the function and flags +can also be specified. + +Examples of use: + (gdb) debug-function-to-file foo.1 + (gdb) debug-function-to-file foo.1 cfun->decl + (gdb) debug-function-to-file foo.1 cfun->decl 0 + (gdb) debug-function-to-file foo.1 cfun->decl dump_flags +""" + +def __init__(self): +gdb.Command.__init__(self, 'debug-function-to-file', gdb.COMMAND_USER) + +def invoke(self, arg, from_tty): +# Parse args, check number of args +args = gdb.string_to_argv(arg) +if len(args) > 3: +print ("Too many arguments") +return +if len(args) == 0: +print ("Too little arguments") +return + +# Set filename +filename = args[0] + +# Set func +if len(args) >= 2: +funcname = args[1] +printfuncname = "function %s" % funcname +else: +funcname = "cfun ? cfun->decl : current_function_decl" +printfuncname = "current function" +func = gdb.parse_and_eval(funcname) +if func == 0: +print ("Could not find %s" % printfuncname) +return + +# Set flags +if len(args) >= 3: +flags = gdb.parse_and_eval(args[2]) +else: +flags = 0 + +# Open file +fp = gdb.parse_and_eval("fopen (\"%s\", \"w\")" % filename) +if fp == 0: +print ("Could not open file: %s" % filename) +return + +# Dump function to file +dumpargs = "(tree)%u, (FILE *)%u, %u" % (func, fp, flags) +_ = gdb.parse_and_eval("dump_function_to_file (%s)" % dumpargs) + +# Close file +ret = gdb.parse_and_eval("fclose ((FILE *)%u)" % fp) +if ret != 0: +print ("Could not close file: %s" % filename) + +debug_function_to_file() + print('Successfully loaded GDB hooks for GCC')
Re: [PATCH, rs6000] PR 66337: Improve Compliance with Power ABI
Kevin Nilsen wrote: > This patch has bootstrapped and tested on powerpc64le-unknown-linux-gnu and= > powerpc64be-unknown-linux-gnu (both 32-bit and 64-bit) and=20 > powerpc64-unknown-freebsd11.0 (big endian) with no regressions. Is it ok t= > o fix this on the trunk? > > The problem described in PR66337 (https://gcc.gnu.org/bugzilla/show_bug.cgi= > ?id=3D3D66337) is that compiling for PowerPC targets with the -malign-power= > command-line option results in invalid field offsets for certain structure= > members. As identified in the problem report, this results from a macro = > definition present in both config/rs6000/{freebsd64,linux64}.h, which in bo= > th cases is introduced by the comment: > > /* PowerPC64 Linux word-aligns FP doubles when -malign-power is given. */ > > I have consulted various ABI documents, including "64-bit PowerPC ELF Appli= > cation Binary Interface Supplement 1.9" (http://refspecs.linuxfoundation.or= > g/ELF/ppc64/PPC-elf64abi.html), "Power Architecture 64-Bit ELF V2 ABI Speci= > fication" (https://members.openpowerfoundation.org/document/dl/576), and "P= > ower > Architecture(R) 32-bit Application Binary Interface Supplement 1.0 - Linux(= > R) & Embedded" (https://www.power.org/documentation/power-architecture-32-b= > it-abi-supplement-1-0-embeddedlinuxunified/). I have not been able to find= > any basis for this comment and thus am concluding that the comment and exi= > sting implementation are incorrect. > > The implemented patch removes the comment and changes the macro definition = > so that field alignment calculations on 64-bit architectures ignore the -ma= > lign-power command-line option. With this fix, the test case identified in= > the PR behaves as was expected by the submitter. There seems to be some confusion here. First of all, on Linux and FreeBSD, the *default* behavior is -malign-natural, which matches what the Linux ABI specifies. Using -malign-power on *Linux* is an explicit instruction to the compiler to *deviate* from the documented ABI. The only effect that the deviation has on Linux is to change the alignment requirements for certain structure elements. Your patch removes this change, making -malign-power fully a no-op on Linux and FreeBSD. This doesn't seem to be particularly useful ... If you don't want the effect, you can simply not use that switch. To my understanding, the intent of providing that switch was to allow creating code that is compatible code produced by some other compilers that do not adhere to the Linux ABI, but some other ABI. In particular, my understanding is that the *AIX* ABI has these alignment requirements. And in fact, GCC on *AIX* defaults to -malign-power. Looking at PR 66337, the submitter actually refers to the behaviour of GCC on AIX, so I'm not sure how Linux is even relevant here. (Maybe there is something wrong in how GCC implements the AIX ABI. But I'm not really familar with AIX, so I can't help much with that.) Bye, Ulrich -- Dr. Ulrich Weigand GNU/Linux compilers and toolchain ulrich.weig...@de.ibm.com
Merge from GCC trunk to gccgo branch
I merged GCC trunk revision 233515 to the gccgo branch. Ian
Re: [PATCH], PR 68404 patch #4 (fix earlyclobber problem on power8 fusion)
On Thu, Feb 18, 2016 at 11:45 AM, Michael Meissner wrote: > This patch to rs6000.md (which is essentially the same as #3) fixes the > problem > by removing the early clobber. The patches to predicates.md, and the fusion > tests revert my changes on February 9th that originally 'solved' the problem > by > not allowing fusion of ADDI values. We have tested the fix using a combine > profiled and LTO bootstrap build and it does not cause any regressions. > Because machine independent changes can mask the bug at times, we also did a > profiled/LTO build on the subversion revision that showed up the problem. Is > this ok to install in the trunk? > > Since gcc 5 contains the exact same early clobber, I would like to also back > port the change to GCC 5. > > [gcc] > 2016-02-18 Michael Meissner > > PR target/68404 > * config/rs6000/predicates.md (fusion_gpr_addis): Revert > 2016-02-09 change. > > * config/rs6000/rs6000.md (fusion_gpr_load_): Remove > earlyclobber from target. Use wF constraint for fused memory > address. > (fusion_gpr___load): Likewise. > > [gcc/testsuites] > 2016-02-18 Michael Meissner > > PR target/68404 > * gcc.target/powerpc/fusion.c: Revert the 2016-02-09 change. > * gcc.target/powerpc/fusion3.c: Likewise. This is okay for trunk and GCC 5 branch. Thanks, David
Re: [PATCH] Fix PR c++/68948 (wrong code generation due to invalid constructor call)
On Wed, Feb 17, 2016 at 10:51 PM, Jason Merrill wrote: > OK. Is this an approval of the 2nd patch for next stage 1?
Re: [PATCH] Add PR c++/11814 test case to testsuite
On Thu, Feb 11, 2016 at 11:20 PM, Patrick Palka wrote: > Hi Jason, > > Your recent fix for PR c++/10200 seems to have fixed this longstanding > PR too. Should I add its test case to the testsuite and close the PR? Never mind, this PR is no longer resolved following the most recent changes to c++/10200.
Re: [PATCH] Add debug_function_to_file
On Thu, 2016-02-18 at 18:26 +0100, Tom de Vries wrote: > On 18/02/16 16:43, Tom de Vries wrote: > > On 18/02/16 16:27, Richard Biener wrote: > > > > > > I would be nice if we could avoid the ${1,2,3} printouts > > > > > > and value > > > > > > > > > > history > > > > > > > > > > assignments, but I'm not sure how to do that. > > > > > > > > > > > > > > > > > > > > > > Using gdb.parse_and_eval does the trick. > > > > > > > > This updated version uses gdb.parse_and_eval, and adds error > > handling. > > And this updated version adds handling different number of arguments, > and a help text. I think this could be ready for committing. > > Is a bootstrap/regtest useful/necessary? I don't think so; I don't think we have any automated coverage for gdb_hooks.py. Presumably you've tested it by hand. What version of Python do you have embedded in gdb? (IIRC some people have Python 2, others Python 3) > +print ("Too little arguments") Nit: can you change this to "Not enough arguments". > OK for stage4/stage1? Looks reasonable to me.
Re: [PATCH, rs6000] PR 66337: Improve Compliance with Power ABI
> Ulrich Weigand wrote: >> Kevin Nilsen wrote: > >> This patch has bootstrapped and tested on powerpc64le-unknown-linux-gnu and= >> powerpc64be-unknown-linux-gnu (both 32-bit and 64-bit) and=20 >> powerpc64-unknown-freebsd11.0 (big endian) with no regressions. Is it ok to >> fix this on the trunk? >> >> The problem described in PR66337 (https://gcc.gnu.org/bugzilla/show_bug.cgi= >> ?id=3D3D66337) is that compiling for PowerPC targets with the -malign-power= >> command-line option results in invalid field offsets for certain structure= >> members. As identified in the problem report, this results from a macro = >> definition present in both config/rs6000/{freebsd64,linux64}.h, which in bo= >> th cases is introduced by the comment: >> >> /* PowerPC64 Linux word-aligns FP doubles when -malign-power is given. */ >> >> I have consulted various ABI documents, including "64-bit PowerPC ELF Appli= >> cation Binary Interface Supplement 1.9" (http://refspecs.linuxfoundation.or= >> g/ELF/ppc64/PPC-elf64abi.html), "Power Architecture 64-Bit ELF V2 ABI Speci= >> fication" (https://members.openpowerfoundation.org/document/dl/576), and "P= >> ower >> Architecture(R) 32-bit Application Binary Interface Supplement 1.0 - Linux(= >> R) & Embedded" (https://www.power.org/documentation/power-architecture-32-b= >> it-abi-supplement-1-0-embeddedlinuxunified/). I have not been able to find= >> any basis for this comment and thus am concluding that the comment and exi= >> sting implementation are incorrect. >> >> The implemented patch removes the comment and changes the macro definition = >> so that field alignment calculations on 64-bit architectures ignore the -ma= >> lign-power command-line option. With this fix, the test case identified in= >> the PR behaves as was expected by the submitter. > > There seems to be some confusion here. First of all, on Linux and FreeBSD, > the *default* behavior is -malign-natural, which matches what the Linux ABI > specifies. Using -malign-power on *Linux* is an explicit instruction to > the compiler to *deviate* from the documented ABI. > > The only effect that the deviation has on Linux is to change the alignment > requirements for certain structure elements. Your patch removes this change, > making -malign-power fully a no-op on Linux and FreeBSD. This doesn't seem > to be particularly useful ... If you don't want the effect, you can simply > not use that switch. > > To my understanding, the intent of providing that switch was to allow > creating code that is compatible code produced by some other compilers > that do not adhere to the Linux ABI, but some other ABI. In particular, > my understanding is that the *AIX* ABI has these alignment requirements. > And in fact, GCC on *AIX* defaults to -malign-power. > > Looking at PR 66337, the submitter actually refers to the behaviour of > GCC on AIX, so I'm not sure how Linux is even relevant here. (Maybe > there is something wrong in how GCC implements the AIX ABI. But I'm > not really familar with AIX, so I can't help much with that.) AIX does not use natural alignment. For historical reasons, the maximum alignment of double is word alignment. In an attempt to correct the alignment mistake, the AIX POWER ABI increases the alignment of structures who first element is double to double word. XLC increases the alignment of the member but GCC does not. GCC allows use of AIX POWER ABI alignment in ELF for some early customers. The option has nothing to do with Linux ABIs nor embedded ABIs. Thanks for the patch, but it is not addressing the correct problem. The issue is specifically about GCC compatibility with XLC for AIX ABI. Thanks, David
Re: [RFC, PR68580] Handle pthread_create error in tsan testsuite
On 18.02.2016 12:36, Tom de Vries wrote: > On 15/02/16 12:29, Bernd Edlinger wrote: >> Here is a patch that puts each value on it's own 8-byte aligned memory >> location. From my experience with tsan tests, sharing shadow memory >> slots between v and q or o is the most likely explanation for the >> occasional >> inability to spot the race condition on v, thus the test case fails, >> because >> the return code is 0, and the expected output is not found. >> >> Boot-strapped/regression tested on x86_64-linux-gnu. >> >> OK for trunk? >> > > Hi, > > Could you add 'PR testsuite/68580' to the log entry when committing? > Yes, of course, thanks. Could someone take the time and review this patch? I don't think it can cause any trouble for gcc-6 and/or gcc-5 even at stage 4. Is it OK for trunk and gcc-5-branch? Thanks Bernd. > Thanks, > - Tom > >> patch-pr68580.diff >> >> >> 2016-02-15 Bernd Edlinger >> >> * c-c++-common/tsan/pr65400-1.c (v, q, o): Make 8-byte aligned. >> >> --- gcc/testsuite/c-c++-common/tsan/pr65400-1.c.jj2015-03-19 >> 08:53:38.0 +0100 >> +++ gcc/testsuite/c-c++-common/tsan/pr65400-1.c2016-02-15 >> 11:09:18.852320827 +0100 >> @@ -7,9 +7,9 @@ >> #include "tsan_barrier.h" >> >> static pthread_barrier_t barrier; >> -int v; >> -int q; >> -int o; >> +int v __attribute__((aligned(8))); >> +int q __attribute__((aligned(8))); >> +int o __attribute__((aligned(8))); >> extern void baz4 (int *); >> >> __attribute__((noinline, noclone)) int >
Re: [PATCH, rs6000] Fix PR63354
Hi Andreas, Sorry I haven't responded sooner; I was on vacation and have been unpiling things since then. The test case had already been updated since the patch you cited, adding /* { dg-require-effective-target powerpc64 } */ Is this the version you're testing with? Thanks, Bill On Sat, 2016-02-06 at 21:35 +0100, Andreas Schwab wrote: > Bill Schmidt writes: > > > Index: gcc/testsuite/gcc.target/powerpc/pr63354.c > > === > > --- gcc/testsuite/gcc.target/powerpc/pr63354.c (revision 0) > > +++ gcc/testsuite/gcc.target/powerpc/pr63354.c (working copy) > > @@ -0,0 +1,11 @@ > > +/* Verify that we don't stack a frame for leaf functions when using > > + -pg -mprofile-kernel. */ > > + > > +/* { dg-do compile { target { powerpc64*-*-* } } } */ > > +/* { dg-options "-O2 -pg -mprofile-kernel" } */ > > +/* { dg-final { scan-assembler-not "mtlr" } } */ > > + > > +int foo(void) > > +{ > > + return 1; > > +} > > With -m32: > > FAIL: gcc.target/powerpc/pr63354.c (test for excess errors) > Excess errors: > /daten/gcc/gcc-20160205/gcc/testsuite/gcc.target/powerpc/pr63354.c:1:0: > error: -mprofile-kernel not supported in this configuration > > Andreas. >
Re: [RFC, PR68580] Handle pthread_create error in tsan testsuite
On Thu, Feb 18, 2016 at 08:19:22PM +, Bernd Edlinger wrote: > > Could you add 'PR testsuite/68580' to the log entry when committing? > > > > Yes, of course, thanks. > > Could someone take the time and review this patch? > I don't think it can cause any trouble for gcc-6 and/or gcc-5 > even at stage 4. > > Is it OK for trunk and gcc-5-branch? Ok. > >> 2016-02-15 Bernd Edlinger > >> > >> * c-c++-common/tsan/pr65400-1.c (v, q, o): Make 8-byte aligned. > >> > >> --- gcc/testsuite/c-c++-common/tsan/pr65400-1.c.jj2015-03-19 > >> 08:53:38.0 +0100 > >> +++ gcc/testsuite/c-c++-common/tsan/pr65400-1.c2016-02-15 > >> 11:09:18.852320827 +0100 > >> @@ -7,9 +7,9 @@ > >> #include "tsan_barrier.h" > >> > >> static pthread_barrier_t barrier; > >> -int v; > >> -int q; > >> -int o; > >> +int v __attribute__((aligned(8))); > >> +int q __attribute__((aligned(8))); > >> +int o __attribute__((aligned(8))); > >> extern void baz4 (int *); > >> > >> __attribute__((noinline, noclone)) int > > Jakub
[PATCH] Fix c_parser_for_statement for ObjC (PR objc/69844)
Hi! Here is an attempt to fix up the token reclassification after for statement, where we lexed the next token with the declaration from for in scope and need to undo that if it wasn't else. If token->id_kind is C_ID_CLASSNAME (ObjC only), then token->value has changed already, but in that case I think it means we can just keep it as is, it can't be shadowed by the for init declaration (because it would be C_ID_ID or C_ID_TYPENAME? otherwise). Otherwise, this patch tries to model the handling closer to what c_lex_one_token does, i.e. set it to C_ID_ID, except when decl is non-NULL and TYPE_DECL, or for the ObjC case where for init declaration shadowed a class declaration. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-02-18 Jakub Jelinek PR objc/69844 * c-parser.c (c_parser_for_statement): Properly handle ObjC classes in id_kind reclassification. * objc.dg/pr69844.m: New test. --- gcc/c/c-parser.c.jj 2016-02-16 16:29:54.0 +0100 +++ gcc/c/c-parser.c2016-02-18 17:36:55.025067859 +0100 @@ -5887,12 +5887,27 @@ c_parser_for_statement (c_parser *parser { c_token *token = c_parser_peek_token (parser); tree decl = lookup_name (token->value); - if (decl == NULL_TREE || VAR_P (decl)) - /* If DECL is null, we don't know what this token might be. Treat - it as an ID for better diagnostics; we'll error later on. */ - token->id_kind = C_ID_ID; - else if (TREE_CODE (decl) == TYPE_DECL) - token->id_kind = C_ID_TYPENAME; + if (token->id_kind != C_ID_CLASSNAME) + { + token->id_kind = C_ID_ID; + if (decl) + { + if (TREE_CODE (decl) == TYPE_DECL) + token->id_kind = C_ID_TYPENAME; + } + else if (c_dialect_objc ()) + { + tree objc_interface_decl = objc_is_class_name (token->value); + /* Objective-C class names are in the same namespace as +variables and typedefs, and hence are shadowed by local +declarations. */ + if (objc_interface_decl) + { + token->value = objc_interface_decl; + token->id_kind = C_ID_CLASSNAME; + } + } + } } token_indent_info next_tinfo --- gcc/testsuite/objc.dg/pr69844.m.jj 2016-02-18 17:44:40.896624579 +0100 +++ gcc/testsuite/objc.dg/pr69844.m 2016-02-18 17:45:20.465078855 +0100 @@ -0,0 +1,24 @@ +/* PR objc/69844 */ +/* { dg-do compile } */ +/* { dg-options "-std=gnu99" } */ + +@class D; + +void +foo (void) +{ + for (;;) +; + D *d = (id) 0; + (void) d; +} + +void +bar (void) +{ + for (int D = 0; D < 30; D++) +if (1) + ; + D *d = (id) 0; + (void) d; +} Jakub
[C++ PATCH] Fix -Wnonnull-compare warning from dynamic_cast <...> (this) (PR c++/69850)
Hi! And here is a fix for the dynamic_cast issue I've mentioned recently. Alternatively, ifnonnull could build instead NE_EXPR with swapped last two arguments on the COND_EXPR, and then I believe the cp_fold change would not be needed. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Or do you prefer the NE_EXPR variant? 2016-02-18 Jakub Jelinek PR c++/69850 * rtti.c (ifnonnull): Set TREE_NO_WARNING on the EQ_EXPR condition. * cp-gimplify.c (cp_fold) : Preserve TREE_NO_WARNING on comparison if folding swaps arguments. * g++.dg/warn/Wnonnull-compare-4.C: New test. --- gcc/cp/rtti.c.jj2016-02-11 10:50:58.0 +0100 +++ gcc/cp/rtti.c 2016-02-18 18:04:24.611297291 +0100 @@ -507,10 +507,12 @@ get_typeid (tree type, tsubst_flags_t co static tree ifnonnull (tree test, tree result, tsubst_flags_t complain) { - return build3 (COND_EXPR, TREE_TYPE (result), -build2 (EQ_EXPR, boolean_type_node, test, -cp_convert (TREE_TYPE (test), nullptr_node, -complain)), + tree cond = build2 (EQ_EXPR, boolean_type_node, test, + cp_convert (TREE_TYPE (test), nullptr_node, complain)); + /* This is a compiler generated comparison, don't emit + e.g. -Wnonnull-compare warning for it. */ + TREE_NO_WARNING (cond) = 1; + return build3 (COND_EXPR, TREE_TYPE (result), cond, cp_convert (TREE_TYPE (result), nullptr_node, complain), result); } --- gcc/cp/cp-gimplify.c.jj 2016-02-18 11:20:08.0 +0100 +++ gcc/cp/cp-gimplify.c2016-02-18 18:14:26.213996843 +0100 @@ -2101,6 +2101,11 @@ cp_fold (tree x) else x = fold (x); + if (COMPARISON_CLASS_P (TREE_OPERAND (org_x, 0)) + && TREE_NO_WARNING (TREE_OPERAND (org_x, 0)) + && TREE_CODE (x) == TREE_CODE (org_x) + && COMPARISON_CLASS_P (TREE_OPERAND (x, 0))) + TREE_NO_WARNING (TREE_OPERAND (x, 0)) = 1; break; case CALL_EXPR: --- gcc/testsuite/g++.dg/warn/Wnonnull-compare-4.C.jj 2016-02-18 18:03:04.777398778 +0100 +++ gcc/testsuite/g++.dg/warn/Wnonnull-compare-4.C 2016-02-18 18:02:46.414652133 +0100 @@ -0,0 +1,14 @@ +// PR c++/69850 +// { dg-do compile } +// { dg-options "-Wnonnull-compare" } + +struct A { virtual ~A (); int foo (); }; +struct B { virtual ~B () { } }; +struct C : B, A { }; + +int +A::foo () +{ + C *c = dynamic_cast (this); // { dg-bogus "nonnull argument" } + return !c; +} Jakub
Re: [C++ PATCH] Fix -Wnonnull-compare warning from dynamic_cast <...> (this) (PR c++/69850)
On Thu, Feb 18, 2016 at 10:45:46PM +0100, Jakub Jelinek wrote: > Hi! > > And here is a fix for the dynamic_cast issue I've mentioned recently. > Alternatively, ifnonnull could build instead NE_EXPR with swapped last > two arguments on the COND_EXPR, and then I believe the cp_fold change would > not be needed. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > Or do you prefer the NE_EXPR variant? Here is the variant with NE_EXPR, it works on *nonnull* tests, haven't tested it more yet, but will overnight. Personally I think this is better, because folding wants to canonicalize the COND_EXPR with EQ_EXPR and no code in the then branch to NE_EXPR with no code in the else branch, so it means we waste some more GC memory. 2016-02-18 Jakub Jelinek PR c++/69850 * rtti.c (ifnonnull): Set TREE_NO_WARNING on the condition, use NE_EXPR instead of EQ_EXPR and swap last two arguments on COND_EXPR. * g++.dg/warn/Wnonnull-compare-4.C: New test. --- gcc/cp/rtti.c.jj2016-02-11 10:50:58.0 +0100 +++ gcc/cp/rtti.c 2016-02-18 22:47:04.023079812 +0100 @@ -507,12 +507,13 @@ get_typeid (tree type, tsubst_flags_t co static tree ifnonnull (tree test, tree result, tsubst_flags_t complain) { - return build3 (COND_EXPR, TREE_TYPE (result), -build2 (EQ_EXPR, boolean_type_node, test, -cp_convert (TREE_TYPE (test), nullptr_node, -complain)), -cp_convert (TREE_TYPE (result), nullptr_node, complain), -result); + tree cond = build2 (NE_EXPR, boolean_type_node, test, + cp_convert (TREE_TYPE (test), nullptr_node, complain)); + /* This is a compiler generated comparison, don't emit + e.g. -Wnonnull-compare warning for it. */ + TREE_NO_WARNING (cond) = 1; + return build3 (COND_EXPR, TREE_TYPE (result), cond, result, +cp_convert (TREE_TYPE (result), nullptr_node, complain)); } /* Execute a dynamic cast, as described in section 5.2.6 of the 9/93 working --- gcc/testsuite/g++.dg/warn/Wnonnull-compare-4.C.jj 2016-02-18 18:03:04.777398778 +0100 +++ gcc/testsuite/g++.dg/warn/Wnonnull-compare-4.C 2016-02-18 18:02:46.414652133 +0100 @@ -0,0 +1,14 @@ +// PR c++/69850 +// { dg-do compile } +// { dg-options "-Wnonnull-compare" } + +struct A { virtual ~A (); int foo (); }; +struct B { virtual ~B () { } }; +struct C : B, A { }; + +int +A::foo () +{ + C *c = dynamic_cast (this); // { dg-bogus "nonnull argument" } + return !c; +} Jakub
Re: [RFC] [P2] [PR tree-optimization/33562] Lowering more complex assignments.
On 02/18/2016 02:56 AM, Richard Biener wrote: Right. But handling that has never been part of DSE's design goals. Once there's a use, DSE has always given up. Yeah, which is why I in the end said we need a "better" DSE ... Well, I don't see a rewrite/redesign in the mid-term horizon. Having said that... That is, you don't make use of the live_bytes in the ref_maybe_used_by_stmt_p check (you can skip uses of only dead bytes). Not sure if it makes a difference in practice (compared to the cost it would take). Not sure either. It doesn't appear that it would be hard to experiment with that to see if it's worth the effort. My gut feeling is we're not going to see this often, if at all, in practice. Yeah, I think the case we're after and that happens most is sth like a = { aggregate init }; a.a = ...; a.b = ...; ... and what you add is the ability to remove the aggregate init completely. Essentially, yes. What would be nice to have is to remove it partly as well, as for struct { int i; int j; int k; } a = {}; a.i = 1; a.k = 3; we'd like to remove the whole-a zeroing but we need to keep zeroing of a.j. I believe that SRA already has most of the analysis part, what it is lacking is that SRA works not flow-sensitive (it just gathers function-wide data) and that it doesn't consider base objects that have their address taken or are pointer-based. I guess we could look at the live bytes at the end of the loop in dse_possible_dead_store_p and perhaps re-write the original statement. That said, your patch addresses a very specific case nothing else in the compiler handles right now, but it's also quite limited. So evaluating the compile-time impact is important here as well as trying to cover more cases of "partial inits after full inits" optimization. It's fairly narrow. Most of the stuff it's finding are just the clobbers and removing those is totally uninteresting from a code generation standpoint. I'm going to do another round of statistics gathering, filtering out all the clobbers. Some light poking would indicate there's still real world codes where this will help (in libstdc++, not surprisingly), but it's not going to be as broad as I'd like. I don't believe we need to rush this into GCC 6. Not looking to rush this -- it seems simple enough and fixes a long standing regression. I think it's appropriate, but I'm not going to get bent out of shape if you disagree and we table it until GCC 7. jeff
Re: [PATCH] Add debug_function_to_file
On 18/02/16 16:27, Richard Biener wrote: Attached is what I have for now, it works if you call it like (gdb) dot-fn cfun (gdb) dot-fn cfun, 1<<6 w/o that arg parsing;) I'll play with it some more tomorrow. This version: - uses arg parsing - adds error handling - uses a temp file instead of a pipe - uses python os.system to call dot - adds documentation Thanks, - Tom Add dot-fn to gdbhooks.py 2016-02-18 Richard Biener * graph.c: Include dumpfile.h. (print_graph_cfg): Split into three overloads. * gdbhooks.py (dot-fn): New command. --- gcc/gdbhooks.py | 74 + gcc/graph.c | 32 + 2 files changed, 102 insertions(+), 4 deletions(-) diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py index cffac2a..bc2407f 100644 --- a/gcc/gdbhooks.py +++ b/gcc/gdbhooks.py @@ -133,6 +133,7 @@ Instead (for now) you must access m_vecdata: import os.path import re import sys +import tempfile import gdb import gdb.printing @@ -653,4 +654,77 @@ class debug_function_to_file(gdb.Command): debug_function_to_file() +class dot_fn(gdb.Command): +""" +A custom command to show a gimple/rtl function control flow graph. +By default, it show the current function, but the function can also be +specified. + +Examples of use: + (gdb) dot-fn + (gdb) dot-fn cfun + (gdb) dot-fn cfun 0 + (gdb) dot-fn cfun dump_flags +""" +def __init__(self): +gdb.Command.__init__(self, 'dot-fn', gdb.COMMAND_USER) + +def invoke(self, arg, from_tty): +# Parse args, check number of args +args = gdb.string_to_argv(arg) +if len(args) > 2: +print("Too many arguments") +return + +# Set func +if len(args) >= 1: +funcname = args[0] +printfuncname = "function %s" % funcname +else: +funcname = "cfun" +printfuncname = "current function" +func = gdb.parse_and_eval(funcname) +if func == 0: +print("Could not find %s" % printfuncname) +return +func = "(struct function *)%s" % func + +# Set flags +if len(args) >= 2: +flags = gdb.parse_and_eval(args[1]) +else: +flags = 0 + +# Get temp file +f = tempfile.NamedTemporaryFile(delete=False) +filename = f.name + +# Close and reopen temp file to get C FILE* +f.close() +fp = gdb.parse_and_eval("fopen (\"%s\", \"w\")" % filename) +if fp == 0: +print("Cannot open temp file") +return +fp = "(FILE *)%u" % fp + +# Write graph to temp file +_ = gdb.parse_and_eval("start_graph_dump (%s, \"\")" % fp) +_ = gdb.parse_and_eval("print_graph_cfg (%s, %s, %u)" + % (fp, func, flags)) +_ = gdb.parse_and_eval("end_graph_dump (%s)" % fp) + +# Close temp file +ret = gdb.parse_and_eval("fclose (%s)" % fp) +if ret != 0: +print("Could not close temp file: %s" % filename) +return + +# Show graph in temp file +os.system("dot -Tx11 %s" % filename) + +# Remove temp file +os.remove(filename) + +dot_fn() + print('Successfully loaded GDB hooks for GCC') diff --git a/gcc/graph.c b/gcc/graph.c index 1b28c67..dd5bc4e 100644 --- a/gcc/graph.c +++ b/gcc/graph.c @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3. If not see #include "cfganal.h" #include "cfgloop.h" #include "graph.h" +#include "dumpfile.h" /* DOT files with the .dot extension are recognized as document templates by a well-known piece of word processing software out of Redmond, WA. @@ -272,14 +273,13 @@ draw_cfg_edges (pretty_printer *pp, struct function *fun) subgraphs right for GraphViz, which requires nodes to be defined before edges to cluster nodes properly. */ -void -print_graph_cfg (const char *base, struct function *fun) +void DEBUG_FUNCTION +print_graph_cfg (FILE *fp, struct function *fun) { - const char *funcname = function_name (fun); - FILE *fp = open_graph_file (base, "a"); pretty_printer graph_slim_pp; graph_slim_pp.buffer->stream = fp; pretty_printer *const pp = &graph_slim_pp; + const char *funcname = function_name (fun); pp_printf (pp, "subgraph \"cluster_%s\" {\n" "\tstyle=\"dashed\";\n" "\tcolor=\"black\";\n" @@ -289,6 +289,30 @@ print_graph_cfg (const char *base, struct function *fun) draw_cfg_edges (pp, fun); pp_printf (pp, "}\n"); pp_flush (pp); +} + +/* Overload with additional flag argument. */ + +void DEBUG_FUNCTION +print_graph_cfg (FILE *fp, struct function *fun, int flags) +{ + int saved_dump_flags = dump_flags; + dump_flags = flags; + print_graph_cfg (fp, fun); + dump_flags = saved_dump_flags; +} + + +/* Print a graphical representation of the CFG of function FUN. + First print all basic blocks. Draw all edges a
Re: [PATCH] 19705 - -fno-branch-count-reg doesn't prevent decrement and branch instructions on a count register
Ping: https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00809.html On 02/11/2016 02:58 PM, Martin Sebor wrote: The more than decennnial rtl-optimization/19705 - -fno-branch-count-reg doesn't prevent decrement and branch instructions on a count register points out that the documentation of the option leads one to expect that it prevents the decrement and branch instruction from appearing in the instruction stream. This isn't the case The option prevents a dedicated pass from running that introduces such instructions, but it doesn't prevent other passes from introducing it. The attached updates the documentation to clarify this. Martin
Re: [wwwdocs] Describe behavior of -flifetime-dse in class constructors
> Hello. > > As I finally hunted issue in Firefox that was responsible for start-up > segfault, I would like > to describe a new behavior of the compiler that emits clobbers to class > constructors (w/ -flifetime-dse). > As also Richi spotted quite similar issue in openjade package, I think it > worth for mentioning in porting: Hi, thank you for working this out and writting summary. I think in a shorter form this would make excellent entry for changes.html, too. We tell about the new feature and warn users about fallout that is always good. Honza > > Ok? > Thanks, > Martin > Index: htdocs/gcc-6/porting_to.html > === > RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/porting_to.html,v > retrieving revision 1.14 > diff --unified -r1.14 porting_to.html > --- htdocs/gcc-6/porting_to.html 14 Feb 2016 13:13:43 - 1.14 > +++ htdocs/gcc-6/porting_to.html 16 Feb 2016 14:41:10 - > @@ -316,7 +316,7 @@ > > > > -Finally, the type and mangling of flexible array members has changed > +Furthermore, the type and mangling of flexible array members has changed > from previous releases. While in GCC 5 and prior the type of a flexible > array member is an array of zero elements (a GCC extension), in GCC 6 it > is that of an array of an unspecified bound (i.e., T[] as opposed > @@ -324,6 +324,50 @@ > -fabi-version or -Wabi option to disable or warn about. > > > + > +Finally, the C++ compiler (with enabled -flifetime-dse) > +has been more aggressive in dead-store elimination in situations where > +a memory store to a location precedes a constructor to the > +memory location. Described situation can be commonly found in programs > +which zero a memory that is eventually passed to a placement new operator: > + > + > +#include> +#include > +#include > + > +struct A > +{ > + A () {} > + void *operator new (size_t s) > + { > +void *ptr = malloc (s); > +memset (ptr, 0, s); > +return ptr; > + } > + > + int value; > +}; > + > +A * > +__attribute__ ((noinline)) > +build (void) > +{ > + return new A (); > +} > + > +int main() > +{ > + A *a = build (); > + assert (a->value == 0); /* Use of uninitialized value */ > + free (a); > +} > + > + > +If the program cannot be fixed to remove the undefined behavior then > +the option -fno-lifetime-dse can be used to disable > +this optimization. > + > -Wmisleading-indentation > > A new warning -Wmisleading-indentation was added
Re: RFC: [Patch, PR Bug 60818] - ICE in validate_condition_mode on powerpc*-linux-gnu* ]
On Thu, Feb 18, 2016 at 03:43:07AM -0600, Segher Boessenkool wrote: > Either combine should delete the note (my current patch), or it can Works for me. I'm not sure I'd want to promise that combine won't ever create what you call "invalid RTL", in notes. -- Alan Modra Australia Development Lab, IBM
Re: C++ PATCHes for abi_tag bugs
On 01/31/2016 06:49 AM, Jason Merrill wrote: These patches fix a couple of abi_tag bugs that were reported to me recently. ... The second is an issue where the multiple-initialization guard for a tagged variable was not itself tagged. That didn't completely fix the issue. Trying again: commit 09ebcab51e037bc236bb804ca5cc5d27a2f1fb9b Author: Jason Merrill Date: Fri Feb 19 00:51:35 2016 -0500 * mangle.c (maybe_check_abi_tags): Add for_decl parm. Call mangle_decl. (mangle_decl): Call maybe_check_abi_tags for function scope. (mangle_guard_variable): Call maybe_check_abi_tags here. (write_guarded_var_name): Not here. diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c index 410c7f4..5d38373 100644 --- a/gcc/cp/mangle.c +++ b/gcc/cp/mangle.c @@ -223,6 +223,7 @@ static void write_local_name (tree, const tree, const tree); static void dump_substitution_candidates (void); static tree mangle_decl_string (const tree); static int local_class_index (tree); +static void maybe_check_abi_tags (tree, tree = NULL_TREE); /* Control functions. */ @@ -3599,6 +3600,9 @@ mangle_decl (const tree decl) { gcc_assert (TREE_CODE (decl) != TYPE_DECL || !no_linkage_check (TREE_TYPE (decl), true)); + if (abi_version_at_least (10)) + if (tree fn = decl_function_context (decl)) + maybe_check_abi_tags (fn, decl); id = get_mangled_id (decl); } SET_DECL_ASSEMBLER_NAME (decl, id); @@ -3937,26 +3941,39 @@ mangle_conv_op_name_for_type (const tree type) /* Handle ABI backwards compatibility for past bugs where we didn't call check_abi_tags in places where it's needed: call check_abi_tags and warn if - it makes a difference. */ + it makes a difference. If FOR_DECL is non-null, it's the declaration + that we're actually trying to mangle; if it's null, we're mangling the + guard variable for T. */ static void -maybe_check_abi_tags (tree t) +maybe_check_abi_tags (tree t, tree for_decl) { + if (DECL_ASSEMBLER_NAME_SET_P (t)) +return; + tree attr = lookup_attribute ("abi_tag", DECL_ATTRIBUTES (t)); tree oldtags = NULL_TREE; if (attr) oldtags = TREE_VALUE (attr); - check_abi_tags (t); + mangle_decl (t); if (!attr) attr = lookup_attribute ("abi_tag", DECL_ATTRIBUTES (t)); if (attr && TREE_VALUE (attr) != oldtags && abi_version_crosses (10)) -warning_at (DECL_SOURCE_LOCATION (t), OPT_Wabi, - "the mangled name of the initialization guard variable for" - "%qD changes between -fabi-version=%d and -fabi-version=%d", - t, flag_abi_version, warn_abi_version); +{ + if (for_decl) + warning_at (DECL_SOURCE_LOCATION (for_decl), OPT_Wabi, + "the mangled name of %qD changes between " + "-fabi-version=%d and -fabi-version=%d", + for_decl, flag_abi_version, warn_abi_version); + else + warning_at (DECL_SOURCE_LOCATION (t), OPT_Wabi, + "the mangled name of the initialization guard variable for" + "%qD changes between -fabi-version=%d and -fabi-version=%d", + t, flag_abi_version, warn_abi_version); +} } /* Write out the appropriate string for this variable when generating @@ -3971,15 +3988,7 @@ write_guarded_var_name (const tree variable) to the reference, not the temporary. */ write_string (IDENTIFIER_POINTER (DECL_NAME (variable)) + 4); else -{ - /* Before ABI v10 we were failing to call check_abi_tags here. So if - we're in pre-10 mode, wait until after write_name to call it. */ - if (abi_version_at_least (10)) - maybe_check_abi_tags (variable); - write_name (variable, /*ignore_local_scope=*/0); - if (!abi_version_at_least (10)) - maybe_check_abi_tags (variable); -} +write_name (variable, /*ignore_local_scope=*/0); } /* Return an identifier for the name of an initialization guard @@ -3988,6 +3997,8 @@ write_guarded_var_name (const tree variable) tree mangle_guard_variable (const tree variable) { + if (abi_version_at_least (10)) +maybe_check_abi_tags (variable); start_mangling (variable); write_string ("_ZGV"); write_guarded_var_name (variable); diff --git a/gcc/testsuite/g++.dg/abi/abi-tag16a.C b/gcc/testsuite/g++.dg/abi/abi-tag16a.C index b02e856..12fe312 100644 --- a/gcc/testsuite/g++.dg/abi/abi-tag16a.C +++ b/gcc/testsuite/g++.dg/abi/abi-tag16a.C @@ -1,4 +1,4 @@ -// { dg-options "-fabi-version=9 -Wabi" } +// { dg-options "-fabi-version=9" } // { dg-final { scan-assembler "_ZGVZN1N1FEvE4Name" } } namespace std { __extension__ inline namespace __cxx11 __attribute__((abi_tag("cxx11"))) { @@ -10,7 +10,7 @@ namespace std { namespace N { inline void F() { { - static std::String Name; // { dg-warning "mangled name" } + static std::String Name; } } void F2() { diff --git a/gcc/testsuite/g++.dg/abi/abi-tag18.C b/gcc/testsuite/g++.dg/abi/abi-tag18.C new file mode 100644 index 000..89ee737 --- /dev/null +++ b/gcc/testsuite/g++.dg/abi/abi-tag18.C @@ -0,0 +1,20 @@
Re: [PATCH] Fix up avx512* regressions caused by the cse.c one-liner change (PR target/69671)
Hello Jakub, On 17 Feb 17:46, Jakub Jelinek wrote: > Hi! > > As I wrote in the PR, fwprop is able to forward CONST0_RTX back into > instructions even if CSE optimized them, but the problem in that case is > that for vector_move_operand "0C" operands if they appear inside of > (vec_select ... (parallel [(const_int 0) ... ])) the result is also > simplified, so one gets instead another CONST0_RTX (in the mode of > the VEC_SELECT). Because the patterns expect a vec_select and "C" operand > inside of it, it is therefore not matched, it maybe attached as REG_EQUAL > note. I went through other vector_move_operand "0C" and "0C,0" operands > and I don't think they suffer from similar problem, if fwprop or cprop etc. > attempts to propagate a constant into them, it shouldn't be possible it will > be simplified into something different. > > Anyway, the fix IMHO is to just duplicate the affected 8 define_insns > with the simplification applied. IMHO once we know it is {z}, it is worth > to keep it as {z}, there is no benefit to allow the RA to use "0" > operand instead. > > Bootstrapped/regtested on x86_64-linux and i686-linux, on both fixes > the testcases that started failing with r233133, ok for trunk? Patch is ok for trunk. Thanks a lot for fixing this! -- Regards, K
Re: [C++ PATCH] Fix -Wnonnull-compare warning from dynamic_cast <...> (this) (PR c++/69850)
On Thu, Feb 18, 2016 at 10:51:48PM +0100, Jakub Jelinek wrote: > On Thu, Feb 18, 2016 at 10:45:46PM +0100, Jakub Jelinek wrote: > > Hi! > > > > And here is a fix for the dynamic_cast issue I've mentioned recently. > > Alternatively, ifnonnull could build instead NE_EXPR with swapped last > > two arguments on the COND_EXPR, and then I believe the cp_fold change would > > not be needed. > > > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > Or do you prefer the NE_EXPR variant? > > Here is the variant with NE_EXPR, it works on *nonnull* tests, haven't > tested it more yet, but will overnight. > Personally I think this is better, because folding wants to canonicalize > the COND_EXPR with EQ_EXPR and no code in the then branch to NE_EXPR > with no code in the else branch, so it means we waste some more GC memory. And even this patch passed bootstrap/regtest on x86_64-linux and i686-linux. > 2016-02-18 Jakub Jelinek > > PR c++/69850 > * rtti.c (ifnonnull): Set TREE_NO_WARNING on the condition, use > NE_EXPR instead of EQ_EXPR and swap last two arguments on COND_EXPR. > > * g++.dg/warn/Wnonnull-compare-4.C: New test. Jakub