[PATCH] Fix PR56150
This fixes PR56150 - mixed store/loads are not equal. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2013-01-30 Richard Biener PR tree-optimization/56150 * tree-ssa-tail-merge.c (gimple_equal_p): Properly handle mixed store non-store stmts. * gcc.dg/torture/pr56150.c: New testcase. Index: gcc/tree-ssa-tail-merge.c === *** gcc/tree-ssa-tail-merge.c (revision 195574) --- gcc/tree-ssa-tail-merge.c (working copy) *** gimple_equal_p (same_succ same_succ, gim *** 1119,1135 case GIMPLE_ASSIGN: lhs1 = gimple_get_lhs (s1); lhs2 = gimple_get_lhs (s2); ! if (gimple_vdef (s1)) ! { ! if (vn_valueize (gimple_vdef (s1)) != vn_valueize (gimple_vdef (s2))) ! return false; ! if (TREE_CODE (lhs1) != SSA_NAME ! && TREE_CODE (lhs2) != SSA_NAME) ! return true; ! } ! return (TREE_CODE (lhs1) == SSA_NAME ! && TREE_CODE (lhs2) == SSA_NAME ! && vn_valueize (lhs1) == vn_valueize (lhs2)); case GIMPLE_COND: t1 = gimple_cond_lhs (s1); --- 1119,1132 case GIMPLE_ASSIGN: lhs1 = gimple_get_lhs (s1); lhs2 = gimple_get_lhs (s2); ! if (TREE_CODE (lhs1) != SSA_NAME ! && TREE_CODE (lhs2) != SSA_NAME) ! return (vn_valueize (gimple_vdef (s1)) ! == vn_valueize (gimple_vdef (s2))); ! else if (TREE_CODE (lhs1) == SSA_NAME ! && TREE_CODE (lhs2) == SSA_NAME) ! return vn_valueize (lhs1) == vn_valueize (lhs2); ! return false; case GIMPLE_COND: t1 = gimple_cond_lhs (s1); Index: gcc/testsuite/gcc.dg/torture/pr56150.c === *** gcc/testsuite/gcc.dg/torture/pr56150.c (revision 0) --- gcc/testsuite/gcc.dg/torture/pr56150.c (working copy) *** *** 0 --- 1,17 + /* { dg-do compile } */ + + struct { + int f4; + } g1; + + long g2; + + volatile long g3; + + void func_1 () + { + if (g2) + g1 = g1; + else + g3; + }
[PATCH] More PR56150 slow compile-time fixes
This reverts the change to go into loop-closed SSA form for virtual operands. Nothing relies on that, and the verifier doesn't verify that it holds. Furthermore SSA updating will destroy loop-closed virtual SSA form very quickly. As the PR shows it can be quite costly to go into loop-closed SSA form for virtuals (50% of compile-time). So the following patch reverts the change for now. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2013-01-31 Richard Biener PR tree-optimization/56150 * tree-ssa-loop-manip.c (find_uses_to_rename_stmt): Do not visit virtual operands. (find_uses_to_rename_bb): Likewise. Index: gcc/tree-ssa-loop-manip.c === *** gcc/tree-ssa-loop-manip.c (revision 195574) --- gcc/tree-ssa-loop-manip.c (working copy) *** find_uses_to_rename_stmt (gimple stmt, b *** 402,408 if (is_gimple_debug (stmt)) return; ! FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_ALL_USES) find_uses_to_rename_use (bb, var, use_blocks, need_phis); } --- 402,408 if (is_gimple_debug (stmt)) return; ! FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_USE) find_uses_to_rename_use (bb, var, use_blocks, need_phis); } *** find_uses_to_rename_bb (basic_block bb, *** 422,429 for (bsi = gsi_start_phis (e->dest); !gsi_end_p (bsi); gsi_next (&bsi)) { gimple phi = gsi_stmt (bsi); ! find_uses_to_rename_use (bb, PHI_ARG_DEF_FROM_EDGE (phi, e), !use_blocks, need_phis); } for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi)) --- 422,430 for (bsi = gsi_start_phis (e->dest); !gsi_end_p (bsi); gsi_next (&bsi)) { gimple phi = gsi_stmt (bsi); ! if (! virtual_operand_p (gimple_phi_result (phi))) ! find_uses_to_rename_use (bb, PHI_ARG_DEF_FROM_EDGE (phi, e), ! use_blocks, need_phis); } for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
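For background, loop-closed SSA form requires that any value defined inside a loop and used outside it pass through a single-argument PHI node on the loop exit. A minimal hand-written illustration (the GIMPLE in the comment is approximate, not compiler output):

int
last_positive (int *a, int n)
{
  int last = -1;
  int i;

  for (i = 0; i < n; i++)
    if (a[i] > 0)
      last = i;
  /* In loop-closed SSA the use of 'last' below does not reference the
     loop-internal SSA name directly; a PHI like
       last_lcssa = PHI <last_3(loop exit edge)>
     is inserted on the exit edge and this return uses last_lcssa.  The
     reverted change had been creating such exit PHIs for the virtual
     operand (.MEM) as well, which the patch above stops doing.  */
  return last;
}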
Re: [PATCH][RFC] Add -fno-aggressive-loop-optimizations
On Wed, 30 Jan 2013, Pat Haugen wrote: > On 01/29/2013 04:53 AM, Richard Biener wrote: > > I'm curious about the effect of -fno-aggressive-loop-optimizations > > on SPEC CPU 2006 numbers (not curious enough to try for myself > > though). Both on extra PASSes for official latest sources > > (I have no access to those) and on performance. > The patch/option results in both 464.h264ref and 416.gamess passing (as opposed > to infinite loop). As for performance, I didn't see any difference outside of > noise range for both 32-bit and 64-bit runs on PowerPC. Ok, I'll go ahead and apply the patch then. Richard.
Re: [PATCH][RFC] Add -fno-aggressive-loop-optimizations
On Thu, 31 Jan 2013, Richard Biener wrote: > On Wed, 30 Jan 2013, Pat Haugen wrote: > > > On 01/29/2013 04:53 AM, Richard Biener wrote: > > > I'm curious about the effect of -fno-aggressive-loop-optimizations > > > on SPEC CPU 2006 numbers (not curious enough to try for myself > > > though). Both on extra PASSes for official latest sources > > > (I have no access to those) and on performance. > > The patch/option results in both 464.h264ref and 416.gamess passing (as > > opposed > > to infinite loop). As for performance, I didn't see any difference outside > > of > > noise range for both 32-bit and 64-bit runs on PowerPC. > > Ok, I'll go ahead and apply the patch then. Done. I propose the following addition to changes.html. Ok? Thanks, Richard. 2013-01-31 Richard Biener * changes.html: Mention -fno-aggressive-loop-optimizations. Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v retrieving revision 1.89 diff -u -r1.89 changes.html --- changes.html 27 Jan 2013 13:44:12 - 1.89 +++ changes.html 31 Jan 2013 09:05:20 - @@ -30,6 +30,13 @@ The G++ namespace association extension, __attribute ((strong)), has been deprecated. Inline namespaces should be used instead. +GCC now uses a more aggressive analysis to derive an upper bound for +the number of iterations of loops using constraints imposed by language +standards. This may cause non-conforming programs to no longer work as +expected, such as SPEC CPU 2006 464.h264ref and 416.gamess. A new +option, -fno-aggressive-loop-optimizations, was added +to disable this aggressive analysis. + On ARM, a bug has been fixed in GCC's implementation of the AAPCS rules for the layout of vectors that could lead to wrong code being generated. Vectors larger than 8 bytes in size are now by default
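For context, a hand-written illustration (not code from SPEC) of the kind of non-conforming loop the new analysis affects:

int d[16];

void
f (void)
{
  int i;

  /* Accessing d[i] is undefined behavior once i reaches 16, so GCC may
     deduce an upper bound of 16 iterations and simplify or remove the
     exit test i != 32, turning a loop that non-conforming code expects
     to stop at 32 into an endless one.  Compiling with
     -fno-aggressive-loop-optimizations disables this deduction.  */
  for (i = 0; i != 32; i++)
    d[i] = 0;
}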
Re: [patch libiberty]: Fix PR 543413
Kai Tietz writes: > this patch fixes wrong handling of cases where the bitness of size_t is > wider than 32-bit. > > ChangeLog > > 2013-01-30 Kai Tietz > > PR other/543413 ^ this is clearly wrong (6 digits) > * md5.c (md5_process_block): Handle case that size_t is > a wider integer scalar than a 32-bit unsigned integer. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
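The md5.c change itself is not quoted in this thread; as a sketch of the underlying issue, the md5 context keeps the processed length in two 32-bit words, and folding in a size_t must not assume it fits in 32 bits (hand-written, with illustrative names, not the actual libiberty code):

#include <stddef.h>

void
add_length (unsigned int total[2], size_t len)
{
  unsigned int lo = (unsigned int) (len & 0xffffffff);

  total[0] = (total[0] + lo) & 0xffffffff;
  if (total[0] < lo)
    total[1]++;                       /* carry out of the low word */
  /* Fold in the high bits of len; the double shift avoids undefined
     behavior when size_t is exactly 32 bits wide.  */
  total[1] += (unsigned int) ((len >> 16) >> 16);
}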
Re: SLP for vectors
On Wed, Jan 30, 2013 at 9:45 PM, Marc Glisse wrote: > On Tue, 29 Jan 2013, Richard Biener wrote: > >> On Sun, Jan 27, 2013 at 4:28 PM, Marc Glisse wrote: >>> >>> Hello, >>> >>> this message is to check that I am not doing something absurd and ask for >>> a >>> bit of advice. >>> >>> In the attached patch, I let SLP recognize vector loads/stores just like >>> it >>> recognizes those in an array. It has a few issues: the cost of the >>> vectorized version is overestimated, the base object is wrong (doesn't >>> strip >>> the bit_field_ref, but since the base address is right and the base >>> object >>> is almost unused...), but most importantly it only works if the vectors >>> have >>> their address taken, otherwise (after asking gimple_vuse?) SLP doesn't >>> detect their use as loads or stores at all. >> >> >> Yes, if they have not their address taken they are not loads. >> >>> Also, it only works if you write >>> result[0]=..., result[1]=... and does not recognize a constructor as a >>> series of stores. >>> >>> Is slowly extending SLP the right way to go? Should >>> get_references_in_stmt >>> report vector element accesses? >> >> >> It does if it is a memory access. >> >> You didn't provide a new testcase so it's hard for me to guess what you >> are trying to do. I suppose you want to vectorize >> >> w[0] = v[0] + v[0]; >> w[1] = v[1] + v[1]; >> >> into >> >> w = v + v; >> >> As it would work if w and v are array accesses instead of vector >> subscripts? > > > Yes, sorry for not being more precise. Vectorization that works for an array > should work (even better) for a vector. > > >> Note that similar issues (and bugreports) exist for complex component >> accesses. > > > One extension at a time :-) > > >> As of handling non-memory BIT_FIELD_REFs - I suggest to split out >> the test of whether a stmt is considered a "load" for the purpose of >> vectorization and simply special-case this therein, not requiring a VUSE. > > > Ok. The hardest part will be preventing the pass from creating ADDR_EXPR of > those vectors, or at least folding them before the pass is done. Oh ... I suppose the "loads"/"stores" need simply be specially handled. > >> I suppose the data-ref analysis parts are not strictly required, nor >> the get_addr_base_and_unit_offset_1 parts? > > > Actually it is necessary (at least the get_addr_base_and_unit_offset_1 part > is) otherwise I get a gimple verification failure because we have an > ADDR_EXPR of a BIT_FIELD_REF. Hm. Certainly handling BIT_FIELD_REF in get_addr_base_and_unit_offset_1 is ok, but we should ensure by other means that the address of a BIT_FIELD_REF is never taken ... >> They should be split out >> and separately tested anyway. The data-ref part at least will confuse >> analysis of 'a[0]' or 'a[1]' vs. 'a', but I suppose independent of the >> patch. > > > Ah... Could you be more specific? Are you thinking about arrays of vectors? No, I was thinking of code that accesses both the whole vector and the vector piecewise. Mixing this kind of accesses will confuse data-ref analysis I believe. But it also will disable vectorization because there are already vector types in the IL ... so it's probably a minor issue for now. Richard. > Thanks for the help, > > -- > Marc Glisse
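For concreteness, a minimal hand-written example (using GCC's generic vector extension; not a testcase from the patch) of the access pattern under discussion:

typedef double v2df __attribute__ ((vector_size (16)));

void
f (v2df *__restrict w, v2df *__restrict v)
{
  /* Element-wise vector subscripts; the goal is for SLP to rewrite the
     two scalar lanes into the single vector statement *w = *v + *v.  */
  (*w)[0] = (*v)[0] + (*v)[0];
  (*w)[1] = (*v)[1] + (*v)[1];
}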
Re: [patch libiberty]: Fix PR 543413
2013/1/31 Rainer Orth : > Kai Tietz writes: > >> this patch fixes wrong handling of cases where the bitness of size_t is >> wider than 32-bit. >> >> ChangeLog >> >> 2013-01-30 Kai Tietz >> >> PR other/543413 > > ^ this is clearly wrong (6 digits) > >> * md5.c (md5_process_block): Handle case that size_t is >> a wider integer scalar than a 32-bit unsigned integer. > > Rainer > > -- > - > Rainer Orth, Center for Biotechnology, Bielefeld University Thanks, but your heads-up is too late; I corrected this already. Cheers, Kai
[PATCH] Avoid .debug_loc loclist entries that look like loclist terminators (PR debug/56154)
Hi! As the pr56154-{1,2,3}.c testcases show, sometimes we start a location list with an empty range (either due to PR49382, or because the first insn in the function is inline asm (or many of them) which has zero size and the compiler doesn't know it). If !have_multiple_function_sections, we emit .debug_loc addresses relative to .Ltext0 symbol and DW_AT_low_pc of the CU is .Ltext0, otherwise they are emitted as absolute addresses, there is DW_AT_ranges on the CU and DW_AT_low_pc is 0. If an empty range, either known by the compiler or unknown (because of empty inline asm, or say inline asm which is .section someothersection; emits lots of stuff there; .previous) is emitted for the beginning of the first function emitted for the CU, then we can end up with e.g. .LVL0 - .Ltext0 .LVL0 - .Ltext0 range where .LVL0 == .Ltext0, or say .LVL0 - .Ltext0 .LVL23 - .Ltext0 range where .LVL0 == .LVL1 == ... == .LVL23 == .Ltext0. Unfortunately that is 0 0, and a 0, 0 pair is the location list terminator entry rather than a valid empty range. In that case, neither the empty range nor any ranges after it in the location list are usable by the debug info consumer. E.g. readelf -wo then complains there is a garbage hole in the .debug_loc section. Fixed by checking for this on the first function, and if any empty range is detected at the beginning of the function in some location list, we force have_multiple_function_sections (and thus, DW_AT_ranges for the CU, DW_AT_low_pc 0 and absolute .debug_loc addresses). This patch doesn't seem to affect .debug_{info,loc,ranges} sizes on stage3 cc1plus binaries on either x86_64-linux or i686-linux, so I guess it doesn't hit that often, and even when it hits, it results in a tiny decrease of size of .debug_info, tiny growth in .debug_ranges, no growth in .debug_loc, just possibly many relocations against .debug_loc (but that affects just *.o/*.a files). The alternative discussed was to force in such loclists a base selection entry, but that would be more work on the gcc side and .debug_loc size could grow more. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2013-01-31 Jakub Jelinek PR debug/56154 * dwarf2out.c (dwarf2_debug_hooks): Set end_function hook to dwarf2out_end_function. (in_first_function_p, maybe_at_text_label_p, first_loclabel_num_not_at_text_label): New variables. (dwarf2out_var_location): In the first function find out lowest loclabel_num N where .LVLN is known not to be equal to .Ltext0. (find_empty_loc_ranges_at_text_label, dwarf2out_end_function): New functions. * gcc.dg/guality/pr56154-1.c: New test. * gcc.dg/guality/pr56154-2.c: New test. * gcc.dg/guality/pr56154-3.c: New test. * gcc.dg/guality/pr56154-4.c: New test. * gcc.dg/guality/pr56154-aux.c: New file. --- gcc/dwarf2out.c.jj 2013-01-11 09:02:48.0 +0100 +++ gcc/dwarf2out.c 2013-01-30 16:18:58.362552894 +0100 @@ -2351,6 +2351,7 @@ static void dwarf2out_imported_module_or static void dwarf2out_abstract_function (tree); static void dwarf2out_var_location (rtx); static void dwarf2out_begin_function (tree); +static void dwarf2out_end_function (unsigned int); static void dwarf2out_set_name (tree, tree); /* The debug hooks structure.
*/ @@ -2378,7 +2379,7 @@ const struct gcc_debug_hooks dwarf2_debu #endif dwarf2out_end_epilogue, dwarf2out_begin_function, - debug_nothing_int, /* end_function */ + dwarf2out_end_function, /* end_function */ dwarf2out_function_decl, /* function_decl */ dwarf2out_global_decl, dwarf2out_type_decl, /* type_decl */ @@ -20627,6 +20628,14 @@ dwarf2out_set_name (tree decl, tree name add_name_attribute (die, dname); } +/* True if before or during processing of the first function being emitted. */ +static bool in_first_function_p = true; +/* True if loc_note during dwarf2out_var_location call might still be + before first real instruction at address equal to .Ltext0. */ +static bool maybe_at_text_label_p = true; +/* One above highest N where .LVLN label might be equal to .Ltext0 label. */ +static unsigned int first_loclabel_num_not_at_text_label; + /* Called by the final INSN scan whenever we see a var location. We use it to drop labels in the right places, and throw the location in our lookup table. */ @@ -20734,6 +20743,45 @@ dwarf2out_var_location (rtx loc_note) ASM_OUTPUT_DEBUG_LABEL (asm_out_file, "LVL", loclabel_num); loclabel_num++; last_label = ggc_strdup (loclabel); + /* See if loclabel might be equal to .Ltext0. If yes, +bump first_loclabel_num_not_at_text_label. */ + if (!have_multiple_function_sections + && in_first_function_p + && maybe_at_text_label_p) + { + static rtx last_start; + rtx insn; + for (insn = loc_note; insn; insn = previous_insn
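A hand-written sketch of the triggering shape (the actual pr56154 testcases are not quoted above): a zero-size inline asm at the start of the first function in the CU, so the first .LVLN label can coincide with .Ltext0:

/* Compile with -g -O2.  The asm is empty, but the compiler cannot know
   its size, so the location range for 'x' can start at an .LVL0 label
   whose address equals .Ltext0, producing the degenerate 0, 0 loclist
   entry described above.  */
int
first (int x)
{
  __asm__ ("" : "+r" (x));
  return x;
}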
[PATCH] If possible, include range of profile hunk before prologue in .debug_loc ranges (PR debug/54793)
Hi! The Linux kernel started (recently?) using -mfentry -p on x86_64/i686 for some instrumentation, I believe they normally overwrite the -mfentry prologues with nops and only when some function needs to be instrumented, overwrite it with some call or jump. The problem in this PR is that var-tracking notes that are meant to live on the very first insn in the function (info about fn arguments and the like) no longer do so, they all start after the profile prologue, so when gdb/systemtap or a similar debug info consumer inspects debug info at the beginning of the function, it thinks the arguments are optimized away. This patch attempts to extend those ranges over the profiler snippet before prologue on some targets. I'm not doing it everywhere, because some targets emit some assembly in their targetm.asm_out.function_prologue hook even for HAVE_prologue, and it is likely the profiler hook isn't prepared to handle it (think about e.g. ia64 or a few other targets). Also, sometimes the profiler code might e.g. use push/pop in it and invalidate potential sp based locations; we could stop doing this if cfun->returns_struct or cfun->static_chain_decl, but Alex argued that it isn't any different from inline asm which uses push/pop inside of it. The debug info there is also correct only on the first insn in the inline asm and right after the inline asm, not necessarily inside of it, and similarly with this patch it isn't necessarily right inside of the profiler snippet (the kernel won't care, as it overwrites the snippet with nops or call/jmp). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2013-01-31 Jakub Jelinek PR debug/54793 * final.c (need_profile_function): New variable. (final_start_function): Drop ATTRIBUTE_UNUSED from first argument. If first of NOTE_INSN_BASIC_BLOCK or NOTE_INSN_FUNCTION_BEG is only preceded by NOTE_INSN_VAR_LOCATION or NOTE_INSN_DELETED notes, targetm.asm_out.function_prologue doesn't emit anything, HAVE_prologue and profiler should be emitted before prologue, set need_profile_function instead of emitting it. (final_scan_insn): If need_profile_function, emit profile_function on the first NOTE_INSN_BASIC_BLOCK or NOTE_INSN_FUNCTION_BEG note. --- gcc/final.c.jj 2013-01-15 17:20:37.0 +0100 +++ gcc/final.c 2013-01-25 10:31:48.848897310 +0100 @@ -200,6 +200,9 @@ rtx current_insn_predicate; /* True if printing into -fdump-final-insns= dump. */ bool final_insns_dump_p; +/* True if profile_function should be called, but hasn't been called yet. */ +static bool need_profile_function; + static int asm_insn_count (rtx); static void profile_function (FILE *); static void profile_after_prologue (FILE *); @@ -1668,13 +1671,15 @@ reemit_insn_block_notes (void) test and compare insns. */ void -final_start_function (rtx first ATTRIBUTE_UNUSED, FILE *file, +final_start_function (rtx first, FILE *file, int optimize_p ATTRIBUTE_UNUSED) { block_depth = 0; this_is_asm_operands = 0; + need_profile_function = false; + last_filename = LOCATION_FILE (prologue_location); last_linenum = LOCATION_LINE (prologue_location); last_discriminator = discriminator = 0; @@ -1695,7 +1700,41 @@ final_start_function (rtx first ATTRIBUT /* The Sun386i and perhaps other machines don't work right if the profiling code comes after the prologue.
*/ if (targetm.profile_before_prologue () && crtl->profile) -profile_function (file); +{ + if (targetm.asm_out.function_prologue + == default_function_pro_epilogue +#ifdef HAVE_prologue + && HAVE_prologue +#endif +) + { + rtx insn; + for (insn = first; insn; insn = NEXT_INSN (insn)) + if (!NOTE_P (insn)) + { + insn = NULL_RTX; + break; + } + else if (NOTE_KIND (insn) == NOTE_INSN_BASIC_BLOCK +|| NOTE_KIND (insn) == NOTE_INSN_FUNCTION_BEG) + break; + else if (NOTE_KIND (insn) == NOTE_INSN_DELETED +|| NOTE_KIND (insn) == NOTE_INSN_VAR_LOCATION) + continue; + else + { + insn = NULL_RTX; + break; + } + + if (insn) + need_profile_function = true; + else + profile_function (file); + } + else + profile_function (file); +} /* If debugging, assign block numbers to all of the blocks in this function. */ @@ -2075,6 +2114,12 @@ final_scan_insn (rtx insn, FILE *file, i break; case NOTE_INSN_BASIC_BLOCK: + if (need_profile_function) + { + profile_function (asm_out_file); + need_profile_function = false; + } + if (targetm.asm_out.unwind_emit) targetm.asm_out.unwind_emit (
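For context, a small hand-written example of the situation being fixed (compile with -pg -mfentry on x86_64):

/* With -pg -mfentry the very first instruction of f is the profiler
   call (call __fentry__), emitted before the prologue.  Previously the
   var-location notes describing 'a' and 'b' at function entry landed
   after that profiler hunk, so a debugger or systemtap probe placed on
   the first instruction saw the arguments as optimized out.  */
long
f (long a, long b)
{
  return a + b;
}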
Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
Hello, Sorry for the long delay (ref http://patchwork.ozlabs.org/patch/199397/) On 6 December 2012 20:26, Teresa Johnson wrote: > > > > On Wed, Nov 28, 2012 at 7:48 AM, Christophe Lyon > wrote: >> >> I have updated my trunk checkout, and I can confirm that eval.c now >> compiles with your patch (and the other 4 patches I added to PR55121). > > > good > >> >> >> Now, when looking at the whole Spec2k results: >> - vpr passes now (used to fail) > > > good > >> >> - gcc, parser, perlbmk bzip2 and twolf no longer build: they all fail >> with the same error from gas: >> can't resolve `.text.unlikely' {.text.unlikely section} - `.LBB171' >> {.text section} >> - gap still does not build (same error as above) >> >> I haven't looked in detail, so I may be missing an obvious patch here. > > > Finally had a chance to get back to this. I was able to reproduce the > failure using x86_64 linux with "-freorder-blocks-and-partition -g". > However, I am also getting the same failure with a pristine copy of trunk. > Can you confirm whether you were seeing any of these failures without my > patches, because I believe they are probably a limitation with function > splitting and debug info that is orthogonal to my patch. > Yes I confirm that I see these failures without your patch too; and both -freorder-blocks-and-partition and -g are present in my command-line. And now gap's integer.c fails to compile with a similar error message too. >> >> And I still observe runtime mis-behaviour on crafty, galgel, facerec and >> fma3d. > > > I'm not seeing this on x86_64, unfortunately, so it might require some > follow-on work to triage and fix. > > I'll look into the gas failure, but if someone could review this patch in > the meantime given that it does improve things considerably (at least > without -g), that would be great. > Indeed. > Thanks, > Teresa > Thanks Christophe
Fix for PR55561 race condition in libgomp
The attached patch fixes a race condition in libgomp. It is based on the suggestions by Dmitry, and I verified that it fixes the corresponding sanitizer warnings. 2012-12-30 Dmitry Vyukov PR libgomp/55561 * config/linux/wait.h (do_spin): Use atomic load for addr. * config/linux/ptrlock.c (gomp_ptrlock_get_slow): Use atomic for intptr and ptrlock. * config/linux/ptrlock.h (gomp_ptrlock_get): Use atomic load for ptrlock. Index: libgomp/config/linux/wait.h === --- libgomp/config/linux/wait.h (revision 194730) +++ libgomp/config/linux/wait.h (working copy) @@ -51,7 +51,7 @@ static inline int do_spin (int *addr, in if (__builtin_expect (gomp_managed_threads > gomp_available_cpus, 0)) count = gomp_throttled_spin_count_var; for (i = 0; i < count; i++) -if (__builtin_expect (*addr != val, 0)) +if (__builtin_expect (__atomic_load_n (addr, MEMMODEL_RELAXED) != val, 0)) return 0; else cpu_relax (); Index: libgomp/config/linux/ptrlock.c === --- libgomp/config/linux/ptrlock.c (revision 194730) +++ libgomp/config/linux/ptrlock.c (working copy) @@ -50,9 +50,9 @@ gomp_ptrlock_get_slow (gomp_ptrlock_t *p #endif do do_wait (intptr, 2); - while (*intptr == 2); + while (__atomic_load_n (intptr, MEMMODEL_RELAXED) == 2); __asm volatile ("" : : : "memory"); - return *ptrlock; + return (void *) __atomic_load_n (ptrlock, MEMMODEL_ACQUIRE); } void Index: libgomp/config/linux/ptrlock.h === --- libgomp/config/linux/ptrlock.h (revision 194730) +++ libgomp/config/linux/ptrlock.h (working copy) @@ -48,8 +48,9 @@ static inline void *gomp_ptrlock_get (go { uintptr_t oldval; - if ((uintptr_t) *ptrlock > 2) -return *ptrlock; + uintptr_t v = (uintptr_t) __atomic_load_n (ptrlock, MEMMODEL_ACQUIRE); + if (v > 2) +return (void *) v; oldval = 0; if (__atomic_compare_exchange_n (ptrlock, &oldval, 1, false,
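For readers following the memory-model reasoning: the MEMMODEL_ACQUIRE loads above are meant to pair with a release store on the publishing side (a hand-written sketch of the idiom, not the surrounding libgomp code):

static void *ptr;

void
publish (void *p)
{
  /* All stores that initialize *p become visible to an acquire reader
     before the pointer itself does.  */
  __atomic_store_n (&ptr, p, __ATOMIC_RELEASE);
}

void *
consume (void)
{
  /* Once the pointer is observed, the pointee's initialization is
     guaranteed visible; a plain load would be a data race and would
     allow the compiler/CPU to reorder the dependent reads.  */
  return __atomic_load_n (&ptr, __ATOMIC_ACQUIRE);
}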
Re: Fix for PR55561 race condition in libgomp
On Thu, Jan 31, 2013 at 02:50:51PM +, VandeVondele Joost wrote: > The attached patch fixes a race condition in libgomp. Based on the > suggestions by Dmitry, and verified that it fixes the corresponding sanitizer > warnings. > > 2012-12-30 Dmitry Vyukov Ok, but please use current date rather than 2012-12-30, and also add your name and email address below Dmitry's. Jakub
C++ PATCH for c++/56162 (pmf1.C failure on arm)
In my 56104 patch I failed to notice that in the ARM (vbit in delta) case, we need to adjust the delta even if we know we're doing a non-virtual call. Tested x86_64-pc-linux-gnu, applying to trunk. commit 6a52d34876377448990d550cb763171180f4 Author: Jason Merrill Date: Thu Jan 31 09:45:49 2013 -0500 PR c++/56162 PR c++/56104 * typeck.c (get_member_function_from_ptrfunc): Fix ptrmemfunc_vbit_in_delta case. diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c index bfac394..688c266 100644 --- a/gcc/cp/typeck.c +++ b/gcc/cp/typeck.c @@ -3176,9 +3176,7 @@ get_member_function_from_ptrfunc (tree *instance_ptrptr, tree function, e3 = pfn_from_ptrmemfunc (function); delta = delta_from_ptrmemfunc (function); idx = build1 (NOP_EXPR, vtable_index_type, e3); - if (nonvirtual) - e1 = integer_zero_node; - else switch (TARGET_PTRMEMFUNC_VBIT_LOCATION) + switch (TARGET_PTRMEMFUNC_VBIT_LOCATION) { case ptrmemfunc_vbit_in_pfn: e1 = cp_build_binary_op (input_location,
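Background on the representation involved (a hand-written sketch; the field names are illustrative): under the Itanium C++ ABI a pointer to member function is a pair, and the ARM variant moves the virtual discriminator bit from the function pointer into the delta:

struct ptrmemfunc_sketch
{
  void *ptr;   /* function address, or vtable offset when virtual    */
  long adj;    /* this-adjustment; the ARM variant keeps the vbit in
                  bit 0 here, so the real adjustment is adj >> 1 and
                  must be applied even for calls known to be
                  non-virtual, which is what the "if (nonvirtual)"
                  shortcut removed by this patch had skipped.         */
};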
C++ PATCH for c++/54410 (duplicate template bindings in DWARF)
We ended up inserting the type twice into the array of types that get template parameters added, so we added the parameters twice. Fixed by only adding it the first time; it doesn't matter whether we happen to be creating a declaration or definition DIE at this point, since when we get to EOF we'll add the bindings to the definition die (if any). Tested x86_64-pc-linux-gnu, applying to trunk. commit 1e9546939b355bd9f84e1928e1599b3a9f3bbe21 Author: Jason Merrill Date: Wed Jan 30 17:21:49 2013 -0500 PR c++/54410 * dwarf2out.c (gen_struct_or_union_type_die): Always schedule template parameters the first time. (gen_scheduled_generic_parms_dies): Check completeness here. diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 5a03280..3106dd9 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -19061,6 +19061,10 @@ gen_struct_or_union_type_die (tree type, dw_die_ref context_die, scope_die = scope_die_for (type, context_die); + /* Generate child dies for template paramaters. */ + if (!type_die && debug_info_level > DINFO_LEVEL_TERSE) +schedule_generic_params_dies_gen (type); + if (! type_die || (nested && is_cu_die (scope_die))) /* First occurrence of type or toplevel definition of nested class. */ { @@ -19078,11 +19082,6 @@ gen_struct_or_union_type_die (tree type, dw_die_ref context_die, else remove_AT (type_die, DW_AT_declaration); - /* Generate child dies for template paramaters. */ - if (debug_info_level > DINFO_LEVEL_TERSE - && COMPLETE_TYPE_P (type)) -schedule_generic_params_dies_gen (type); - /* If this type has been completed, then give it a byte_size attribute and then give a list of members. */ if (complete && !ns_decl) @@ -20592,7 +20591,8 @@ gen_scheduled_generic_parms_dies (void) return; FOR_EACH_VEC_ELT (*generic_type_instances, i, t) -gen_generic_params_dies (t); +if (COMPLETE_TYPE_P (t)) + gen_generic_params_dies (t); } diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/template-params-11.C b/gcc/testsuite/g++.dg/debug/dwarf2/template-params-11.C new file mode 100644 index 000..8000295 --- /dev/null +++ b/gcc/testsuite/g++.dg/debug/dwarf2/template-params-11.C @@ -0,0 +1,9 @@ +// PR c++/54410 +// { dg-options "-g -dA" } +// { dg-final { scan-assembler-times "DIE \\(\[^\n\]*\\) DW_TAG_template_type_param" 1 } } + +namespace N { + template <typename T> struct A { }; +} + +N::A<int> a;
[PATCH] Fix PR56157
This fixes PR56157 - when looking up operands for a vectorized SLP stmt we have to match up SLP children with operands. We do so by looking for the operand in the DEF position of the SLP child. But that gets complicated in the face of either stmt being a pattern stmt and its operands being DEFed by pattern stmts (in case this is a multiple stmt pattern) or by the scalar variant (of possibly another pattern that the SLP child points to). I gave up trying to be clever and handle all situations with a single compare, and instead just compare against both the pattern and the non-pattern def. False matches are not possible; we just must not miss a match. Bootstrapped and tested on x86_64-unknown-linux-gnu, ok? Thanks, Richard. 2013-01-31 Richard Biener PR tree-optimization/ * tree-vect-slp.c (vect_get_slp_defs): More thoroughly try to match up operand with SLP child. * gcc.dg/torture/pr56157.c: New testcase. Index: gcc/tree-vect-slp.c === *** gcc/tree-vect-slp.c (revision 195610) --- gcc/tree-vect-slp.c (working copy) *** void *** 2616,2628 vect_get_slp_defs (vec<tree> ops, slp_tree slp_node, vec<vec<tree> > *vec_oprnds, int reduc_index) { ! gimple first_stmt, first_def; int number_of_vects = 0, i; unsigned int child_index = 0; HOST_WIDE_INT lhs_size_unit, rhs_size_unit; slp_tree child = NULL; vec<tree> *vec_defs; ! tree oprnd, def_lhs; bool vectorized_defs; first_stmt = SLP_TREE_SCALAR_STMTS (slp_node)[0]; --- 2616,2628 vect_get_slp_defs (vec<tree> ops, slp_tree slp_node, vec<vec<tree> > *vec_oprnds, int reduc_index) { ! gimple first_stmt; int number_of_vects = 0, i; unsigned int child_index = 0; HOST_WIDE_INT lhs_size_unit, rhs_size_unit; slp_tree child = NULL; vec<tree> *vec_defs; ! tree oprnd; bool vectorized_defs; first_stmt = SLP_TREE_SCALAR_STMTS (slp_node)[0]; *** vect_get_slp_defs (vec<tree> ops, slp_tr *** 2638,2666 if (SLP_TREE_CHILDREN (slp_node).length () > child_index) { child = (slp_tree) SLP_TREE_CHILDREN (slp_node)[child_index]; - first_def = SLP_TREE_SCALAR_STMTS (child)[0]; ! /* In the end of a pattern sequence we have a use of the original stmt, !so we need to compare OPRND with the original def. */ ! if (is_pattern_stmt_p (vinfo_for_stmt (first_def)) ! && !STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (first_stmt)) ! && !is_pattern_stmt_p (vinfo_for_stmt (first_stmt))) ! first_def = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (first_def)); ! if (is_gimple_call (first_def)) ! def_lhs = gimple_call_lhs (first_def); ! else ! def_lhs = gimple_assign_lhs (first_def); ! ! if (operand_equal_p (oprnd, def_lhs, 0)) ! { ! /* The number of vector defs is determined by the number of ! vector statements in the node from which we get those statements. */ ! number_of_vects = SLP_TREE_NUMBER_OF_VEC_STMTS (child); ! vectorized_defs = true; child_index++; ! } } if (!vectorized_defs) --- 2638,2659 if (SLP_TREE_CHILDREN (slp_node).length () > child_index) { child = (slp_tree) SLP_TREE_CHILDREN (slp_node)[child_index]; ! /* We have to check both pattern and original def, if available. */ ! gimple first_def = SLP_TREE_SCALAR_STMTS (child)[0]; ! gimple related = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (first_def)); ! if (operand_equal_p (oprnd, gimple_get_lhs (first_def), 0) ! || (related ! && operand_equal_p (oprnd, gimple_get_lhs (related), 0))) ! { ! /* The number of vector defs is determined by the number of !vector statements in the node from which we get those statements. */ ! number_of_vects = SLP_TREE_NUMBER_OF_VEC_STMTS (child); ! vectorized_defs = true; child_index++; !
} } if (!vectorized_defs) Index: gcc/testsuite/gcc.dg/torture/pr56157.c === *** gcc/testsuite/gcc.dg/torture/pr56157.c (revision 0) --- gcc/testsuite/gcc.dg/torture/pr56157.c (working copy) *** *** 0 --- 1,25 + /* { dg-do compile } */ + /* { dg-options "-ftree-vectorize" } */ + + struct Pixel { + unsigned short r; + unsigned short g; + unsigned short b; + unsigned short a; + }; + + void fn(unsigned char * __restrict dst, const unsigned char * __restrict src) + { + unsigned x; + for(x = 0; x < 1024; x += 1) + { + struct Pixel pixel; + pixel.r = (unsigned short)(((unsigned)src[0]) * 0x
libgo patch committed: Don't allocate during backtrace
The libbacktrace interface says that the strings passed to the callback routine may disappear. As it happens, in the current implementation they will not. The Go backtrace routine needs to hang on to the strings, and I've discovered that it can't reliably allocate memory, since it might be called in the middle of an existing memory allocation. So at least for now just assume that the backtrace strings will hang around. I'll fix this if I can figure out a way to allocate memory there. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 368d10817c19 libgo/runtime/go-callers.c --- a/libgo/runtime/go-callers.c Wed Jan 30 14:00:04 2013 -0800 +++ b/libgo/runtime/go-callers.c Thu Jan 31 08:38:26 2013 -0800 @@ -43,8 +43,14 @@ loc = &arg->locbuf[arg->index]; loc->pc = pc; - loc->filename = runtime_gostring ((const byte *) filename); - loc->function = runtime_gostring ((const byte *) function); + + /* The libbacktrace library says that these strings might disappear, + but with the current implementation they won't. We can't easily + allocate memory here, so for now assume that we can save a + pointer to the strings. */ + loc->filename = runtime_gostringnocopy ((const byte *) filename); + loc->function = runtime_gostringnocopy ((const byte *) function); + loc->lineno = lineno; ++arg->index; return arg->index >= arg->max;
Re: [PATCH] Fix PR56157
On Thu, Jan 31, 2013 at 05:31:12PM +0100, Richard Biener wrote: > 2013-01-31 Richard Biener > > PR tree-optimization/ > * tree-vect-slp.c (vect_get_slp_defs): More thoroughly try to > match up operand with SLP child. > > * gcc.dg/torture/pr56157.c: New testcase. Ok, thanks. > *** gcc/testsuite/gcc.dg/torture/pr56157.c(revision 0) > --- gcc/testsuite/gcc.dg/torture/pr56157.c(working copy) I'd perhaps remove some more cruft from the testcase, /* { dg-do compile } */ /* { dg-options "-ftree-vectorize" } */ struct Pixel { unsigned short r, g, b, a; }; void fn (unsigned char * __restrict dst, const unsigned char * __restrict src) { unsigned x; for (x = 0; x < 1024; x++) { struct Pixel pixel; pixel.r = src[0] * 0xU / 0xff; pixel.g = src[1] * 0xU / 0xff; pixel.b = src[2] * 0xU / 0xff; pixel.a = src[3] * 0xU / 0xff; __builtin_memcpy (dst, &pixel, sizeof pixel); src += 4; dst += 8; } } is less obfuscated and still does the same thing, but your call. Jakub
RE: Fix for PR55561 race condition in libgomp
The updated changelog entry is below, but somebody with write access should do the commit, please. 2013-01-31 Dmitry Vyukov Joost VandeVondele PR libgomp/55561 * config/linux/wait.h (do_spin): Use atomic load for addr. * config/linux/ptrlock.c (gomp_ptrlock_get_slow): Use atomic for intptr and ptrlock. * config/linux/ptrlock.h (gomp_ptrlock_get): Use atomic load for ptrlock.
Re: Cortex-A15 vfnma/vfnms test patch
On 01/28/13 14:03, amol pise wrote: Dear Ramana, Thank You very much for the changelog and commit of my patch in gcc. I will follow the steps mentioned by you. There are no vector forms for the vfnma and vfnms instructions. A co-worker (thanks Kyryll) just pointed out to me that I'd misread the ARM-ARM when I checked this, and it looks like the test run I did had failed but I had checked the wrong work area. Looking at the output a bit more carefully now compared to what I did the other day I see only the scalar forms being generated. Also the ARM-ARM specifies that these instructions only have the scalar forms (Section 8.8.318): vfnm{a,s}.f64 Dd, Dn, Dm and vfnm{a,s}.f32 Sd, Sn, Sm. I have now reverted this patch as obvious. Sorry about the inconvenience caused. regards Ramana Thank You, Amol Pise On Mon, Jan 28, 2013 at 4:18 PM, Ramana Radhakrishnan wrote: [Taking gcc-help off this thread.] Amol, I have tested these instructions with GCC and these instructions are generated. Please review and merge this test support patch in gcc main trunk. Thanks for this patch and sorry about the delay in getting around to this. This is ok and I'll take this under the 10 line rule this time. If you intend to continue to submit patches to gcc can I ask that you start the process for copyright assignments or confirm that you have a copyright assignment on file? http://gcc.gnu.org/contribute.html#legal If you don't, send an email to g...@gcc.gnu.org with a request for copyright assignment papers and a maintainer will send you these. http://gcc.gnu.org/contribute.html is a good summary of the process related to contributing patches to GCC in general. Please do read that and follow up on g...@gcc.gnu.org if you have any more questions. And finally don't forget to add a changelog to your patches as documented in links from the above mentioned page. Since this is your first time I've added the following Changelog entry for your patch and applied it. regards Ramana 2013-01-27 Amol Pise * gcc.target/arm/neon-vfnms-1.c: New test. * gcc.target/arm/neon-vfnma-1.c: New test.
libgo patch committed: Block signals when creating new thread
This patch to libgo disables signals while creating a new thread. Otherwise if a signal comes in between the time the thread starts and the time the thread initializes its m and g TLS variables, the program will crash with an error "signal received on thread not created by Go." Signals are already enabled by the thread after it has been initialized. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 4716407240c2 libgo/runtime/proc.c --- a/libgo/runtime/proc.c Thu Jan 31 08:39:01 2013 -0800 +++ b/libgo/runtime/proc.c Thu Jan 31 09:24:12 2013 -0800 @@ -3,6 +3,7 @@ // license that can be found in the LICENSE file. #include +#include #include #include #include @@ -1217,6 +1218,9 @@ pthread_attr_t attr; pthread_t tid; size_t stacksize; + sigset_t clear; + sigset_t old; + int ret; #if 0 static const Type *mtype; // The Go type M @@ -1249,7 +1253,15 @@ if(pthread_attr_setstacksize(&attr, stacksize) != 0) runtime_throw("pthread_attr_setstacksize"); - if(pthread_create(&tid, &attr, runtime_mstart, mp) != 0) + // Block signals during pthread_create so that the new thread + // starts with signals disabled. It will enable them in minit. + sigfillset(&clear); + sigemptyset(&old); + sigprocmask(SIG_BLOCK, &clear, &old); + ret = pthread_create(&tid, &attr, runtime_mstart, mp); + sigprocmask(SIG_SETMASK, &old, nil); + + if (ret != 0) runtime_throw("pthread_create"); return mp;
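The re-enable half mentioned above lives in the new thread's initialization; a hand-written sketch of that side (the actual minit code is not quoted here):

#include <signal.h>
#include <pthread.h>

static void
minit_sketch (void)
{
  sigset_t sigs;

  /* The thread inherited a fully blocked signal mask from the
     sigprocmask window around pthread_create above.  Now that the m
     and g TLS variables are set up and a signal handler could identify
     this as a Go-created thread, unblock everything again.  */
  sigemptyset (&sigs);
  pthread_sigmask (SIG_SETMASK, &sigs, NULL);
}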
libbacktrace patch committed: Fix threaded race
Using libbacktrace more extensively from Go has revealed a race condition in libbacktrace. I cleverly save some memory by reusing a vector in the DWARF data structure. However, that is fairly hard to do if multiple threads are simultaneously getting a backtrace. This patch changes the code to always use a new vector when threaded. This will introduce additional memory fragmentation, but that seems impossible to avoid. Bootstrapped and ran libbacktrace and Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian 2013-01-31 Ian Lance Taylor * dwarf.c (read_function_info): Permit fvec parameter to be NULL. (dwarf_lookup_pc): Don't use ddata->fvec if threaded. Index: dwarf.c === --- dwarf.c (revision 195478) +++ dwarf.c (working copy) @@ -2473,10 +2473,21 @@ read_function_info (struct backtrace_sta struct function_addrs **ret_addrs, size_t *ret_addrs_count) { + struct function_vector lvec; + struct function_vector *pfvec; struct dwarf_buf unit_buf; struct function_addrs *addrs; size_t addrs_count; + /* Use FVEC if it is not NULL. Otherwise use our own vector. */ + if (fvec != NULL) +pfvec = fvec; + else +{ + memset (&lvec, 0, sizeof lvec); + pfvec = &lvec; +} + unit_buf.name = ".debug_info"; unit_buf.start = ddata->dwarf_info; unit_buf.buf = u->unit_data; @@ -2489,20 +2500,28 @@ read_function_info (struct backtrace_sta while (unit_buf.left > 0) { if (!read_function_entry (state, ddata, u, 0, &unit_buf, lhdr, -error_callback, data, fvec)) +error_callback, data, pfvec)) return; } - if (fvec->count == 0) + if (pfvec->count == 0) return; - addrs = (struct function_addrs *) fvec->vec.base; - addrs_count = fvec->count; + addrs = (struct function_addrs *) pfvec->vec.base; + addrs_count = pfvec->count; - /* Finish this list of addresses, but leave the remaining space in - the vector available for the next function unit. */ - backtrace_vector_finish (state, &fvec->vec); - fvec->count = 0; + if (fvec == NULL) +{ + if (!backtrace_vector_release (state, &lvec.vec, error_callback, data)) + return; +} + else +{ + /* Finish this list of addresses, but leave the remaining space in + the vector available for the next function unit. */ + backtrace_vector_finish (state, &fvec->vec); + fvec->count = 0; +} qsort (addrs, addrs_count, sizeof (struct function_addrs), function_addrs_compare); @@ -2663,8 +2682,16 @@ dwarf_lookup_pc (struct backtrace_state if (read_line_info (state, ddata, error_callback, data, entry->u, &lhdr, &lines, &count)) { + struct function_vector *pfvec; + + /* If not threaded, reuse DDATA->FVEC for better memory + consumption. */ + if (state->threaded) + pfvec = NULL; + else + pfvec = &ddata->fvec; read_function_info (state, ddata, &lhdr, error_callback, data, - entry->u, &ddata->fvec, &function_addrs, + entry->u, pfvec, &function_addrs, &function_addrs_count); free_line_header (state, &lhdr, error_callback, data); new_data = 1;
Commit: V850: Add support for the E3V5 architecture variant
Hi Guys, I am applying the patch below to add support for the E3V5 architecture variant to the V850 backend. This patch was originally developed by Renesas, but it is only recently that their copyright assignment was completed so that it could be contributed. I realize that the current mainline sources are only accepting regression fixes at the moment, but I asked for and received permission to apply the patch, with the caveat that: "if you break your arch we'll not wait for you to fix it ;)" The patch does actually fix a few problems with the other V850 architecture variants, such that the gcc testsuite shows 31 fewer unexpected failures for each multilib. Support for the e3v5 architecture variant is already in the binutils and simulator sources. Tested with no regressions (and several improvements) on a v850e-elf toolchain. Cheers Nick gcc/ChangeLog 2013-01-31 Hiroyuki Ono Nick Clifton * config/v850/constraints.md (Q): Define as a memory constraint. * config/v850/predicates.md (label_ref_operand): New predicate. (e3v5_shift_operand): New predicate. (ior_operator): New predicate. * config/v850/t-v850: Add e3v5 multilib. * config/v850/v850-protos.h (v850_adjust_insn_length): Prototype. (v850_gen_movdi): Prototype. * config/v850/v850.c: Add support for e3v5 architecture. Rename all uses of TARGET_V850E || TARGET_V850E2_ALL to TARGET_V850E_UP. (construct_save_jarl): Add e3v5 long JARL support. (v850_adjust_insn_length): New function. Adjust length of call insns when using e3v5 instructions. (v850_gen_movdi): New function: Generate instructions to move a DImode value. * config/v850/v850.h (TARGET_CPU_v850e3v5): Define. (CPP_SPEC): Define __v850e3v5__ as appropriate. (TARGET_USE_FPU): Enable for e3v5. (CONST_OK_FOR_W): New macro. (ADJUST_INSN_LENGTH): Define. * config/v850/v850.md (UNSPEC_LOOP): Define. (attr cpu): Add v850e3v5. Rename all uses of TARGET_V850E2 to TARGET_V850E2V3_UP. (movdi): New pattern. (movdi_internal): New pattern. (cbranchsf4): Conditionalize on TARGET_USE_FPU. (cbranchdf4): Conditionalize on TARGET_USE_FPU. (cstoresf4): Likewise. (cstoredf4): Likewise. (insv): New pattern. (rotlsi3_a): New pattern. (rotlsi3_b): New pattern. (rotlsi3_v850e3v5): New pattern. (doloop_begin): New pattern. (fix_loop_counter): New pattern. (doloop_end): New pattern. (branch_normal): Add e3v5 long branch support. (branch_invert): Likewise. (branch_z_normal): Likewise. (branch_z_invert): Likewise. (branch_nz_normal): Likewise. (branch_nz_invert): Likewise. (call_internal_short): Add e3v5 register-indirect JARL support. (call_internal_long): Likewise. (call_value_internal_short): Likewise. (call_value_internal_long): Likewise. * config/v850/v850.opt (mv850e3v5, mv850e2v4): New options. (mloop): New option. * config.gcc: Add support for configuring v850e3v5 target. * doc/invoke.texi: Document new v850 specific command line options. libgcc/ChangeLog 2013-01-31 Nick Clifton * config/v850/lib1funcs.S: Add support for e3v5 architecture variant. v850e3v5.gcc.patch.xz Description: application/xz
Merge from trunk to gccgo branch
I've merged trunk revision 195620 to the gccgo branch. Ian
[PATCH] GCC 4.9, powerpc, add more debugging to -mdebug=reg and -mdebug=addr
None of these changes affect the code, but they provide some more information that I've found useful when using the -mdebug=reg and -mdebug=addr options. When GCC 4.9 opens up, can I install these patches in the source tree? 2013-01-31 Michael Meissner * config/rs6000/rs6000.c (rs6000_debug_reg_global): Print MODES_TIEABLE_P for selected modes. Print the numerical value of the various virtual registers. Use GPR/FPR first/last values, instead of hard coding the register numbers. Print which modes have reload functions registered. (rs6000_option_override_internal): If -mdebug=reg, trace the options settings before/after setting cpu, target and subtarget settings. (rs6000_secondary_reload_trace): Improve the RTL dump for -mdebug=addr and for secondary reload failures in rs6000_secondary_reload_inner. (rs6000_secondary_reload_fail): Likewise. (rs6000_secondary_reload_inner): Likewise. * config/rs6000/rs6000.md (FIRST_GPR_REGNO): Add convenience macros for first/last GPR and FPR registers. (LAST_GPR_REGNO): Likewise. (FIRST_FPR_REGNO): Likewise. (LAST_FPR_REGNO): Likewise. -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899 Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 195592) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -1693,6 +1693,7 @@ rs6000_debug_reg_global (void) static const char *const tf[2] = { "false", "true" }; const char *nl = (const char *)0; int m; + size_t m1, m2, v; char costly_num[20]; char nop_num[20]; char flags_buffer[40]; @@ -1713,10 +1714,66 @@ rs6000_debug_reg_global (void) "other" }; - fprintf (stderr, "Register information: (last virtual reg = %d)\n", - LAST_VIRTUAL_REGISTER); - rs6000_debug_reg_print (0, 31, "gr"); - rs6000_debug_reg_print (32, 63, "fp"); + /* Modes we want tieable information on. */ + static const enum machine_mode print_tieable_modes[] = { +QImode, +HImode, +SImode, +DImode, +TImode, +SFmode, +DFmode, +TFmode, +SDmode, +DDmode, +TDmode, +V8QImode, +V4HImode, +V2SImode, +V16QImode, +V8HImode, +V4SImode, +V2DImode, +V32QImode, +V16HImode, +V8SImode, +V4DImode, +V2SFmode, +V4SFmode, +V2DFmode, +V8SFmode, +V4DFmode, +CCmode, +CCUNSmode, +CCEQmode, + }; + + /* Virtual regs we are interested in. */ + const static struct { +int regno; /* register number. */ +const char *name; /* register name.
*/ + } virtual_regs[] = { +{ STACK_POINTER_REGNUM,"stack pointer:" }, +{ TOC_REGNUM, "toc: " }, +{ STATIC_CHAIN_REGNUM, "static chain: " }, +{ RS6000_PIC_OFFSET_TABLE_REGNUM, "pic offset: " }, +{ HARD_FRAME_POINTER_REGNUM, "hard frame: " }, +{ ARG_POINTER_REGNUM, "arg pointer: " }, +{ FRAME_POINTER_REGNUM,"frame pointer:" }, +{ FIRST_PSEUDO_REGISTER, "first pseudo: " }, +{ FIRST_VIRTUAL_REGISTER, "first virtual:" }, +{ VIRTUAL_INCOMING_ARGS_REGNUM,"incoming_args:" }, +{ VIRTUAL_STACK_VARS_REGNUM, "stack_vars: " }, +{ VIRTUAL_STACK_DYNAMIC_REGNUM,"stack_dynamic:" }, +{ VIRTUAL_OUTGOING_ARGS_REGNUM,"outgoing_args:" }, +{ VIRTUAL_CFA_REGNUM, "cfa (frame): " }, +{ VIRTUAL_PREFERRED_STACK_BOUNDARY_REGNUM, "stack boundry:" }, +{ LAST_VIRTUAL_REGISTER, "last virtual: " }, + }; + + fputs ("\nHard register information:\n", stderr); + rs6000_debug_reg_print (FIRST_GPR_REGNO, LAST_GPR_REGNO, "gr"); + rs6000_debug_reg_print (FIRST_FPR_REGNO, LAST_FPR_REGNO, "fp"); rs6000_debug_reg_print (FIRST_ALTIVEC_REGNO, LAST_ALTIVEC_REGNO, "vs"); @@ -1729,6 +1786,10 @@ rs6000_debug_reg_global (void) rs6000_debug_reg_print (SPE_ACC_REGNO, SPE_ACC_REGNO, "spe_a"); rs6000_debug_reg_print (SPEFSCR_REGNO, SPEFSCR_REGNO, "spe_f"); + fputs ("\nVirtual/stack/frame registers:\n", stderr); + for (v = 0; v < ARRAY_SIZE (virtual_regs); v++) +fprintf (stderr, "%s regno = %3d\n", virtual_regs[v].name, virtual_regs[v].regno); + fprintf (stderr, "\n" "d reg_class = %s\n" @@ -1756,18 +1817,53 @@ rs6000_debug_reg_global (void) reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wz]]); for (m = 0; m < NUM_MACHINE_MODES; ++m) -if (rs6000_vector_unit[m] || rs6000_vector_mem[m]) +if (rs6000_vector_unit[m] || rs6000_vector_mem[m] +
libgo patch committed: Remove confusion about split-stack backtrace
In an earlier patch I tried to remove confusion about split-stack functions in backtraces, but it still doesn't work in conjunction with skipping functions. Both runtime.Callers and backtrace_full can skip stack frames. backtrace_full will count split-stack frames in the skip count, but runtime.Callers should not. In order to keep this consistent, this patch moves all the skip count handling to runtime.Callers. Bootstrapped and ran Go testsuite in x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r c205459490e5 libgo/runtime/go-callers.c --- a/libgo/runtime/go-callers.c Thu Jan 31 09:24:55 2013 -0800 +++ b/libgo/runtime/go-callers.c Thu Jan 31 11:42:48 2013 -0800 @@ -16,6 +16,7 @@ struct callers_data { Location *locbuf; + int skip; int index; int max; }; @@ -41,6 +42,12 @@ return 0; } + if (arg->skip > 0) +{ + --arg->skip; + return 0; +} + loc = &arg->locbuf[arg->index]; loc->pc = pc; @@ -75,10 +82,11 @@ struct callers_data data; data.locbuf = locbuf; + data.skip = skip + 1; data.index = 0; data.max = m; - backtrace_full (__go_get_backtrace_state (), skip + 1, callback, - error_callback, &data); + backtrace_full (__go_get_backtrace_state (), 0, callback, error_callback, + &data); return data.index; }
Merged trunk to gccgo branch
Continuing my quest for some really stable gccgo sources, I've merged trunk revision 195627 to the gccgo branch. Ian
Re: Commit: V850: Add support for the E3V5 architecture variant
Hi Nick, the web page update required a tweak: the closing tag goes at the end of the entry. Committed. Gerald Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v retrieving revision 1.91 diff -u -3 -p -r1.91 changes.html --- changes.html 31 Jan 2013 18:34:10 - 1.91 +++ changes.html 31 Jan 2013 20:51:10 - @@ -698,9 +698,9 @@ B b(42); // OK This target now supports the E3V5 architecture via the use -of the new -mv850e3v5 command-line option. It also has -experimental support for the e3v5 LOOP instruction which can be -enabled via the new -mloop command-line option. +of the new -mv850e3v5 command-line option. It also has +experimental support for the e3v5 LOOP instruction which +can be enabled via the new -mloop command-line option. XStormy16
Re: [PATCH] testsuite tcl portability fix - avoid lreverse
On Jan 30, 2013, at 12:18 AM, Jakub Jelinek wrote: > I've noticed > ERROR: (DejaGnu) proc "lreverse {{ASAN_OPTIONS 0}}" does not exist. > > errors when regtesting gcc 4.8 on RHEL 5, it seems lreverse has been added > to tcl only in version 8.5. > Fixed thusly, ok for trunk? Ok.
libgo patch committed: Recognize morestack.S if no function name
When using an old assembler it's possible to get debug info that says that a PC is in morestack.S but does not have the function name. This patch changes the Go backtrace code to recognize this case. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 388d41792584 libgo/runtime/go-callers.c --- a/libgo/runtime/go-callers.c Thu Jan 31 11:43:44 2013 -0800 +++ b/libgo/runtime/go-callers.c Thu Jan 31 15:11:01 2013 -0800 @@ -34,13 +34,24 @@ /* Skip split stack functions. */ if (function != NULL) { - const char *p = function; + const char *p; + p = function; if (__builtin_strncmp (p, "___", 3) == 0) ++p; if (__builtin_strncmp (p, "__morestack_", 12) == 0) return 0; } + else if (filename != NULL) +{ + const char *p; + + p = strrchr (filename, '/'); + if (p == NULL) + p = filename; + if (__builtin_strncmp (p, "morestack.S", 11) == 0) + return 0; +} if (arg->skip > 0) {
Re: patch to fix PR56144
On Wed, Jan 30, 2013 at 6:24 PM, Vladimir Makarov wrote: > The following patch fixes > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56144 > > The patch was successfully bootstrapped and tested on x86/x86-64. Hello Vlad, Can you please put this patch on the lra-branch too, so that the auto-testers can pick it up? Ciao! Steven
Re: [PATCH] Vtable pointer verification (corruption/attach detection -- new feature
On Wed, Jan 30, 2013 at 2:09 AM, Florian Weimer wrote: > On 11/01/2012 09:07 PM, Caroline Tice wrote: >> We have been developing a new security hardening feature for GCC that >> is designed to detect and handle (during program execution) when a >> vtable pointer that is about to be used for a virtual function call is >> not a valid vtable pointer for that call (i.e. it has become >> corrupted, possibly due to a hacker attack). We gave a presentation >> on this work at the Gnu Tools Cauldron in Prague last July. We now >> have the implementation fully working and are submitting this patch >> for review. We would like to get this into the next release of GCC if >> possible. > > > Thanks for posting this collection of interesting patches. > > As far as I understand it, this changes ABI in the sense that every object > file needs to contain the constructors that register the vtables, otherwise > verification will later fail. Yes. > If this data could be emitted in a > declarative fashion, it might be possible to emit it by default, in a > separate ELF section. This way, it is always there when needed, and it > wouldn't have any performance impact if not used. That might be possible; it would need to be carefully organized though. It is not enough to have a list of all vtable addresses; we need to know, for each virtual class (where by virtual class I mean a class for which a vtable might be generated) the set of vtables that it is legal for an object of that class to point to (i.e. for a base class, it can point to its own vtable or to the vtable of any of its descendants in the inheritance hierarchy). > > I didn't look at the actual permitted-vtable set lookup in detail. How > expensive is it? The setup is expected to be somewhat expensive, although we are trying to make it as efficient as possible. We are in the process of gathering performance data. > C++ virtual method calls are efficient, but they have > their problems. The vtable needs relocation (contributing to startup cost), > and we have the fragile base class problem (which severely constrains the > evolution of separately-compiled base classes). Assuming that we need the > vtable check and it has non-trivial cost, it might make sense to revamp C++ > virtual method dispatch altogether, addressing both security and modularity > issues. That might be best, but we don't have the authority to unilaterally make a change of this magnitude to the standard. ;-) > > (Yes, I understand these two paragraphs go off in entirely different > directions. 8-) > Let me know if you have any more questions, or if you would like any further clarifications. -- Caroline Tice cmt...@google.com > -- > Florian Weimer / Red Hat Product Security Team
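Conceptually the runtime check works as below (a hand-written model only; the real libvtv verification function and its data structures are not shown in this thread):

#include <stdlib.h>

/* Model: each polymorphic static type has a registered set of vtable
   addresses that objects of that type may legally point to (its own
   vtable plus those of its descendants).  Every virtual call is
   preceded by a check of the object's vtable pointer against that
   set.  */
static void *
verify_vtable_pointer (void *const *valid_set, size_t n, void *vptr)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (valid_set[i] == vptr)
      return vptr;              /* legal vtable for this static type */
  abort ();                     /* corrupted or attacker-controlled */
}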
gccgo patch committed: Don't emit methods for identical unnamed structs
When a source file has multiple identical unnamed structs, and those structs have methods, gccgo would emit multiple copies of the methods. That would be bad because each copy would have the same name, causing an assembler error about multiply defined symbols. This patch fixes the problem. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 73fde5ee3282 go/types.cc --- a/go/types.cc Thu Jan 31 15:11:34 2013 -0800 +++ b/go/types.cc Thu Jan 31 16:20:32 2013 -0800 @@ -4170,6 +4170,11 @@ // Class Struct_type. +// A hash table used to find identical unnamed structs so that they +// share method tables. + +Struct_type::Identical_structs Struct_type::identical_structs; + // Traversal. int @@ -4596,6 +4601,21 @@ { if (this->all_methods_ != NULL) return; + + // It is possible to have multiple identical structs that have + // methods. We want them to share method tables. Otherwise we will + // emit identical methods more than once, which is bad since they + // will even have the same names. + std::pair<Identical_structs::iterator, bool> ins = +Struct_type::identical_structs.insert(std::make_pair(this, this)); + if (!ins.second) +{ + // An identical struct was already entered into the hash table. + // Note that finalize_methods is, fortunately, not recursive. + this->all_methods_ = ins.first->second->all_methods_; + return; +} + Type::finalize_methods(gogo, this, this->location_, &this->all_methods_); } diff -r 73fde5ee3282 go/types.h --- a/go/types.h Thu Jan 31 15:11:34 2013 -0800 +++ b/go/types.h Thu Jan 31 16:20:32 2013 -0800 @@ -2184,6 +2184,12 @@ do_export(Export*) const; private: + // Used to merge method sets of identical unnamed structs. + typedef Unordered_map_hash(Struct_type*, Struct_type*, Type_hash_identical, + Type_identical) Identical_structs; + + static Identical_structs identical_structs; + // Used to avoid infinite loops in field_reference_depth. struct Saw_named_type {
Re: [PATCH] Vtable pointer verification, C++ front end changes (patch 1 of 3)
On Wed, Jan 30, 2013 at 9:26 AM, Jason Merrill wrote: > I'm also touching on the middle-end and library changes, but would > appreciate a more thorough review from others in those areas. > > On 01/23/2013 05:25 PM, Caroline Tice wrote: >> >> Index: gcc/cp/g++spec.c > > > Changes to g++spec.c only affect the g++ driver, not the gcc driver. Are you > sure this is what you want? Can't you handle this stuff directly in the > specs like address sanitizer does? > >> @@ -17954,6 +17954,10 @@ mark_class_instantiated (tree t, int ext >>if (! extern_p) >> { >>CLASSTYPE_DEBUG_REQUESTED (t) = 1; >> + >> + if (flag_vtable_verify) >> +vtv_save_class_info (t); > > Why do you need this here as well as in finish_struct_1? > If we don't have this in both places, then we miss getting vtable pointers for instantiated templates. >> -static tree >> +tree >> get_mangled_id (tree decl) > > > It doesn't look like you call get_mangled_id anywhere, so I think you don't > need this change anymore. You are right; this is leftover from an earlier version where we were calling this function directly; I'll remove the change. > >>finalize_compilation_unit (); >> >> + if (flag_vtable_verify) >> +{ >> + /* Generate the special constructor initialization function that >> + calls __VLTRegisterPairs, and give it a very high initialization >> + priority. */ >> + vtv_generate_init_routine (); >> +} > > > Why do you want to do this after cgraph finalization? Is it so that you > know which vtables are really being emitted? Please explain in the comment. Exactly. We need to know which vtables are really emitted before we output __VLTRegisterPair calls; otherwise we got in trouble emitting calls for vtables that weren't output, or causing vtables to be output when otherwise they wouldn't be. > >> + if (flag_vtable_verify >> + && strstr (IDENTIFIER_POINTER (DECL_NAME (fn)), ".vtable")) >> +return fn; > > > How would this function end up with .vtable at the end of its name? > Whoops. This is leftover from older code when we modified start_objects and added that to the constructor initialization function. I will change this. >> + if (TREE_CHAIN (base_class)) >> +class_type_decl = TREE_CHAIN (base_class); >> + else >> +class_type_decl = TYPE_NAME (base_class); >> + >> + /* Temporarily remove any qualifiers on type. */ >> + type_decl_type = TREE_TYPE (class_type_decl); >> + save_quals = TYPE_QUALS (type_decl_type); >> + reset_type_qualifiers (null_quals, type_decl_type); >> + >> + base_id = DECL_ASSEMBLER_NAME (TREE_CHAIN (base_class)); > > > I think you want TYPE_LINKAGE_IDENTIFIER here. > I don't know the difference between DECL_ASSEMBLER_NAME and TYPE_LINKAGE_IDENTIFIER. We are just trying to get the mangled name for the class. > I don't understand what the qualifier business is trying to accomplish, > especially since you never use type_decl_type. You do this in several > places, but it should never be necessary; classes don't have any qualifiers. > We used to not have the "qualifier business", assuming that classes did not have any type qualifiers. This turned out not to be a true assumption. Occasionally we were getting a case where a class had a "const" qualifier attached to it *sometimes*. We used the mangled name to look it up in our hashtable, and it didn't find the entry for the class (which had been put in without the const qualifier). So we temporarily remove any type qualifiers before getting the mangled name; then re-apply whatever qualifiers were there before. 
The code actually works because the type_decl_type we change the qualifiers on is a subtree of the class_type_decl, which is a subtree of the base_class that we get the mangled name from. (We have tested this code; it works and fixes our problem). >> + if (vtbl_map_node_registration_find (base_vtable_map_node, vtable_decl, >> + offset)) >> +return true; >> + >> + vtbl_map_node_registration_insert (base_vtable_map_node, vtable_decl, >> +offset); > > > Here you're doing two hash table lookups when one would be enough. > As written the insert function doesn't return anything to let you know whether the item was already there or not, which we need to know (we use the results here to avoid generating redundant calls to __VLTRegisterPair). I suppose we could modify the insert function to return a boolean indicating if the item was already in the hashtable, and then we could get by with just one call here... > >> 2). A linked list of all >> + vtbl_map_nodes ("vtbl_map_nodes") for easy iteration through all of >> + them; and 3). An array of vtbl_map_nodes, where the array index >> + corresponds to the unique id of the vtbl_map_node, which gives us >> + an easy way to use bitmaps to represent and find the vtable map >> + nodes. > > > What's hard about iteration over
Merge trunk to gccgo branch
It's true, I've merged trunk to gccgo branch again, in this case revision 195638. Ian
libgo patch committed: Correct test for morestack.S
I am an idiot. (Hope it is right this time.) Ian diff -r 686169e30cff libgo/runtime/go-callers.c --- a/libgo/runtime/go-callers.c Thu Jan 31 16:22:11 2013 -0800 +++ b/libgo/runtime/go-callers.c Thu Jan 31 21:42:52 2013 -0800 @@ -49,7 +49,7 @@ p = strrchr (filename, '/'); if (p == NULL) p = filename; - if (__builtin_strncmp (p, "morestack.S", 11) == 0) + if (__builtin_strncmp (p, "/morestack.S", 12) == 0) return 0; }
Merged trunk to gccgo branch
I merged trunk revision 195640 to the gccgo branch. Ian