Re: [Patch,AVR]: Fix PR56263
2013/3/11 Georg-Johann Lay :
> This patch implements a new warning option, -Waddr-space-convert, that
> warns about conversions to a non-containing address space.
>
> Address spaces are implemented in such a way that each address space is
> contained in each other space so that casting is possible, e.g. in code like
>
> char read_c (bool in_flash, const char *str)
> {
>     if (in_flash)
>         return * (const __flash char*) str;
>     else
>         return *str;
> }
>
> However, there is no warning about implicit or explicit address space
> conversions, which makes it hard to port progmem code to address space code.
>
> If an address space qualifier is missing, e.g. when calling a function that is
> not qualified correctly, this would result in wrong code (in contrast to
> pgm_read, which works no matter how the address is qualified).
>
> There is still some work to do to get more precise warnings, and this is just a
> first take at implementing PR56263; see the FIXME in the source.
>
> The details can be worked out later, e.g. for 4.8.1.
>
>
> Ok for trunk?
>
>
> 	PR target/56263
> 	* config/avr/avr.c (TARGET_CONVERT_TO_TYPE): Define to...
> 	(avr_convert_to_type): ...this new static function.
> 	* config/avr/avr.opt (-Waddr-space-convert): New C option.
> 	* doc/invoke.texi (AVR Options): Document it.

Approved.

Denis.
Re: extend fwprop optimization
Thanks for the helpful comments! I have some replies inlined.

Regards,
Wei.

On Mon, Mar 11, 2013 at 12:52 PM, Steven Bosscher wrote:
> On Mon, Mar 11, 2013 at 6:52 AM, Wei Mi wrote:
>> This is the fwprop extension patch which is put in order. Regression
>> test and bootstrap pass. Please help to review its rationality. The
>> following is a brief description of what I have done in the patch.
>>
>> In order to make fwprop more effective in rtl optimization, we extend
>> it to handle general expressions instead of the three cases listed in
>> the head comment in fwprop.c. The major changes include a) We need to
>> check propagation correctness for src exprs of def which contain mem
>> references. Previous fwprop for the three cases above doesn't have the
>> problem. b) We need a better cost model because the benefit is usually
>> not so apparent as in the three cases above.
>>
>> For a general fwprop problem, there are two possible sources where
>> benefit comes from. The first is that the new use insn after propagation
>> and simplification may have lower cost than itself before propagation,
>> or propagation may create a new insn that could be split or
>> peephole optimized later and get a lower cost. The second is that if
>> all the uses are replaced with the src of the def insn, the def insn
>> could be deleted.
>>
>> So instead of checking each def-use pair independently, we use the DU chain
>> to track all the uses for a def. For each def-use pair, we attempt the
>> propagation, record the change candidate in the changes[] array, but we
>> wait to confirm the changes until all the pairs with the same def are
>> iterated. The changes confirmation is done in the func
>> confirm_change_group_by_cost. We only do this for fwprop. For
>> fwprop_addr, the benefit of each change is ensured by
>> propagation_rtx_1 using should_replace_address, so we just confirm all
>> the changes without checking benefit again.
>
> Hello Wei Mi,
>
> So IIUC, in essence you are doing:
>
> main:
>   FOR_EACH_BB:
>     FOR_BB_INSNS, non-debug insns only:
>       for each df_ref DEF operand on insn:
>         iterate_def_uses
>
> iterate_def_uses:
>   for each UD chain from DEF to USE(i):
>     forward_propagate_into
>   confirm changes by total benefit
>
> I still like the idea, but there are also still a few "design issues"
> to resolve.
>
> Some of the same comments as before apply: Do you really, really,
> reallyreally have to go so low-level as to insn splitting, peephole
> optimizations, and even register allocation, to get the cost right?
> That will almost certainly not be acceptable, and I for one would
> oppose such a change. It's IMHO a violation of proper engineering when
> your medium-to-high level code transformations have to do that. If you
> have strong reasons for your approach, it'd be helpful if you can
> explain them so that we can together look for a less intrusive
> solution (e.g. splitting earlier, adjusting the cost model, etc.).
>

For the motivational case, I need insn splitting to get the cost
right. Insn splitting is not very intrusive. All I need is to call the
split_insns func. I think split_insns is just a pattern matching func
like recog(), which is called in many places. Peephole is not
necessary (I added it in order to find as many opportunities as
possible, but from my trace analysis, it doesn't help much). Calling
peephole2_insn() is indeed intrusive: because peephole assumes reg
allocation is completed, I have to insert the ugly workaround below.
Peephole also requires setting the DF_LR_RUN_DCE flag and some
initialization of the peep2_insn_data array.

So how about keeping split_insns and removing peephole in the cost
estimation func?

> So things like:
>> +  /* we may call peephole2_insns in fwprop phase to estimate how
>> +     peephole will affect the cost of the insn transformed by fwprop.
>> +     fwprop is done before ira phase. In that case, we simply return
>> +     a new pseudo register.  */
>> +  if (!strncmp (current_pass->name, "fwprop", 6))
>> +    return gen_reg_rtx (mode);
>
> and
>
>> Index: config/i386/i386.c
>> ===
>> --- config/i386/i386.c (revision 196270)
>> +++ config/i386/i386.c (working copy)
>> @@ -15901,8 +15901,14 @@ ix86_expand_clear (rtx dest)
>>  {
>>    rtx tmp;
>>
>> -  /* We play register width games, which are only valid after reload.  */
>> -  gcc_assert (reload_completed);
>> +  /* We play register width games, which are only valid after reload.
>> +     An exception: fwprop call peephole to estimate the change benefit,
>> +     and peephole will call this func. That is before reload complete.
>> +     It will not bring any problem because the peephole2_insns call is
>> +     only used for cost estimation in fwprop, and its change will be
>> +     abandoned immediately after the cost estimation.  */
>> +  if (strncmp (current_pass->name, "fwprop", 6))
>> +    gcc_assert (reload_completed);
>
> are IMHO not OK.
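
The "confirm changes by total benefit" step discussed above can be sketched roughly as follows. This is only an illustration of the idea as described in the thread, not the code from the patch: the struct, the function name confirm_group_by_cost and the integer costs are all hypothetical stand-ins, and the real confirm_change_group_by_cost may weigh things differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One attempted propagation of a def into a single use insn.
   All names here are hypothetical; they do not come from the patch.  */
struct use_change
{
  bool valid;    /* did the propagated use insn still match a pattern?  */
  int old_cost;  /* cost of the use insn before propagation */
  int new_cost;  /* cost of the use insn after propagation */
};

/* Decide whether the whole group of changes for one def pays off.
   Benefit source 1: cheaper use insns.  Benefit source 2: if every use
   was replaced, the def insn becomes dead and its cost is saved.  */
static bool
confirm_group_by_cost (const struct use_change *changes, size_t n,
                       int def_cost)
{
  int benefit = 0;
  bool all_replaced = n > 0;

  assert (changes != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    {
      if (!changes[i].valid)
        {
          /* This use keeps reading the def, so the def must stay.  */
          all_replaced = false;
          continue;
        }
      benefit += changes[i].old_cost - changes[i].new_cost;
    }

  if (all_replaced)
    benefit += def_cost;

  return benefit > 0;
}
```

Under these assumptions a group whose individual uses each get slightly more expensive can still be accepted when deleting the def outweighs the loss, which is the reason for deferring the decision until all uses of a def have been visited.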
Re: [PATCH][1/n] tree LIM TLC
On Mon, 11 Mar 2013, Richard Biener wrote: > > This is a series of patches applying some TLC to LIM. This first > patch gets rid of the remains of create_vop_ref_mapping and > alongside cleans up how we record references. Actually I rolled in some more changes into this patch, bootstrapped and tested on x86_64-unknown-linux-gnu, queued for 4.9. Richard. 2013-03-11 Richard Biener * tree-ssa-loop-im.c (record_mem_ref_loc): Record ref as stored here. (gather_mem_refs_stmt): Instead of here and in create_vop_ref_mapping_loop. (gather_mem_refs_in_loops): Fold into ... (analyze_memory_references): ... this. Move data structure init to tree_ssa_lim_initialize. Propagate stored refs info as well. (create_vop_ref_mapping_loop): Remove. (create_vop_ref_mapping): Likewise. (tree_ssa_lim_initialize): Initialize ref bitmaps here. Split always-executed computation into ... (fill_always_executed_in): ... this. Rename original to ... (fill_always_executed_in_1): ... this. (tree_ssa_lim): Call fill_always_executed_in here. Index: trunk/gcc/tree-ssa-loop-im.c === *** trunk.orig/gcc/tree-ssa-loop-im.c 2013-03-11 12:38:43.0 +0100 --- trunk/gcc/tree-ssa-loop-im.c2013-03-11 14:12:00.143773587 +0100 *** mem_ref_locs_alloc (void) *** 1518,1528 description REF. The reference occurs in statement STMT. */ static void ! record_mem_ref_loc (mem_ref_p ref, struct loop *loop, gimple stmt, tree *loc) { mem_ref_loc_p aref = XNEW (struct mem_ref_loc); mem_ref_locs_p accs; - bitmap ril = memory_accesses.refs_in_loop[loop->num]; if (ref->accesses_in_loop.length () <= (unsigned) loop->num) --- 1518,1528 description REF. The reference occurs in statement STMT. */ static void ! record_mem_ref_loc (mem_ref_p ref, bool is_stored, ! 
struct loop *loop, gimple stmt, tree *loc) { mem_ref_loc_p aref = XNEW (struct mem_ref_loc); mem_ref_locs_p accs; if (ref->accesses_in_loop.length () <= (unsigned) loop->num) *** record_mem_ref_loc (mem_ref_p ref, struc *** 1536,1556 aref->stmt = stmt; aref->ref = loc; - accs->locs.safe_push (aref); - bitmap_set_bit (ril, ref->id); - } - - /* Marks reference REF as stored in LOOP. */ ! static void ! mark_ref_stored (mem_ref_p ref, struct loop *loop) ! { ! for (; !loop != current_loops->tree_root !&& !bitmap_bit_p (ref->stored, loop->num); !loop = loop_outer (loop)) ! bitmap_set_bit (ref->stored, loop->num); } /* Gathers memory references in statement STMT in LOOP, storing the --- 1536,1552 aref->stmt = stmt; aref->ref = loc; accs->locs.safe_push (aref); ! bitmap_set_bit (memory_accesses.refs_in_loop[loop->num], ref->id); ! if (is_stored) ! { ! bitmap_set_bit (memory_accesses.all_refs_stored_in_loop[loop->num], ! ref->id); ! while (loop != current_loops->tree_root !&& bitmap_set_bit (ref->stored, loop->num)) ! loop = loop_outer (loop); ! } } /* Gathers memory references in statement STMT in LOOP, storing the *** gather_mem_refs_stmt (struct loop *loop, *** 1582,1590 fprintf (dump_file, "Unanalyzed memory reference %u: ", id); print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); } ! if (gimple_vdef (stmt)) ! mark_ref_stored (ref, loop); ! record_mem_ref_loc (ref, loop, stmt, mem); return; } --- 1578,1584 fprintf (dump_file, "Unanalyzed memory reference %u: ", id); print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); } ! record_mem_ref_loc (ref, gimple_vdef (stmt), loop, stmt, mem); return; } *** gather_mem_refs_stmt (struct loop *loop, *** 1611,1627 } } ! if (is_stored) ! mark_ref_stored (ref, loop); ! ! record_mem_ref_loc (ref, loop, stmt, mem); return; } /* Gathers memory references in loops. */ static void ! gather_mem_refs_in_loops (void) { gimple_stmt_iterator bsi; basic_block bb; --- 1605,1618 } } ! 
record_mem_ref_loc (ref, is_stored, loop, stmt, mem); return; } /* Gathers memory references in loops. */ static void ! analyze_memory_references (void) { gimple_stmt_iterator bsi; basic_block bb; *** gather_mem_refs_in_loops (void) *** 1647,1731 alrefs = memory_accesses.all_refs_in_loop[loop->num]; bitmap_ior_into (alrefs, lrefs); ! if (loop_outer (loop) == current_loops->tree_root) continue; ! alrefso = memory_accesses.all_refs_in_loop[loop_outer (loop)->num]; bitmap_ior_into (alrefso, alrefs); } } - /*
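
The new loop in record_mem_ref_loc relies on bitmap_set_bit reporting whether the bit actually changed, so the walk towards the outermost loop stops as soon as it reaches a loop that is already marked. A minimal stand-alone sketch of that idiom, using a hypothetical 64-bit word instead of GCC's bitmap and a toy loop struct (none of these names are GCC's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy loop tree node: outer == NULL marks the root pseudo-loop.  */
struct toy_loop
{
  unsigned num;
  struct toy_loop *outer;
};

/* Set BIT in *SET and report whether it was previously clear,
   mirroring the return-value convention the patch depends on.  */
static bool
set_bit_changed (uint64_t *set, unsigned bit)
{
  uint64_t mask = UINT64_C (1) << bit;
  bool changed = (*set & mask) == 0;
  *set |= mask;
  return changed;
}

/* Mark the reference as stored in LOOP and all enclosing loops, except
   the root.  Returns how many loops were newly marked.  */
static unsigned
mark_stored (uint64_t *stored, struct toy_loop *loop)
{
  unsigned newly_marked = 0;
  while (loop->outer != NULL && set_bit_changed (stored, loop->num))
    {
      loop = loop->outer;
      newly_marked++;
    }
  return newly_marked;
}
```

The early exit is sound because a loop is only ever marked together with all of its enclosing loops, so an already-set bit implies the rest of the chain is set as well.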
Re: [PATCH][4.8][4.7][4.6] Make -shared-libgcc the default on Cygwin.
On Tue, Mar 12, 2013 at 2:44 AM, Dave Korn wrote:
>
>     Hello list,
>
>   The attached patch makes -shared-libgcc the default for Cygwin. This is
> something that I should have done some time ago, as shared libgcc on Cygwin is
> more than mature. What's more, it is vital for reliable compilation of
> applications that throw exceptions or share TLS variables from DLLs into the
> main executable; at present these compile incorrectly without an explicit
> -shared-libgcc. For instance, the just-released MPFR-3.1.2 doesn't work
> without it.
>
>   Given that it's a very simple tweak to the compiler specs on a single
> platform only, I would like to use my target maintainer's discretion to apply
> it even at this late stage, but I figure it's so close to RC1 that I should
> ask the RM's permission anyway.
>
>   I'd also like to backport it to all the currently-open branches.
>
> gcc/ChangeLog
>
> 2013-03-12  Dave Korn
>
> 	* config/i386/cygwin.h (SHARED_LIBGCC_SPEC): Make shared libgcc the
> 	default setting.
>
>   Is this OK by everyone?

Ok for trunk (4.8). Please add a documentation entry to
gcc-4.8/changes.html.

I'm not sure whether this kind of stuff should change on a release
branch; I'll defer to others for this. Still, if you backport it, add
a gcc-4.x/changes.html item to the sub-release sections.

Richard.

>     cheers,
>       DaveK
Re: [patch testsuite]: Fix gcc.target/i386 cases for mingw-targets
On Fri, Mar 8, 2013 at 8:21 AM, Rainer Orth wrote:
> Hi Kai,
>
>>>> Index: gcc.target/i386/pr20020-1.c
>>>> ===
>>>> --- gcc.target/i386/pr20020-1.c (revision 196507)
>>>> +++ gcc.target/i386/pr20020-1.c (working copy)
>>>> @@ -1,5 +1,6 @@
>>>>  /* Check that 128-bit struct's are represented as TImode values.  */
>>>>  /* { dg-do compile { target int128 } } */
>>>> +/* { dg-skip-if "" { x86_64-*-mingw* } { "*" } { "" } } */
>>>
>>> Please omit the default { "*" } { "" } here and in other tests below.
>>> And again: explain why the test is skipped.
>>
>> Hmm, why shall I omit the default here? I checked in the tree and most
>> dg-skip-if statements spell out the default too.
>
> just because support for omitting the default isn't that old. There's
> certainly opportunity for cleanup, but we shouldn't spread this any
> further.
>
>> Well, x64 mingw is skipped here because it has a different ABI than
>> x86_64. I will add that to the skip message.
>
> Thanks. It's always far easier to have this in the testsuite to spare
> the next guy from doing software archaeology.
>
>> Ok to apply with those changes?
>
> Again, I prefer to defer to the target maintainers.

Ping
[PATCH][2/n] tree LIM TLC
This makes LIM work per outermost loop, reducing peak memory use. Not necessarily 2/n, but I've completed and tested it on x86_64-unknown-linux-gnu. Queued for 4.9. Richard. 2013-03-12 Richard Biener * tree-ssa-loop-im.c (determine_invariantness_stmt): Rename to ... (determine_invariantness_bb): ... this. Adjust for ... (determine_invariantness): ... walk all blocks of the loop we process. (move_computations_stmt): Rename to ... (move_computations_bb): ... this. Adjust for ... (move_computations): ... walk all blocks of the loop we process. (analyze_memory_references): Likewise. (store_motion): Process all sub-loops of the loop we process. (fill_always_executed_in): Likewise. (tree_ssa_lim_initialize): Move global bits to tree_ssa_lim. (tree_ssa_lim_finalize): Likewise. (tree_ssa_lim_1): Split out from ... (tree_ssa_lim): ... this. Perform global init and iterate over all outermost loops. Index: gcc/tree-ssa-loop-im.c === *** gcc/tree-ssa-loop-im.c.orig 2013-03-11 16:11:02.0 +0100 --- gcc/tree-ssa-loop-im.c 2013-03-12 10:09:58.923878391 +0100 *** rewrite_bittest (gimple_stmt_iterator *b *** 1040,1047 Callback for walk_dominator_tree. */ static void ! determine_invariantness_stmt (struct dom_walk_data *dw_data ATTRIBUTE_UNUSED, ! basic_block bb) { enum move_pos pos; gimple_stmt_iterator bsi; --- 1040,1046 Callback for walk_dominator_tree. */ static void ! determine_invariantness_bb (basic_block bb) { enum move_pos pos; gimple_stmt_iterator bsi; *** determine_invariantness_stmt (struct dom *** 1050,1058 struct loop *outermost = ALWAYS_EXECUTED_IN (bb); struct lim_aux_data *lim_data; - if (!loop_outer (bb->loop_father)) - return; - if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "Basic block %d (loop %d -- depth %d):\n\n", bb->index, bb->loop_father->num, loop_depth (bb->loop_father)); --- 1049,1054 *** determine_invariantness_stmt (struct dom *** 1177,1211 each statement. */ static void ! determine_invariantness (void) { ! 
struct dom_walk_data walk_data; ! ! memset (&walk_data, 0, sizeof (struct dom_walk_data)); ! walk_data.dom_direction = CDI_DOMINATORS; ! walk_data.before_dom_children = determine_invariantness_stmt; ! init_walk_dominator_tree (&walk_data); ! walk_dominator_tree (&walk_data, ENTRY_BLOCK_PTR); ! fini_walk_dominator_tree (&walk_data); } /* Hoist the statements in basic block BB out of the loops prescribed by data stored in LIM_DATA structures associated with each statement. Callback for walk_dominator_tree. */ ! static void ! move_computations_stmt (struct dom_walk_data *dw_data, ! basic_block bb) { struct loop *level; gimple_stmt_iterator bsi; gimple stmt; unsigned cost = 0; struct lim_aux_data *lim_data; ! ! if (!loop_outer (bb->loop_father)) ! return; for (bsi = gsi_start_phis (bb); !gsi_end_p (bsi); ) { --- 1173,1202 each statement. */ static void ! determine_invariantness (struct loop *loop, basic_block *bbs) { ! unsigned i; ! for (i = 0; i < loop->num_nodes; ++i) ! { ! basic_block bb = bbs[i]; ! determine_invariantness_bb (bb); ! } } /* Hoist the statements in basic block BB out of the loops prescribed by data stored in LIM_DATA structures associated with each statement. Callback for walk_dominator_tree. */ ! static unsigned ! move_computations_bb (basic_block bb) { struct loop *level; gimple_stmt_iterator bsi; gimple stmt; unsigned cost = 0; struct lim_aux_data *lim_data; ! unsigned todo = 0; for (bsi = gsi_start_phis (bb); !gsi_end_p (bsi); ) { *** move_computations_stmt (struct dom_walk_ *** 1260,1266 gimple_phi_result (stmt), t, arg0, arg1); SSA_NAME_DEF_STMT (gimple_phi_result (stmt)) = new_stmt; ! *((unsigned int *)(dw_data->global_data)) |= TODO_cleanup_cfg; } gsi_insert_on_edge (loop_preheader_edge (level), new_stmt); remove_phi_node (&bsi, false); --- 1251,1257 gimple_phi_result (stmt), t, arg0, arg1); SSA_NAME_DEF_STMT (gimple_phi_result (stmt)) = new_stmt; ! 
todo |= TODO_cleanup_cfg; } gsi_insert_on_edge (loop_preheader_edge (level), new_stmt); remove_phi_node (&bsi, false); *** move_computations_stmt (struct dom_walk_ *** 1323,1351 gsi_
[PATCH][3/n] tree LIM TLC
This exploits symmetry in memory reference disambiguation. That should
reduce the number of queries and space needed in the cache bitmaps.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2013-03-12  Richard Biener

	PR tree-optimization/39326
	* tree-ssa-loop-im.c (refs_independent_p): Exploit symmetry.

Index: trunk/gcc/tree-ssa-loop-im.c
===
*** trunk.orig/gcc/tree-ssa-loop-im.c 2013-03-12 11:41:58.0 +0100
--- trunk/gcc/tree-ssa-loop-im.c 2013-03-12 11:44:11.507745552 +0100
*** ref_always_accessed_p (struct loop *loop
*** 2163,2177
  static bool
  refs_independent_p (mem_ref_p ref1, mem_ref_p ref2)
  {
!   if (ref1 == ref2
!       || bitmap_bit_p (ref1->indep_ref, ref2->id))
      return true;
!   if (bitmap_bit_p (ref1->dep_ref, ref2->id))
!     return false;
    if (!MEM_ANALYZABLE (ref1)
        || !MEM_ANALYZABLE (ref2))
      return false;

    if (dump_file && (dump_flags & TDF_DETAILS))
      fprintf (dump_file, "Querying dependency of refs %u and %u: ",
               ref1->id, ref2->id);
--- 2163,2188
  static bool
  refs_independent_p (mem_ref_p ref1, mem_ref_p ref2)
  {
!   if (ref1 == ref2)
      return true;
!
    if (!MEM_ANALYZABLE (ref1)
        || !MEM_ANALYZABLE (ref2))
      return false;

+   /* Reference dependence in a loop is symmetric.  */
+   if (ref1->id > ref2->id)
+     {
+       mem_ref_p tem = ref1;
+       ref1 = ref2;
+       ref2 = tem;
+     }
+
+   if (bitmap_bit_p (ref1->indep_ref, ref2->id))
+     return true;
+   if (bitmap_bit_p (ref1->dep_ref, ref2->id))
+     return false;
+
    if (dump_file && (dump_flags & TDF_DETAILS))
      fprintf (dump_file, "Querying dependency of refs %u and %u: ",
               ref1->id, ref2->id);
*** refs_independent_p (mem_ref_p ref1, mem_
*** 2180,2186
                              &memory_accesses.ttae_cache))
      {
        bitmap_set_bit (ref1->dep_ref, ref2->id);
-       bitmap_set_bit (ref2->dep_ref, ref1->id);
        if (dump_file && (dump_flags & TDF_DETAILS))
          fprintf (dump_file, "dependent.\n");
        return false;
--- 2191,2196
*** refs_independent_p (mem_ref_p ref1, mem_
*** 2188,2194
    else
      {
        bitmap_set_bit (ref1->indep_ref, ref2->id);
-       bitmap_set_bit (ref2->indep_ref, ref1->id);
        if (dump_file && (dump_flags & TDF_DETAILS))
          fprintf (dump_file, "independent.\n");
        return true;
--- 2198,2203
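
To make the symmetry trick concrete, here is a small stand-alone sketch of the same caching scheme. It is not the GCC code: the fixed-size arrays, the MAX_REFS limit and the parity-based refs_may_alias stand-in for the alias oracle are assumptions made purely so the example is self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_REFS 64  /* hypothetical limit so one 64-bit row per ref suffices */

struct dep_cache
{
  uint64_t indep[MAX_REFS];  /* bit j of row i: refs i and j are independent */
  uint64_t dep[MAX_REFS];    /* bit j of row i: refs i and j are dependent */
  unsigned oracle_queries;   /* how often the expensive check actually ran */
};

/* Stand-in for the alias oracle (an assumption, not GCC's logic):
   two refs "alias" when their ids have the same parity.  */
static bool
refs_may_alias (unsigned a, unsigned b)
{
  return (a % 2) == (b % 2);
}

static bool
refs_independent (struct dep_cache *c, unsigned id1, unsigned id2)
{
  assert (id1 < MAX_REFS && id2 < MAX_REFS);
  if (id1 == id2)
    return true;

  /* Dependence is symmetric: always cache under the smaller id.  */
  if (id1 > id2)
    {
      unsigned tem = id1;
      id1 = id2;
      id2 = tem;
    }

  uint64_t bit = UINT64_C (1) << id2;
  if (c->indep[id1] & bit)
    return true;
  if (c->dep[id1] & bit)
    return false;

  c->oracle_queries++;
  if (refs_may_alias (id1, id2))
    {
      c->dep[id1] |= bit;
      return false;
    }
  c->indep[id1] |= bit;
  return true;
}
```

Because each unordered pair is stored once, under the smaller id, a query in either argument order hits the same cache entry and the rows of the higher-numbered reference stay empty for that pair.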
[PATCH][4/n] tree LIM TLC
This makes us only consider stores instead of all references when
looking for store motion opportunities.

Bootstrap pending on x86_64-unknown-linux-gnu, queued for 4.9. To be
committed with the gcc_checking_assert omitted.

Richard.

2013-03-12  Richard Biener

	* tree-ssa-loop-im.c (can_sm_ref_p): Do not test whether
	ref is stored in the loop.
	(find_refs_for_sm): Walk only over all stores.

Index: trunk/gcc/tree-ssa-loop-im.c
===
*** trunk.orig/gcc/tree-ssa-loop-im.c 2013-03-12 13:25:30.0 +0100
--- trunk/gcc/tree-ssa-loop-im.c 2013-03-12 13:28:49.468824315 +0100
*** can_sm_ref_p (struct loop *loop, mem_ref
*** 2284,2291
      return false;

    /* Unless the reference is stored in the loop, there is nothing to do.  */
!   if (!bitmap_bit_p (ref->stored, loop->num))
!     return false;

    /* It should be movable.  */
    if (!is_gimple_reg_type (TREE_TYPE (ref->mem))
--- 2284,2290
      return false;

    /* Unless the reference is stored in the loop, there is nothing to do.  */
!   gcc_checking_assert (bitmap_bit_p (ref->stored, loop->num));

    /* It should be movable.  */
    if (!is_gimple_reg_type (TREE_TYPE (ref->mem))
*** can_sm_ref_p (struct loop *loop, mem_ref
*** 2322,2328
  static void
  find_refs_for_sm (struct loop *loop, bitmap sm_executed, bitmap refs_to_sm)
  {
!   bitmap refs = memory_accesses.all_refs_in_loop[loop->num];
    unsigned i;
    bitmap_iterator bi;
    mem_ref_p ref;
--- 2321,2327
  static void
  find_refs_for_sm (struct loop *loop, bitmap sm_executed, bitmap refs_to_sm)
  {
!   bitmap refs = memory_accesses.all_refs_stored_in_loop[loop->num];
    unsigned i;
    bitmap_iterator bi;
    mem_ref_p ref;
Do not output references from external vtables into LTO symbol table
Hi,
this patch fixes the GCC side of PR 56557, where we fail to link

#include <fstream>
int main()
{
  std::fstream x;
}

with -flto -rdynamic because we do not see the definition of
_ZTCSt13basic_fstreamIcSt11char_traitsIcEE0_Sd. This symbol mistakenly
appears in the LTO symbol table because it is used from an external
constructor that is kept around only for optimization. There is also a
BFD linker bug that creates stale dynamic table entries for this kind
of symbol.

Bootstrapped & regtested x86_64-linux, tested on Mozilla and Qt LTO
builds, committed.

Honza

	PR lto/56557
	* lto-streamer-out.c (output_symbol_p): Skip references from constructors
	of external variables.

Index: lto-streamer-out.c
===
--- lto-streamer-out.c (revision 196611)
+++ lto-streamer-out.c (working copy)
@@ -1265,17 +1265,36 @@ bool
 output_symbol_p (symtab_node node)
 {
   struct cgraph_node *cnode;
-  struct ipa_ref *ref;
-
   if (!symtab_real_symbol_p (node))
     return false;
   /* We keep external functions in symtab for sake of inlining
      and devirtualization.  We do not want to see them in symbol table as
-     references.  */
+     references unless they are really used.  */
   cnode = dyn_cast <cgraph_node> (node);
-  if (cnode && DECL_EXTERNAL (cnode->symbol.decl))
-    return (cnode->callers
-            || ipa_ref_list_referring_iterate (&cnode->symbol.ref_list, 0, ref));
+  if (cnode && DECL_EXTERNAL (cnode->symbol.decl)
+      && cnode->callers)
+    return true;
+
+ /* Ignore all references from external vars initializers - they are not really
+    part of the compilation unit until they are used by folding.  Some symbols,
+    like references to external construction vtables can not be referred to at all.
+    We decide this at can_refer_decl_in_current_unit_p.  */
+ if (DECL_EXTERNAL (node->symbol.decl))
+    {
+      int i;
+      struct ipa_ref *ref;
+      for (i = 0; ipa_ref_list_referring_iterate (&node->symbol.ref_list,
+                                                  i, ref); i++)
+        {
+          if (ref->use == IPA_REF_ALIAS)
+            continue;
+          if (is_a <cgraph_node> (ref->referring))
+            return true;
+          if (!DECL_EXTERNAL (ref->referring->symbol.decl))
+            return true;
+        }
+      return false;
+    }
   return true;
 }
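
The decision the patched output_symbol_p makes can be modelled without any GCC internals. The sketch below is only an approximation under stated assumptions: the toy_sym and toy_ref structs are hypothetical, the symtab_real_symbol_p check is left out, and "the referring node is a function" is my reading of the is_a test in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_ref_use { TOY_REF_ADDR, TOY_REF_LOAD, TOY_REF_ALIAS };

struct toy_sym;

/* One reference to a symbol, recorded with the node that refers to it.  */
struct toy_ref
{
  enum toy_ref_use use;
  const struct toy_sym *referring;
};

/* Hypothetical, heavily simplified symbol table node.  */
struct toy_sym
{
  bool is_function;
  bool is_external;           /* body/initializer kept only for optimization */
  bool has_callers;
  const struct toy_ref *refs; /* references *to* this symbol */
  size_t n_refs;
};

static bool
toy_output_symbol_p (const struct toy_sym *node)
{
  /* External functions are listed when something really calls them.  */
  if (node->is_function && node->is_external && node->has_callers)
    return true;

  if (node->is_external)
    {
      for (size_t i = 0; i < node->n_refs; i++)
        {
          const struct toy_ref *ref = &node->refs[i];
          if (ref->use == TOY_REF_ALIAS)
            continue;
          /* A reference from code counts as a real use.  */
          if (ref->referring->is_function)
            return true;
          /* So does one from a variable defined in this unit.  */
          if (!ref->referring->is_external)
            return true;
        }
      /* Only initializers of external variables refer to it.  */
      return false;
    }
  return true;
}
```

In this model a construction vtable that is referenced only from the initializer of another external variable is kept out of the symbol table, which is the case the PR describes.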
Re: [patch testsuite]: Fix gcc.target/i386 cases for mingw-targets
What is there to ping about? I got an OK from rth.

Kai
Re: [Patch,AVR]: Fix PR56263
On Mon, Mar 11, 2013 at 07:03:14PM +0100, Georg-Johann Lay wrote:
> 	PR target/56263
> 	* doc/invoke.texi (AVR Options): Document it.

This change broke building of info doc everywhere:
../../gcc/doc//invoke.texi:11652: @item found outside of an insertion block.
makeinfo: Removing output file `doc/gcc.info' due to errors; use --force to preserve.
make: *** [doc/gcc.info] Error 1

Fixed thusly, committed as obvious.

2013-03-12  Jakub Jelinek

	* doc/invoke.texi (-Waddr-space-convert): Move into the table earlier.

--- gcc/doc/invoke.texi (revision 196613)
+++ gcc/doc/invoke.texi (working copy)
@@ -11632,6 +11632,11 @@ sbiw r26, const ; X -= const
 @item -mtiny-stack
 @opindex mtiny-stack
 Only change the lower 8@tie{}bits of the stack pointer.
+
+@item -Waddr-space-convert
+@opindex Waddr-space-convert
+Warn about conversions between address spaces in the case where the
+resulting address space is not contained in the incoming address space.
 @end table

 @subsubsection @code{EIND} and Devices with more than 128 Ki Bytes of Flash
@@ -11649,11 +11654,6 @@ when @code{EICALL} or @code{EIJMP} instr
 Indirect jumps and calls on these devices are handled as follows by
 the compiler and are subject to some limitations:

-@item -Waddr-space-convert
-@opindex Waddr-space-convert
-Warn about conversions between address spaces in the case where the
-resulting address space is not contained in the incoming address space.
-
 @itemize @bullet
 @item

	Jakub
[PATCH][5/n] tree LIM TLC
This merges the tri-state caches (not computed, dependent, independent) bitmaps to improve cache locality. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Queued for 4.9. Richard. 2013-03-12 Richard Biener * tree-ssa-loop-im.c (struct mem_ref): Merge indep_loop and dep_loop into loop_dependence, merge indep_ref and dep_ref into ref_dependence. (mem_ref_alloc): Adjust. (refs_independent_p): Likewise. (record_indep_loop): Likewise. (ref_indep_loop_p): Likewise. Index: trunk/gcc/tree-ssa-loop-im.c === *** trunk.orig/gcc/tree-ssa-loop-im.c 2013-03-12 14:18:44.0 +0100 --- trunk/gcc/tree-ssa-loop-im.c2013-03-12 14:27:37.054487784 +0100 *** typedef struct mem_ref *** 127,147 /* The locations of the accesses. Vector indexed by the loop number. */ ! /* The following sets are computed on demand. We keep both set and ! its complement, so that we know whether the information was ! already computed or not. */ ! bitmap indep_loop; /* The set of loops in that the memory ! reference is independent, meaning: ! If it is stored in the loop, this store !is independent on all other loads and !stores. ! If it is only loaded, then it is independent !on all stores in the loop. */ ! bitmap dep_loop;/* The complement of INDEP_LOOP. */ ! ! bitmap indep_ref; /* The set of memory references on that ! this reference is independent. */ ! bitmap dep_ref; /* The complement of INDEP_REF. */ } *mem_ref_p; --- 127,145 /* The locations of the accesses. Vector indexed by the loop number. */ ! /* The following sets are computed on demand. We use two bits per ! information to represent the not-computed state. */ ! ! /* The set of loops in that the memory reference is independent ! (2 * loop->num) or dependent (2 * loop->num + 1) in. ! If it is stored in the loop, this store is independent on all other ! loads and stores. ! If it is only loaded, then it is independent on all stores in the loop. */ ! bitmap loop_dependence; ! ! 
/* The set of memory references on that this reference is independent ! (2 * mem->id) or dependent (2 * mem->id + 1). */ ! bitmap ref_dependence; } *mem_ref_p; *** mem_ref_alloc (tree mem, unsigned hash, *** 1481,1490 ref->id = id; ref->hash = hash; ref->stored = BITMAP_ALLOC (&lim_bitmap_obstack); ! ref->indep_loop = BITMAP_ALLOC (&lim_bitmap_obstack); ! ref->dep_loop = BITMAP_ALLOC (&lim_bitmap_obstack); ! ref->indep_ref = BITMAP_ALLOC (&lim_bitmap_obstack); ! ref->dep_ref = BITMAP_ALLOC (&lim_bitmap_obstack); ref->accesses_in_loop.create (0); return ref; --- 1479,1486 ref->id = id; ref->hash = hash; ref->stored = BITMAP_ALLOC (&lim_bitmap_obstack); ! ref->loop_dependence = BITMAP_ALLOC (&lim_bitmap_obstack); ! ref->ref_dependence = BITMAP_ALLOC (&lim_bitmap_obstack); ref->accesses_in_loop.create (0); return ref; *** refs_independent_p (mem_ref_p ref1, mem_ *** 2178,2186 ref2 = tem; } ! if (bitmap_bit_p (ref1->indep_ref, ref2->id)) return true; ! if (bitmap_bit_p (ref1->dep_ref, ref2->id)) return false; if (dump_file && (dump_flags & TDF_DETAILS)) --- 2174,2182 ref2 = tem; } ! if (bitmap_bit_p (ref1->ref_dependence, 2 * ref2->id)) return true; ! if (bitmap_bit_p (ref1->ref_dependence, 2 * ref2->id + 1)) return false; if (dump_file && (dump_flags & TDF_DETAILS)) *** refs_independent_p (mem_ref_p ref1, mem_ *** 2190,2203 if (mem_refs_may_alias_p (ref1->mem, ref2->mem, &memory_accesses.ttae_cache)) { ! bitmap_set_bit (ref1->dep_ref, ref2->id); if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "dependent.\n"); return false; } else { ! bitmap_set_bit (ref1->indep_ref, ref2->id); if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "independent.\n"); return true; --- 2186,2199 if (mem_refs_may_alias_p (ref1->mem, ref2->mem, &memory_accesses.ttae_cache)) { ! bitmap_set_bit (ref1->ref_dependence, 2 * ref2->id + 1); if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "dependent.\n"); return false; } els
Re: [PATCH][5/n] tree LIM TLC
On Tue, Mar 12, 2013 at 4:16 PM, Richard Biener wrote:
> --- 127,145
>     /* The locations of the accesses.  Vector
>        indexed by the loop number.  */
>
> !   /* The following sets are computed on demand.  We use two bits per
> !      information to represent the not-computed state.  */
> !
> !   /* The set of loops in that the memory reference is independent
> !      (2 * loop->num) or dependent (2 * loop->num + 1) in.
> !      If it is stored in the loop, this store is independent on all other
> !      loads and stores.
> !      If it is only loaded, then it is independent on all stores in the
> loop.  */
> !   bitmap loop_dependence;
> !
> !   /* The set of memory references on that this reference is independent
> !      (2 * mem->id) or dependent (2 * mem->id + 1).  */
> !   bitmap ref_dependence;
>   } *mem_ref_p;

Perhaps add simple inline functions to test those bits, to avoid:

> --- 2174,2182
>         ref2 = tem;
>       }
>
> !   if (bitmap_bit_p (ref1->ref_dependence, 2 * ref2->id))
>       return true;
> !   if (bitmap_bit_p (ref1->ref_dependence, 2 * ref2->id + 1))
>       return false;
>
>     if (dump_file && (dump_flags & TDF_DETAILS))

? That kind of explicit 2*x+[01] is bound to go wrong at some point.

Ciao!
Steven
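
One possible shape for the inline helpers suggested above, as a stand-alone sketch. The names (tri_state, dep_get, dep_record) and the single 64-bit word standing in for a GCC bitmap are assumptions for illustration; only the 2*id and 2*id+1 encoding is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum tri_state { TRI_UNKNOWN = 0, TRI_INDEPENDENT, TRI_DEPENDENT };

#define TRI_MAX_IDS 32  /* two bits per id in one 64-bit word */

struct tri_bits
{
  uint64_t bits;
};

/* The only two places that know about the 2*x+[01] encoding.  */
static inline unsigned
indep_bit (unsigned id)
{
  return 2 * id;
}

static inline unsigned
dep_bit (unsigned id)
{
  return 2 * id + 1;
}

static inline enum tri_state
dep_get (const struct tri_bits *b, unsigned id)
{
  assert (id < TRI_MAX_IDS);
  if ((b->bits >> indep_bit (id)) & 1)
    return TRI_INDEPENDENT;
  if ((b->bits >> dep_bit (id)) & 1)
    return TRI_DEPENDENT;
  return TRI_UNKNOWN;
}

static inline void
dep_record (struct tri_bits *b, unsigned id, bool independent)
{
  /* A result is recorded at most once per id.  */
  assert (dep_get (b, id) == TRI_UNKNOWN);
  b->bits |= UINT64_C (1) << (independent ? indep_bit (id) : dep_bit (id));
}
```

With accessors like these the callers deal only in "unknown / independent / dependent", and a slip such as testing 2*id where 2*id+1 was meant can happen in just one place.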
[PATCH][6/n] tree LIM TLC
(Un-?)surprisingly the most effective compile-time reduction for the testcase in PR39326 is to employ ao_ref caching for alias oracle queries and caching of expanded affine-combinations for affine disambiguations. This reduces compile-time to a manageable amount in the first place for me (so I'm sending it "late" in the series). Bootstrap and regtest scheduled on x86_64-unknown-linux-gnu, queued for 4.9. Richard. 2013-03-12 Richard Biener PR tree-optimization/39326 * tree-ssa-loop-im.c (struct mem_ref): Replace mem member with an ao_ref typed one. Add affine-combination cache members. (MEM_ANALYZABLE): Adjust. (memref_eq): Likewise. (mem_ref_alloc): Likewise. (gather_mem_refs_stmt): Likewise. (execute_sm_if_changed_flag_set): Likewise. (execute_sm): Likewise. (ref_always_accessed_p): Likewise. (refs_independent_p): Likewise. (can_sm_ref_p): Likewise. (mem_refs_may_alias_p): Use ao_ref members to query the oracle. Cache expanded affine combinations. Index: trunk/gcc/tree-ssa-loop-im.c === *** trunk.orig/gcc/tree-ssa-loop-im.c 2013-03-12 15:11:12.0 +0100 --- trunk/gcc/tree-ssa-loop-im.c2013-03-12 16:20:49.115169595 +0100 *** typedef struct mem_ref_locs *** 117,126 typedef struct mem_ref { - tree mem; /* The memory itself. */ unsigned id;/* ID assigned to the memory reference (its index in memory_accesses.refs_list) */ hashval_t hash; /* Its hash value. */ bitmap stored; /* The set of loops in that this memory location is stored to. */ vec accesses_in_loop; --- 117,130 typedef struct mem_ref { unsigned id;/* ID assigned to the memory reference (its index in memory_accesses.refs_list) */ hashval_t hash; /* Its hash value. */ + + /* The memory access itself and associated caching of alias-oracle + query meta-data. */ + ao_ref mem; /* The ao_ref of this memory access. */ + bitmap stored; /* The set of loops in that this memory location is stored to. 
*/ vec accesses_in_loop; *** typedef struct mem_ref *** 142,147 --- 146,155 bitmap indep_ref; /* The set of memory references on that this reference is independent. */ bitmap dep_ref; /* The complement of INDEP_REF. */ + + /* The expanded affine combination of this memory access. */ + aff_tree aff_off; + double_int aff_size; } *mem_ref_p; *** static bool ref_indep_loop_p (struct loo *** 186,192 #define SET_ALWAYS_EXECUTED_IN(BB, VAL) ((BB)->aux = (void *) (VAL)) /* Whether the reference was analyzable. */ ! #define MEM_ANALYZABLE(REF) ((REF)->mem != error_mark_node) static struct lim_aux_data * init_lim_data (gimple stmt) --- 194,200 #define SET_ALWAYS_EXECUTED_IN(BB, VAL) ((BB)->aux = (void *) (VAL)) /* Whether the reference was analyzable. */ ! #define MEM_ANALYZABLE(REF) ((REF)->mem.ref != error_mark_node) static struct lim_aux_data * init_lim_data (gimple stmt) *** memref_eq (const void *obj1, const void *** 1435,1441 { const struct mem_ref *const mem1 = (const struct mem_ref *) obj1; ! return operand_equal_p (mem1->mem, (const_tree) obj2, 0); } /* Releases list of memory reference locations ACCS. */ --- 1443,1449 { const struct mem_ref *const mem1 = (const struct mem_ref *) obj1; ! return operand_equal_p (mem1->mem.ref, (const_tree) obj2, 0); } /* Releases list of memory reference locations ACCS. */ *** static mem_ref_p *** 1477,1483 mem_ref_alloc (tree mem, unsigned hash, unsigned id) { mem_ref_p ref = XNEW (struct mem_ref); ! ref->mem = mem; ref->id = id; ref->hash = hash; ref->stored = BITMAP_ALLOC (&lim_bitmap_obstack); --- 1485,1491 mem_ref_alloc (tree mem, unsigned hash, unsigned id) { mem_ref_p ref = XNEW (struct mem_ref); ! 
ao_ref_init (&ref->mem, mem); ref->id = id; ref->hash = hash; ref->stored = BITMAP_ALLOC (&lim_bitmap_obstack); *** mem_ref_alloc (tree mem, unsigned hash, *** 1487,1492 --- 1495,1502 ref->dep_ref = BITMAP_ALLOC (&lim_bitmap_obstack); ref->accesses_in_loop.create (0); + ref->aff_off.type = NULL_TREE; + return ref; } *** gather_mem_refs_stmt (struct loop *loop, *** 1586,1592 if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file, "Memory reference %u: ", id); ! print_generic_expr (dump_file, ref->
Re: [PATCH][6/n] tree LIM TLC
On Tue, Mar 12, 2013 at 4:25 PM, Richard Biener wrote:
> (Un-?)surprisingly the most effective compile-time reduction for
> the testcase in PR39326 is to employ ao_ref caching for
> alias oracle queries and caching of expanded affine-combinations
> for affine disambiguations.
>
> This reduces compile-time to a manageable amount in the first place
> for me (so I'm sending it "late" in the series).

I suppose this renders my LIM patch obsolete.

Did you also look at the memory footprint?

Ciao!
Steven
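For readers following the thread, the caching idea quoted above can be sketched in isolation. This is only a minimal illustration, assuming a cache slot that is marked "not computed yet" by a NULL type field (the patch initializes aff_off.type to NULL_TREE in mem_ref_alloc). All names below (aff_sketch, mem_ref_sketch, get_aff, may_alias_sketch) are hypothetical and are not GCC functions; the "expansion" is a stand-in for the real affine expansion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an expanded affine combination.  */
struct aff_sketch
{
  const char *type;   /* NULL means "not expanded yet" (the sentinel).  */
  long offset;
  long size;
};

struct mem_ref_sketch
{
  long base;              /* Raw description of the access.  */
  long len;
  struct aff_sketch aff;  /* Cache, filled on first use.  */
};

/* Counts how often the expensive path actually ran.  */
unsigned expansions;

void
mem_ref_sketch_init (struct mem_ref_sketch *ref, long base, long len)
{
  assert (ref != NULL);
  ref->base = base;
  ref->len = len;
  ref->aff.type = NULL;   /* Nothing cached yet.  */
}

/* Placeholder for the costly expansion step.  */
static void
expand_sketch (struct mem_ref_sketch *ref)
{
  expansions++;
  ref->aff.type = "long";
  ref->aff.offset = ref->base;
  ref->aff.size = ref->len;
}

/* Return the cached expansion, computing it only on the first request.  */
const struct aff_sketch *
get_aff (struct mem_ref_sketch *ref)
{
  if (ref->aff.type == NULL)
    expand_sketch (ref);
  return &ref->aff;
}

/* Simple range-overlap test on the cached expansions.  */
int
may_alias_sketch (struct mem_ref_sketch *a, struct mem_ref_sketch *b)
{
  const struct aff_sketch *x = get_aff (a);
  const struct aff_sketch *y = get_aff (b);
  return x->offset < y->offset + y->size && y->offset < x->offset + x->size;
}
```

Repeated queries against the same reference then pay for the expansion once; the price is the extra per-reference storage, which is what the memory question above is about.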
Re: [PATCH][6/n] tree LIM TLC
On Tue, 12 Mar 2013, Steven Bosscher wrote:
> On Tue, Mar 12, 2013 at 4:25 PM, Richard Biener wrote:
> > (Un-?)surprisingly the most effective compile-time reduction for
> > the testcase in PR39326 is to employ ao_ref caching for
> > alias oracle queries and caching of expanded affine-combinations
> > for affine disambiguations.
> >
> > This reduces compile-time to a manageable amount in the first place
> > for me (so I'm sending it "late" in the series).
>
> I suppose this renders my LIM patch obsolete.

Not really - it's still

  tree loop invariant motion: 588.31 (78%) usr

so limiting the O(n^2) dependence testing is a good thing. But I
can take it over from here and implement that on top of my patches
if you like.

> Did you also look at the memory footprint?

Yeah, unfortunately processing outermost loops separately doesn't
reduce peak memory consumption. I'll look into getting rid of the
all-refs bitmaps, but I'm not there yet.

Currently the testcase peaks at 1.7GB for me (after LIM, then
it gets worse with DSE and IRA). And I only tested -O1 so far.

Thanks,
Richard.
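The trade-off discussed here (quadratic pairwise dependence tests versus the memory held by per-reference bitmaps) can be sketched as follows. This is an illustration under stated assumptions, not the LIM code: each reference keeps two 64-bit sets that play the role of the indep_ref and dep_ref bitmaps in struct mem_ref, the "oracle" is a plain range-overlap test, and all names are made up. It does not reduce the number of pairs; it only ensures each pair reaches the oracle once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory reference; ids must be below 64 in this sketch.  */
struct ref_sketch
{
  unsigned id;
  long start;
  long size;
  uint64_t indep;   /* Bit j set: known independent of reference j.  */
  uint64_t dep;     /* Bit j set: known dependent on reference j.  */
};

/* Counts the (pretend-expensive) oracle queries.  */
unsigned long oracle_queries;

static bool
ranges_overlap (const struct ref_sketch *a, const struct ref_sketch *b)
{
  oracle_queries++;
  return a->start < b->start + b->size && b->start < a->start + a->size;
}

/* Answer from the cached sets when possible, otherwise ask the oracle
   and record the result on both references.  */
bool
refs_independent_sketch (struct ref_sketch *a, struct ref_sketch *b)
{
  assert (a->id < 64 && b->id < 64);
  uint64_t abit = UINT64_C (1) << a->id;
  uint64_t bbit = UINT64_C (1) << b->id;

  if (a->indep & bbit)
    return true;
  if (a->dep & bbit)
    return false;

  bool indep = !ranges_overlap (a, b);
  if (indep)
    {
      a->indep |= bbit;
      b->indep |= abit;
    }
  else
    {
      a->dep |= bbit;
      b->dep |= abit;
    }
  return indep;
}
```

With n references every reference carries sets sized by n, so the cache itself grows quadratically, which is one way to read the concern about peak memory above.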
Re: [PATCH] Fix install-plugin with vxworks-dummy.h (PR plugins/45078)
On 06.03.2013 20:44, Jakub Jelinek wrote:
> Hi!
>
> On Wed, Mar 06, 2013 at 06:57:03PM +0800, Matthias Klose wrote:
>> There is still vxworks-dummy.h, which is not installed, see PR45078. Would the
>> same approach work?
>
> Like this? Untested though, and no access to most of the targets.

Looks OK. Using the first chunk as in a patch proposed earlier, or maybe just
applied locally.

Works for arm and sparc; sh4 didn't build yet. For mips, a tri-arch build
currently fails with

Bootstrap comparison failure!
mips-linux-gnu/64/libstdc++-v3/src/c++98/sstream-inst.o differs
mips-linux-gnu/64/libstdc++-v3/src/c++98/istream-inst.o differs
mips-linux-gnu/64/libstdc++-v3/src/c++98/ostream-inst.o differs
make[4]: *** [compare] Error 1

Matthias
Re: [PATCH][6/n] tree LIM TLC
On Tue, Mar 12, 2013 at 4:33 PM, Richard Biener wrote:
> On Tue, 12 Mar 2013, Steven Bosscher wrote:
>> I suppose this renders my LIM patch obsolete.
>
> Not really - it's still
>
>  tree loop invariant motion: 588.31 (78%) usr
>
> so limiting the O(n^2) dependence testing is a good thing. But I
> can take it over from here and implement that on top of my patches
> if you like.

That'd be good, let's keep it in one hand, one set.

>> Did you also look at the memory footprint?
>
> Yeah, unfortunately processing outermost loops separately doesn't
> reduce peak memory consumption. I'll look into getting rid of the
> all-refs bitmaps, but I'm not there yet.

A few more ideas (though probably not with as much impact):

Is it possible to use a bitmap_head for the (now merged)
dep_loop/indep_loop, instead of bitmap? Likewise for a few other
bitmaps, especially the vectors of bitmaps.

Put "struct depend" in an alloc pool. (Also allows one to wipe them
all out in free_lim_aux_data.)

Likewise for "struct mem_ref".

Use a shared mem_ref for the error_mark_node case (and hoist the
MEM_ANALYZABLE checks in refs_independent_p above the bitmap tests).

Use nameless temps instead of lsm_tmp_name_add.

> Currently the testcase peaks at 1.7GB for me (after LIM, then
> it gets worse with DSE and IRA). And I only tested -O1 so far.

Try my DSE patch (corrected version attached).

What are you using now to measure per-pass memory usage? I'm still
using my old hack (also attached) but it's not quite optimal.

Ciao!
Steven

PR39326_RTLDSE.diff
Description: Binary data

passes_memstat.diff
Description: Binary data
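The alloc-pool suggestion above can be illustrated with a small sketch. It assumes nothing about GCC's own alloc-pool interface; the types and functions below (depend_sketch, pool_sketch, pool_sketch_alloc, pool_sketch_release_all) are hypothetical and only show why pooling small records makes it cheap to drop them all at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_SLOTS 256

/* Hypothetical small record, standing in for "struct depend".  */
struct depend_sketch
{
  struct depend_sketch *next;
  int stmt_id;
};

/* One malloc'ed block holds many records.  */
struct pool_block
{
  struct pool_block *next;
  size_t used;
  struct depend_sketch slots[POOL_SLOTS];
};

struct pool_sketch
{
  struct pool_block *blocks;   /* Most recently added block first.  */
};

/* Hand out the next free slot, adding a block when the current one is
   full.  Returns NULL if malloc fails.  */
struct depend_sketch *
pool_sketch_alloc (struct pool_sketch *p)
{
  assert (p != NULL);
  struct pool_block *b = p->blocks;
  if (b == NULL || b->used == POOL_SLOTS)
    {
      b = malloc (sizeof *b);
      if (b == NULL)
        return NULL;
      b->used = 0;
      b->next = p->blocks;
      p->blocks = b;
    }
  return &b->slots[b->used++];
}

/* Number of blocks currently held by the pool.  */
size_t
pool_sketch_blocks (const struct pool_sketch *p)
{
  size_t n = 0;
  for (const struct pool_block *b = p->blocks; b != NULL; b = b->next)
    n++;
  return n;
}

/* Free every record in one sweep, without walking individual lists.  */
void
pool_sketch_release_all (struct pool_sketch *p)
{
  while (p->blocks != NULL)
    {
      struct pool_block *next = p->blocks->next;
      free (p->blocks);
      p->blocks = next;
    }
}
```

The sketch trades per-record free for a single bulk release, which matches the "wipe them all out" remark; individual records cannot be returned to the pool in this simplified form.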
[patch, committed, wwwdata] Add id="current" to Current releases in gcc.gnu.org/onlinedocs/
Hi, I have committed the attached trivial patch in order to link directly to the documentation of the developer version: http://gcc.gnu.org/onlinedocs/#current Tobias PS: For releases, one can use "/" instead, e.g. http://gcc.gnu.org/onlinedocs/4.7.2/ Index: index.html === RCS file: /cvs/gcc/wwwdocs/htdocs/onlinedocs/index.html,v retrieving revision 1.129 diff -u -r1.129 index.html --- index.html 20 Sep 2012 10:17:30 - 1.129 +++ index.html 12 Mar 2013 16:55:05 - @@ -734,7 +734,7 @@ -Current development +Current development Please note that the following documentation refers to current development. Some information may not be applicable to any
Re: [trunk][google/gcc47]Add dependence of configure-target-libmudflap on configure-target-libstdc++-v3 (issue7740043)
I made a mistake in my previous patch. I did not notice that
Makefile.in was a generated file. Here is the updated patch.

2013-03-12  Jing Yu

	* Makefile.def (Target modules dependencies): Add new dependency.
	* Makefile.in: Re-generate.

Index: Makefile.in
===
--- Makefile.in (revision 196604)
+++ Makefile.in (working copy)
@@ -6,6 +6,7 @@
 all-target-libjava: maybe-all-target-boehm-gc
 all-target-libjava: maybe-all-target-libffi
 configure-target-libobjc: maybe-configure-target-boehm-gc
 all-target-libobjc: maybe-all-target-boehm-gc
+configure-target-libmudflap: maybe-configure-target-libstdc++-v3
 configure-target-libstdc++-v3: maybe-configure-target-libgomp
 configure-stage1-target-libstdc++-v3: maybe-configure-stage1-target-libgomp
Index: Makefile.def
===
--- Makefile.def (revision 196604)
+++ Makefile.def (working copy)
@@ -504,6 +504,7 @@ dependencies = { module=all-target-libjava; on=all
 dependencies = { module=all-target-libjava; on=all-target-libffi; };
 dependencies = { module=configure-target-libobjc; on=configure-target-boehm-gc; };
 dependencies = { module=all-target-libobjc; on=all-target-boehm-gc; };
+dependencies = { module=configure-target-libmudflap; on=configure-target-libstdc++-v3; };
 dependencies = { module=configure-target-libstdc++-v3; on=configure-target-libgomp; };
 // parallel_list.o and parallel_settings.o depend on omp.h, which is
 // generated by the libgomp configure. Unfortunately, due to the use of

On Mon, Mar 11, 2013 at 5:21 PM, Jing Yu wrote:
> Don't know why the email body became attachment. Sent it again.
> The review link is https://codereview.appspot.com/7740043
>
> Hi Diego,
>
> The nightly builds of gcc-4.7 based ppc64 and ppc32 crosstools have
> failed since the build server was upgraded to gPrecise one week ago.
> The log shows a configuration failure on libmudflap.
>
> checking for suffix of object files... /lib/cpp
> configure: error: in
> `/g/nightly/build/work/gcc-4.7.x-grtev3-powerpc32-8540/rpmbuild/BUILD/.../powerpc-grtev3-linux-gnu/libmudflap':
> configure: error: C++ preprocessor "/lib/cpp" fails sanity check
> See `config.log' for more details.
>
> There is no /lib/cpp on gprecise server, though it should not be used here.
>
> What happened was that libmudflap configure looks for a preprocessor
> by trying $CXX -E and then backing off to /lib/cpp. $CXX -E is
> failing with "unrecognized command line option
> ‘-funconfigured-libstdc++’", and the /lib/cpp backstop then fails
> also. The -funconfigured-libstdc++ is because configure can't find
> libstdc++/scripts/testsuite_flags. This is a so-far undiagnosed race
> in gcc make, masked where /lib/cpp is available. And that's absent
> because in this build, for whatever reason, libstdc++ loses a race
> with libmudflap.
>
> The theory is confirmed by:
> 1) if we force --job=1, build can succeed
> 2) If we apply the following patch to build-gcc/Makefile, build can
> succeed. After removing this dependency, build fails with the same
> error again.
>
> Is this patch ok for google/gcc-4_7?
>
> If the same issue exists on upstream trunk, how does the patch sound to trunk?
>
> Thanks,
> Jing
>
> 2013-03-11  Jing Yu
>
> 	* Makefile.in: (maybe-configure-target-libmudflap):
> 	Add dependence on configure-target-libstdc++-v3.
>
> Index: Makefile.in
> ===
> --- Makefile.in (revision 196604)
> +++ Makefile.in (working copy)
> @@ -31879,6 +31879,9 @@ maybe-configure-target-libmudflap:
>  @if gcc-bootstrap
>  configure-target-libmudflap: stage_current
>  @endif gcc-bootstrap
> +@if target-libstdc++-v3
> +configure-target-libmudflap: configure-target-libstdc++-v3
> +@endif target-libstdc++-v3
>  @if target-libmudflap
>  maybe-configure-target-libmudflap: configure-target-libmudflap
>  configure-target-libmudflap:
Re: [trunk][google/gcc47]Add dependence of configure-target-libmudflap on configure-target-libstdc++-v3 (issue7740043)
On 2013-03-12 13:24 , Jing Yu wrote:
> I made a mistake in my previous patch. I did not notice that
> Makefile.in was a generated file. Update the patch.
>
> 2013-03-12  Jing Yu
>
> 	* Makefile.def (Target modules dependencies): Add new dependency.
> 	* Makefile.in: Re-generate.
>
> Index: Makefile.in
> ===
> --- Makefile.in (revision 196604)
> +++ Makefile.in (working copy)
> @@ -6,6 +6,7 @@
>  all-target-libjava: maybe-all-target-boehm-gc
>  all-target-libjava: maybe-all-target-libffi
>  configure-target-libobjc: maybe-configure-target-boehm-gc
>  all-target-libobjc: maybe-all-target-boehm-gc
> +configure-target-libmudflap: maybe-configure-target-libstdc++-v3
>  configure-target-libstdc++-v3: maybe-configure-target-libgomp
>  configure-stage1-target-libstdc++-v3: maybe-configure-stage1-target-libgomp
> Index: Makefile.def
> ===
> --- Makefile.def (revision 196604)
> +++ Makefile.def (working copy)
> @@ -504,6 +504,7 @@ dependencies = { module=all-target-libjava; on=all
>  dependencies = { module=all-target-libjava; on=all-target-libffi; };
>  dependencies = { module=configure-target-libobjc; on=configure-target-boehm-gc; };
>  dependencies = { module=all-target-libobjc; on=all-target-boehm-gc; };
> +dependencies = { module=configure-target-libmudflap; on=configure-target-libstdc++-v3; };
>  dependencies = { module=configure-target-libstdc++-v3; on=configure-target-libgomp; };
>  // parallel_list.o and parallel_settings.o depend on omp.h, which is
>  // generated by the libgomp configure. Unfortunately, due to the use of
>
> On Mon, Mar 11, 2013 at 5:21 PM, Jing Yu wrote:
> > Don't know why the email body became attachment. Sent it again.
> > The review link is https://codereview.appspot.com/7740043
> >
> > Hi Diego,
> >
> > The nightly build of gcc-4.7 based ppc64 and ppc32 crosstools have
> > failed since the build server upgraded to gPrecise one week ago.
> > Log shows a configuration failure on libmudflap.
> >
> > checking for suffix of object files... /lib/cpp
> > configure: error: in
> > `/g/nightly/build/work/gcc-4.7.x-grtev3-powerpc32-8540/rpmbuild/BUILD/.../powerpc-grtev3-linux-gnu/libmudflap':
> > configure: error: C++ preprocessor "/lib/cpp" fails sanity check
> > See `config.log' for more details.
> >
> > There is no /lib/cpp on gprecise server, though it should not be used here.
> >
> > What happened was that libmudflap configure looks for a preprocessor
> > by trying $CXX -E and then backing off to /lib/cpp. $CXX -E is
> > failing with "unrecognized command line option
> > ‘-funconfigured-libstdc++’", and the /lib/cpp backstop then fails
> > also. The -funconfigured-libstdc++ is because configure can't find
> > libstdc++/scripts/testsuite_flags. This is a so-far undiagnosed race
> > in gcc make, masked where /lib/cpp is available. And that's absent
> > because in this build, for whatever reason, libstdc++ loses a race
> > with libmudflap.
> >
> > The theory is confirmed by:
> > 1) if we force --job=1, build can succeed
> > 2) If we apply the following patch to build-gcc/Makefile, build can
> > succeed. After removing this dependency, build fails with the same
> > error again.
> >
> > Is this patch ok for google/gcc-4_7?
> >
> > If the same issue exists on upstream trunk, how does the patch sound to trunk?
> >
> > Thanks,
> > Jing
> >
> > 2013-03-11  Jing Yu
> >
> > 	* Makefile.in: (maybe-configure-target-libmudflap):
> > 	Add dependence on configure-target-libstdc++-v3.

OK for google/gcc-4_7. It's fine for trunk as well, but it may need to
wait until trunk opens up again.

Diego.

> > Index: Makefile.in
> > ===
> > --- Makefile.in (revision 196604)
> > +++ Makefile.in (working copy)
> > @@ -31879,6 +31879,9 @@ maybe-configure-target-libmudflap:
> >  @if gcc-bootstrap
> >  configure-target-libmudflap: stage_current
> >  @endif gcc-bootstrap
> > +@if target-libstdc++-v3
> > +configure-target-libmudflap: configure-target-libstdc++-v3
> > +@endif target-libstdc++-v3
> >  @if target-libmudflap
> >  maybe-configure-target-libmudflap: configure-target-libmudflap
> >  configure-target-libmudflap:
[Patch, microblaze]: Add support for TLS in MicroBlaze
Add support for thread local storage (general dynamic and local dynamic
models) in MicroBlaze.

gcc/Changelog

2013-03-13  Edgar E. Iglesias
	    David Holsgrove

	* config/microblaze/microblaze-protos.h
	(microblaze_cannot_force_const_mem, microblaze_tls_referenced_p,
	symbol_mentioned_p, label_mentioned_p): Add prototypes.
	* config/microblaze/microblaze.c (microblaze_address_type): Add
	ADDRESS_TLS and tls_reloc address types.
	(microblaze_address_info): Add tls_reloc.
	(TARGET_HAVE_TLS): Define.
	(get_tls_get_addr, microblaze_tls_symbol_p,
	microblaze_tls_operand_p_1, microblaze_tls_referenced_p,
	microblaze_cannot_force_const_mem, symbol_mentioned_p,
	label_mentioned_p, tls_mentioned_p, load_tls_operand,
	microblaze_call_tls_get_addr, microblaze_legitimize_tls_address):
	New functions.
	(microblaze_classify_unspec): Handle UNSPEC_TLS.
	(get_base_reg): Use microblaze_tls_symbol_p.
	(microblaze_classify_address): Handle TLS.
	(microblaze_legitimate_pic_operand): Use symbol_mentioned_p,
	label_mentioned_p and microblaze_tls_referenced_p.
	(microblaze_legitimize_address): Handle TLS.
	(microblaze_address_insns): Handle ADDRESS_TLS.
	(pic_address_needs_scratch): Handle TLS.
	(print_operand_address): Handle TLS.
	(microblaze_expand_prologue): Check TLS_NEEDS_GOT.
	(microblaze_expand_move): Handle TLS.
	(microblaze_legitimate_constant_p): Check
	microblaze_cannot_force_const_mem and microblaze_tls_symbol_p.
	(TARGET_CANNOT_FORCE_CONST_MEM): Define.
	* config/microblaze/microblaze.h (TLS_NEEDS_GOT): Define.
	(PIC_OFFSET_TABLE_REGNUM): Set.
	* config/microblaze/linux.h (TLS_NEEDS_GOT): Define.
	* config/microblaze/microblaze.md (UNSPEC_TLS): Define.
	(addsi3, movsi_internal2, movdf_internal): Update constraints.
	* config/microblaze/predicates.md (arith_plus_operand): Define.
	(move_operand): Redefine as move_src_operand, check
	microblaze_tls_referenced_p.

Signed-off-by: Edgar E. Iglesias
Signed-off-by: David Holsgrove

0001-Patch-microblaze-Add-support-for-TLS.patch
Description: 0001-Patch-microblaze-Add-support-for-TLS.patch
[Patch, microblaze]: Add MicroBlaze TLS configure support
Add test for MicroBlaze TLS support to gcc/configure.ac

gcc/Changelog

2013-03-13  Edgar E. Iglesias
	    David Holsgrove

	* configure.ac: Add MicroBlaze TLS support detection.
	* configure: Regenerate.

Signed-off-by: Edgar E. Iglesias
Signed-off-by: David Holsgrove

0002-Patch-microblaze-Add-MicroBlaze-TLS-configure-suppor.patch
Description: 0002-Patch-microblaze-Add-MicroBlaze-TLS-configure-suppor.patch
[SH] PR 49880 - Fix some more -mdiv option issues
Hi,

Initially I just wanted to simplify two lines as mentioned in the PR.
However, when I started writing the test cases a small can of worms
popped up. '-m4 -mdiv=call-div1' would not link on bare metal configs
because of missing functions in libgcc, '-m2a -mdiv=call-fp' would ICE
and/or not link, and '*-nofpu -mdiv=call-fp' would invoke library
functions that use the FPU.

I've also run across some confusion regarding TARGET_FPU_SINGLE and
friends, but I'm leaving a better cleanup for 4.9. Basically it's not
possible to distinguish between -m4-nofpu and -m4-single-only, because
there are no corresponding bits. Thus the two new mask bits in sh.opt,
which required converting some of the existing mask bits into a var, as
we already ran out of bits once in the past.

The attached patch should make the -mdiv= option work as it is described
in the documentation (which I updated recently as part of PR 56529).

Tested with 'make all' and

make -k check-gcc RUNTESTFLAGS="sh.exp=pr49880* --target_board=sh-sim
\{-m2,-m2a,-m2a-nofpu,-m2a-single,-m2a-single-only,-m3,-m3e,-m4,-m4-single,
-m4-single-only,-m4a,-m4a-single,-m4a-single-only}"

OK for 4.8 and 4.7?

Cheers,
Oleg

gcc/ChangeLog:

	PR target/49880
	* config/sh/sh.opt (FPU_SINGLE_ONLY): New mask.
	(musermode): Convert to Var(TARGET_USERMODE).
	* config/sh/sh.h (SELECT_SH2A_SINGLE_ONLY, SELECT_SH4_SINGLE_ONLY,
	MASK_ARCH): Add MASK_FPU_SINGLE_ONLY.
	* config/sh/sh.c (sh_option_override): Use
	TARGET_FPU_DOUBLE || TARGET_FPU_SINGLE_ONLY for call-fp case.
	* config/sh/sh.md (udivsi3_i1, divsi3_i1): Remove ! TARGET_SH4
	condition.
	(udivsi3_i4, divsi3_i4): Use TARGET_FPU_DOUBLE condition instead of
	TARGET_SH4.
	(udivsi3_i4_single, divsi3_i4_single): Use
	TARGET_FPU_SINGLE_ONLY || TARGET_FPU_DOUBLE instead of
	TARGET_HARD_SH4.

libgcc/ChangeLog:

	PR target/49880
	* config/sh/lib1funcs.S (sdivsi3_i4, udivsi3_i4): Enable for SH2A.
	(sdivsi3, udivsi3): Remove SH4 check and always compile these
	functions.

testsuite/ChangeLog: PR target/49880 * testsuite/gcc.target/sh/pr49880-1.c: New. * testsuite/gcc.target/sh/pr49880-2.c: New. * testsuite/gcc.target/sh/pr49880-3.c: New. * testsuite/gcc.target/sh/pr49880-4.c: New. * testsuite/gcc.target/sh/pr49880-5.c: New. Index: gcc/config/sh/sh.md === --- gcc/config/sh/sh.md (revision 196589) +++ gcc/config/sh/sh.md (working copy) @@ -2154,7 +2154,7 @@ (clobber (reg:SI PR_REG)) (clobber (reg:SI R4_REG)) (use (match_operand:SI 1 "arith_reg_operand" "r"))] - "TARGET_SH1 && (! TARGET_SH4 || TARGET_DIVIDE_CALL_DIV1)" + "TARGET_SH1 && TARGET_DIVIDE_CALL_DIV1" "jsr @%1%#" [(set_attr "type" "sfunc") (set_attr "needs_delay_slot" "yes")]) @@ -2217,7 +2217,7 @@ (clobber (reg:SI R5_REG)) (use (reg:PSI FPSCR_REG)) (use (match_operand:SI 1 "arith_reg_operand" "r"))] - "TARGET_SH4 && ! TARGET_FPU_SINGLE" + "TARGET_FPU_DOUBLE && ! TARGET_FPU_SINGLE" "jsr @%1%#" [(set_attr "type" "sfunc") (set_attr "fp_mode" "double") @@ -2236,7 +2236,8 @@ (clobber (reg:SI R4_REG)) (clobber (reg:SI R5_REG)) (use (match_operand:SI 1 "arith_reg_operand" "r"))] - "(TARGET_HARD_SH4 || TARGET_SHCOMPACT) && TARGET_FPU_SINGLE" + "(TARGET_FPU_SINGLE_ONLY || TARGET_FPU_DOUBLE || TARGET_SHCOMPACT) + && TARGET_FPU_SINGLE" "jsr @%1%#" [(set_attr "type" "sfunc") (set_attr "needs_delay_slot" "yes")]) @@ -2358,7 +2359,7 @@ (clobber (reg:SI R2_REG)) (clobber (reg:SI R3_REG)) (use (match_operand:SI 1 "arith_reg_operand" "r"))] - "TARGET_SH1 && (! TARGET_SH4 || TARGET_DIVIDE_CALL_DIV1)" + "TARGET_SH1 && TARGET_DIVIDE_CALL_DIV1" "jsr @%1%#" [(set_attr "type" "sfunc") (set_attr "needs_delay_slot" "yes")]) @@ -2487,7 +2488,7 @@ (clobber (reg:DF DR2_REG)) (use (reg:PSI FPSCR_REG)) (use (match_operand:SI 1 "arith_reg_operand" "r"))] - "TARGET_SH4 && ! TARGET_FPU_SINGLE" + "TARGET_FPU_DOUBLE && ! 
TARGET_FPU_SINGLE" "jsr @%1%#" [(set_attr "type" "sfunc") (set_attr "fp_mode" "double") @@ -2501,7 +2502,8 @@ (clobber (reg:DF DR2_REG)) (clobber (reg:SI R2_REG)) (use (match_operand:SI 1 "arith_reg_operand" "r"))] - "(TARGET_HARD_SH4 || TARGET_SHCOMPACT) && TARGET_FPU_SINGLE" + "(TARGET_FPU_SINGLE_ONLY || TARGET_FPU_DOUBLE || TARGET_SHCOMPACT) + && TARGET_FPU_SINGLE" "jsr @%1%#" [(set_attr "type" "sfunc") (set_attr "needs_delay_slot" "yes")]) Index: gcc/config/sh/sh.opt === --- gcc/config/sh/sh.opt (revision 196589) +++ gcc/config/sh/sh.opt (working copy) @@ -24,6 +24,10 @@ ;; Set if the default precision of th FPU is single. Mask(FPU_SINGLE) +;; Set if the a double-precision FPU is present but is restricted to +;; single precisi
Re: [SH] PR 49880 - Fix some more -mdiv option issues
Oleg Endo wrote:
> The attached patch should make the -mdiv= option work as it is described
> in the documentation (which I updated recently as part of PR 56529).
>
> Tested with 'make all' and
>
> make -k check-gcc RUNTESTFLAGS="sh.exp=pr49880* --target_board=sh-sim
> \{-m2,-m2a,-m2a-nofpu,-m2a-single,-m2a-single-only,-m3,-m3e,-m4,-m4-single,
> -m4-single-only,-m4a,-m4a-single,-m4a-single-only}"
>
> OK for 4.8 and 4.7?

OK.

Regards,
kaz
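For context on what the -mdiv= strategies in this thread act on, here is a minimal sketch of the kind of source that would exercise them. It is an assumption about what such code looks like, not the contents of the pr49880-*.c testcases, and the function names are made up. On an SH target, how each division is implemented (for example a call into libgcc) would depend on the selected -mdiv= strategy and the FPU configuration; on any host the functions simply divide.

```c
#include <assert.h>

/* Signed 32-bit division; the kind of operation the divsi3 patterns
   in the patch are concerned with.  */
int
sdiv_sketch (int a, int b)
{
  assert (b != 0);
  return a / b;
}

/* Unsigned 32-bit division; the counterpart for the udivsi3 patterns.  */
unsigned
udiv_sketch (unsigned a, unsigned b)
{
  assert (b != 0);
  return a / b;
}
```

Compiling such a file once per multilib option from the RUNTESTFLAGS list above and checking that it links is one plausible way to catch the missing-libgcc-function problem described in the original mail.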