Re: How do I modify SSA and copy basic blocks?
On Thu, Apr 25, 2013 at 5:03 AM, Jeff Law wrote:
> On 04/24/2013 04:54 PM, Steve Ellcey wrote:
>>
>> I am still having trouble with this and with figuring out how to
>> straighten out my PHI nodes. I have decided to try a slightly different
>> tack and see if I could create a routine that would do a generic basic
>> block copy, handling all the needed bookkeeping internally and fixing
>> all the PHI nodes after the copy. I am trying to create a slow but
>> dependable and easy to use function and would consider completely
>> regenerating all the PHI information if that was the easiest thing to
>> do. Here is what I have so far:
>
> Interesting you should mention this; one of the things I really want to get
> back to is a more generic mechanism to copy block regions.

We have gimple_duplicate_sese_region for this. It may be not perfect though.
Eventually it should be changed to handle SEME regions as well and all
loop copying / versioning code should use it as well (though I don't think
any of the loop copying / versioning code handles multiple exits).

I've slowly started to move us in this direction by removing duplicate
functionality in the compiler as I come along it ...

Richard.

> Threading is really just path isolation by copying and some
> equivalency/redundancy elimination enabled by the path isolation.
>
> We're missing a lot of threading opportunities because the current method
> for copying blocks is so limited. There's finally some good literature on
> this stuff, both in terms of finding the redundancies that lead to useful
> optimization and in terms of identifying regions of blocks that need to be
> copied. All the nonsense we do needs to be reformulated using better known
> methods.
>
>>
>> /* Copy the basic block that is the destination block of orig_edge, then
>>    modify/replace the edge in orig_edge->src basic block with a new edge
>>    that goes to the new block. Fix up any PHI nodes that may need to be
>>    updated. Remove the dominance info since it may have been messed up.
>>    */
>>
>> edge
>> duplicate_succ_block (edge orig_edge)
>> {
>>   edge new_edge;
>>   basic_block orig_block, new_block;
>>
>>   initialize_original_copy_tables ();
>>   orig_block = orig_edge->dest;
>>   fprintf(stderr, "Duplicating block %d\n", orig_block->index);
>>   new_block = duplicate_block (orig_block, NULL, NULL);
>>   update_destination_phis (orig_block, new_block);
>>   new_edge = redirect_edge_and_branch (orig_edge, new_block);
>>   remove_phi_args (orig_edge);
>>   free_dominance_info (CDI_DOMINATORS);
>>   free_original_copy_tables ();
>>   return new_edge;
>> }
>>
>> When I use this to copy a block I get a failure from verify_ssa.
>
> Well, with that structure you need to update PHIs at the destination of
> every outgoing edge from new_block. That's one of the reasons you want to
> delete the control statement and dead edges in the copy -- that leaves you
> just updating the single successor of new_block.
>
> You don't mention why verify_ssa fails. I'd hazard a guess you've got a use
> not dominated by its set. It'll be important to know where the use occurs
> and where the dominating set is supposed to be. Presumably you call into
> update_ssa or whatever it's called these days before trying to verify_ssa?
>
>>
>> The block I am trying to copy (based on my original example) is:
>>
>>   :
>>   # s_1 = PHI
>>   # t_3 = PHI <0(2), t_2(7)>
>>   # c_4 = PHI
>>   if (c_4 != 0)
>>     goto ;
>>   else
>>     goto ;
>>
>> There are two edges leading here (from block 2 and block 7) and I want
>> to change the 2->8 edge to be a 2->8_prime edge where 8_prime is my new
>> basic block. That obviously affects the PHI nodes in both block 8 and
>> the new 8_prime block. I don't think any other PHI's are affected in
>> this case, but obviously, I would like my routine to work on any block I
>> want to copy even if it does affect PHI nodes in successor blocks.
>
> You have to update the PHIs in bb3 & bb9. You want to copy the PHI arg
> associated with 8->3 for the 8'->3 edge, similarly for the PHI args
> associated with the 8->9 edge to the 8'->9 edge. See copy_phi_args.
>
> jeff
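Jeff's advice about the PHIs in bb3 and bb9 can be illustrated with a small toy model. This is only a sketch in plain C and does not use GCC's internal API: `toy_phi`, `toy_add_phi_arg`, `toy_copy_phi_arg`, and the fixed-size arrays are hypothetical names invented here. Each PHI keeps one argument per incoming edge, keyed by the predecessor block index; when 8' becomes a new predecessor of a successor block, the argument that arrives over 8->3 is copied for 8'->3. In a real pass the copied argument would presumably still need SSA renaming afterwards, which this model ignores.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ARGS 8

/* Hypothetical toy PHI node: one argument per incoming edge.  */
struct toy_phi
{
  size_t nargs;
  int pred[TOY_MAX_ARGS];   /* index of the predecessor block */
  int value[TOY_MAX_ARGS];  /* SSA version arriving over that edge */
};

/* Append an argument for the edge coming from block PRED.  */
void
toy_add_phi_arg (struct toy_phi *phi, int pred, int value)
{
  assert (phi->nargs < TOY_MAX_ARGS);
  phi->pred[phi->nargs] = pred;
  phi->value[phi->nargs] = value;
  phi->nargs++;
}

/* Give PHI an argument for the new edge from NEW_PRED that mirrors the
   one it already has for the edge from OLD_PRED.  Returns 1 on success,
   0 if PHI has no argument for OLD_PRED.  */
int
toy_copy_phi_arg (struct toy_phi *phi, int old_pred, int new_pred)
{
  for (size_t i = 0; i < phi->nargs; i++)
    if (phi->pred[i] == old_pred)
      {
        toy_add_phi_arg (phi, new_pred, phi->value[i]);
        return 1;
      }
  return 0;
}
```

Calling `toy_copy_phi_arg` once per PHI in each successor of the copied block corresponds to the "update PHIs at the destination of every outgoing edge from new_block" step described above.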
Re: setjmp/longjmp: Wrong code generation
On 24/04/13 15:40, Richard Biener wrote:
> I expected we preserve edges across RTL expansion? We cannot re-create
> them optimally from scratch, but yes, re-construction is possible. Can you
> open a bugreport pointing out the missing RTL bits?

Done.

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57067

-Andreas-
[Testsuite] tree-ssa failures for targets with non 32 bit int size
I noticed that there are a bunch of testcases in gcc.dg/tree-ssa
(e.g. slsr-27.c) that assume that the size of an int is 4 bytes.

For example, slsr-27.c has

  struct x
  {
    int a[16];
    int b[16];
    int c[16];
  };

and

  void
  f (struct x *p, unsigned int n)
  {
    foo (p->a[n], p->c[n], p->b[n]);
  }

and expects a "* 4" to be present in the dump, assuming the size of an
int to be 4 bytes (n * 4 gives the array offset).

What is the right way to fix these? I saw one testcase that did

  typedef int int32_t __attribute__ ((__mode__ (__SI__)));

and used int32_t everywhere where a 32 bit int is assumed. Is this the
right way to go? Or maybe some preprocessor magic that replaces the "int"
token with one that has the SI attribute? Or should the test assertion be
branched for differing sizes of int?

Regards
Senthil
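To make the typedef approach concrete, here is a minimal sketch of how the struct from slsr-27.c might look with a mode-based 32-bit type. The type name `si_int` and the helper `offset_of_c_elem` are hypothetical and chosen only for illustration (the testcase mentioned above uses the name int32_t); the `__mode__ (__SI__)` attribute is a GNU extension. With this typedef the element size is 4 bytes on targets with 8-bit bytes, whatever the width of plain int, so the "* 4" scaling the test looks for no longer depends on sizeof (int).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name for a 32-bit integer selected by machine mode
   rather than by the target's choice of int.  */
typedef int si_int __attribute__ ((__mode__ (__SI__)));

struct x
{
  si_int a[16];
  si_int b[16];
  si_int c[16];
};

/* Byte offset of p->c[n] from the start of *p: the field offset plus
   n scaled by the element size, which is the scaling the dump shows.  */
size_t
offset_of_c_elem (unsigned int n)
{
  assert (n < 16);
  return offsetof (struct x, c) + (size_t) n * sizeof (si_int);
}
```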
Re: How do I modify SSA and copy basic blocks?
Hi,

> On Tue, 2013-04-23 at 15:24 -0600, Jeff Law wrote:
>
> > Well, you have to copy the blocks, adjust the edges and rewrite the SSA
> > graph. I'd use duplicate_block to help.
> >
> > You really want to look at tree-ssa-threadupdate.c. There's a nice big
> > block comment which gives the high level view of what needs to happen
> > when you copy a block for this kind of optimization. Feel free to
> > ignore the implementation which has to be fairly efficient when there's
> > a large number of edges to update.
> >
> > Jeff
>
> I think I understand the high level work, it is mapping that high level
> description to the low level calls that I am having trouble with.

I'd suggest looking at gimple_duplicate_sese_region in tree-cfg.c. It does
not do exactly what you need, but it deals with a somewhat similar
situation,

Zdenek
Re: [Testsuite] tree-ssa failures for targets with non 32 bit int size
On Apr 25, 2013, at 7:44 AM, Senthil Kumar Selvaraj wrote:
> What is the right way to fix these? I saw one testcase that did
>
> typedef int int32_t __attribute__ ((__mode__ (__SI__)));
>
> Is this the right way to go?

I like this. Pre-approved.
Re: How do I modify SSA and copy basic blocks?
On Thu, 2013-04-25 at 09:53 +0200, Richard Biener wrote:
> > Interesting you should mention this; one of the things I really want to get
> > back to is a more generic mechanism to copy block regions.
>
> We have gimple_duplicate_sese_region for this. It may be not perfect though.
> Eventually it should be changed to handle SEME regions as well and all
> loop copying / versioning code should use it as well (though I don't think
> any of the loop copying / versioning code handles multiple exits).
>
> I've slowly started to move us in this direction by removing duplicate
> functionality in the compiler as I come along it ...
>
> Richard.

This looks interesting. If it handled SEME regions I think I could use it
because any single block by itself is going to be an SEME region, right?
I notice the routine does not update the SSA web. Is there a reason for
that? It looks like copy_loop_headers calls update_ssa after it calls
gimple_duplicate_sese_region (the only use of gimple_duplicate_sese_region
that I see).

Unfortunately, at least some of the blocks I want to copy have multiple
exit edges where an SSA variable defined in that block is needed on each of
the exit edges from the block. Do you know what bad things will happen if
I call this with a block that is SEME instead of SESE? Is there any way
(even if it was a hack) that I could compensate for it by regenerating some
of the information, i.e. freeing the dominator information so it gets
recalculated from scratch or something like that?

Steve Ellcey
sell...@imgtec.com
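The difference between a single-exit and a multiple-exit region can be made concrete with a toy check. The sketch below is plain C over a hypothetical edge list; `toy_edge`, `toy_count_exit_edges`, and `TOY_MAX_BLOCKS` are invented for illustration and have nothing to do with GCC's CFG data structures. It counts the edges whose source lies inside a region and whose destination lies outside it; a count above one means the region has multiple exits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_BLOCKS 32

/* Hypothetical CFG edge between two block indices.  */
struct toy_edge
{
  int src;
  int dest;
};

/* Count the edges that leave the region, i.e. whose source block is in
   the region and whose destination block is not.  IN_REGION is indexed
   by block number.  */
size_t
toy_count_exit_edges (const struct toy_edge *edges, size_t nedges,
                      const bool in_region[TOY_MAX_BLOCKS])
{
  size_t count = 0;
  for (size_t i = 0; i < nedges; i++)
    {
      assert (edges[i].src >= 0 && edges[i].src < TOY_MAX_BLOCKS);
      assert (edges[i].dest >= 0 && edges[i].dest < TOY_MAX_BLOCKS);
      if (in_region[edges[i].src] && !in_region[edges[i].dest])
        count++;
    }
  return count;
}
```

For the block discussed earlier in the thread (block 8 with predecessors 2 and 7 and successors 3 and 9), a region consisting of block 8 alone has two exit edges under this count.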
Re: How do I modify SSA and copy basic blocks?
On Thu, 2013-04-25 at 09:53 +0200, Richard Biener wrote:
> We have gimple_duplicate_sese_region for this. It may be not perfect though.
> Eventually it should be changed to handle SEME regions as well and all
> loop copying / versioning code should use it as well (though I don't think
> any of the loop copying / versioning code handles multiple exits).
>
> I've slowly started to move us in this direction by removing duplicate
> functionality in the compiler as I come along it ...
>
> Richard.

One thing I have noticed with this routine is that I am trying to call
gimple_duplicate_sese_region before the various loop optimizations and
before the loop information is all set up (not sure if that is good or bad,
right now it just is). So I died when calling set_loop_copy. I put that
call and the other loop uses in an 'if (loop)' block to make that assertion
stop and I was then able to copy one (SEME) block with this routine.

When I tried to copy a second block with a second call, it died in
iterate_fix_dominators. I tried removing all the dominator information
after creating my first new block hoping it would correctly regenerate
everything before doing the second block but that didn't seem to work.

Steve Ellcey
sell...@imgtec.com
gcc-4.8-20130425 is now available
Snapshot gcc-4.8-20130425 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20130425/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch revision 198322

You'll find:

  gcc-4.8-20130425.tar.bz2    Complete GCC

  MD5=03690556f09991fbecac0467227c5d4e
  SHA1=10230732ddff38df20061d818a8bb53b0b99c3d4

Diffs from 4.8-20130418 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
good afternoon
Good afternoon. I need a function that works only with my server (Linux server, DirectAdmin panel). I was told that this requires writing a PHP extension module in C++. If so, I would like to order such a function in .so format (a PHP .so file, less than 1 KB) that I can use like a regular PHP function. I will pay the price for the order; please contact me. Thank you for your attention.
ARM, stack unwinding, and Firefox OS
I've been working on profiling tools for Firefox OS, and one of the central problems is getting stack traces for sample-based profiling. The old APCS frame pointer variant (where r11/fp heads a linked list of {fp, sp, lr, pc} frames) is convenient -- it's compatible with the Linux kernel profiler as-is, it's simple to work with in general, and in particular it's relatively easy to inject dynamically generated pseudo-frames into the profile. Of course, this doesn't work on Thumb, where the full stmdb/ldmia don't exist. But it almost works on Thumb2 -- the sp and pc can't be stored, and the sp can't be loaded, but it's possible to save {r11, r12, r14} (i.e., {fp, ip, lr}) along with the other saved registers, and obtain a frame with the fp and lr fields at the same offsets as for -mapcs-frame. This is, conveniently, enough for the Linux kernel profiler's user stack walker. It makes it possible to lose the second-last stack frame if sampled between a call and committing the new frame to r11 -- I assume this is what the saved PC is for? -- but this is the same situation that, e.g., x86 frame pointer walking is in; and this is for profiling, so full correctness isn't an absolute requirement. (At some point I should mention -mtpcs-frame, which as of GCC 4.4.3 emits a nontrivial number of instructions to put the entire {fp, sp, lr, pc} in the expected places... on Thumb1, and is silently ignored on Thumb2, and seems to have no test coverage, and seems to have bit-rotted in more recent versions.) I've attached a patch (against GCC 4.4.3, because that's what we're currently using) for comment. The option probably needs a more serious name than -mthumb2-fake-apcs-frame, and how it interacts with related options may not be ideal. But, more importantly, I'm essentially inventing a vendor-specific ABI here, and I don't know if that's the kind of thing that would be accepted. --Jed
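For readers unfamiliar with frame-pointer walking, here is a minimal sketch of the idea over a simulated stack. It is written in portable C and is not taken from the Linux kernel or from the patch; `toy_walk_frames` is a hypothetical name, the stack is modelled as an array of 32-bit words addressed by index rather than by pointer, and the layout is an assumption based on the {fp, sp, lr, pc} description above: fp indexes the saved pc slot, with the saved lr one word below it and the caller's saved fp three words below it. A saved fp of 0 is taken to end the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walk a chain of frame records in a simulated stack of NWORDS words.
   Assumed layout relative to FP (a word index):
     stack[fp - 3] = caller's fp, stack[fp - 2] = saved sp,
     stack[fp - 1] = saved lr,    stack[fp]     = saved pc.
   Stores up to MAX return addresses in OUT and returns how many were
   found.  The walk stops at a zero, out-of-range, or non-increasing
   frame pointer, since callers live at higher addresses.  */
size_t
toy_walk_frames (const uint32_t *stack, size_t nwords, uint32_t fp,
                 uint32_t *out, size_t max)
{
  size_t n = 0;

  assert (stack != NULL && out != NULL);
  while (fp != 0 && n < max)
    {
      if (fp < 3 || fp >= nwords)
        break;                    /* frame record not inside the stack */
      out[n++] = stack[fp - 1];   /* saved lr is the return address */
      uint32_t next = stack[fp - 3];
      if (next <= fp)
        break;                    /* end of chain or corrupt link */
      fp = next;
    }
  return n;
}
```

Only the fp and lr slots are read, which matches the observation above that a frame with just those two fields at the -mapcs-frame offsets is enough for this style of walker.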
Re: ARM, stack unwinding, and Firefox OS
On Thu, Apr 25, 2013 at 07:25:42PM -0700, Jed Davis wrote:
> I've attached a patch

Let's try that again

--Jed

diff --git a/gcc-4.4.3/gcc/config/arm/arm.c b/gcc-4.4.3/gcc/config/arm/arm.c
index bef07e3..ce6acf1 100644
--- a/gcc-4.4.3/gcc/config/arm/arm.c
+++ b/gcc-4.4.3/gcc/config/arm/arm.c
@@ -1381,6 +1381,21 @@ arm_override_options (void)
       target_flags &= ~MASK_APCS_FRAME;
     }
 
+  if (TARGET_THUMB2_FAKE_APCS_FRAME && !(insn_flags & FL_THUMB2))
+    {
+      warning (0, "ignoring -mthumb2-fake-apcs-frame for non-Thumb2 target");
+      target_flags &= ~MASK_THUMB2_FAKE_APCS_FRAME;
+    }
+
+  if (TARGET_THUMB2_FAKE_APCS_FRAME && TARGET_ARM)
+    {
+      target_flags &= ~MASK_THUMB2_FAKE_APCS_FRAME;
+      if (!TARGET_APCS_FRAME)
+        {
+          warning (0, "-mthumb2-fake-apcs-frame but not -mapcs-frame specified when compiling for ARM");
+        }
+    }
+
   /* Callee super interworking implies thumb interworking.  Adding
      this to the flags here simplifies the logic elsewhere.  */
   if (TARGET_THUMB && TARGET_CALLEE_INTERWORKING)
@@ -12696,6 +12711,11 @@ arm_compute_save_reg_mask (void)
   if (cfun->machine->lr_save_eliminated)
     save_reg_mask &= ~ (1 << LR_REGNUM);
 
+  if (TARGET_THUMB2_FAKE_APCS_FRAME && (save_reg_mask & (1 << LR_REGNUM)))
+    save_reg_mask |=
+      (1 << ARM_HARD_FRAME_POINTER_REGNUM)
+      | (1 << IP_REGNUM);
+
   if (TARGET_REALLY_IWMMXT
       && ((bit_count (save_reg_mask)
            + ARM_NUM_INTS (crtl->args.pretend_args_size +
@@ -14506,6 +14526,15 @@ arm_expand_prologue (void)
           RTX_FRAME_RELATED_P (insn) = 1;
         }
     }
+  else if (TARGET_THUMB2_FAKE_APCS_FRAME &&
+           (offsets->saved_regs_mask & (1 << ARM_HARD_FRAME_POINTER_REGNUM))) {
+    rtx arm_fp_rtx = gen_raw_REG (Pmode, ARM_HARD_FRAME_POINTER_REGNUM);
+
+    insn = GEN_INT (saved_regs);
+    insn = emit_insn (gen_addsi3 (arm_fp_rtx, stack_pointer_rtx, insn));
+    /* This is not "frame-related", because it doesn't set the frame
+       pointer that a debugger would use to find things.  */
+  }
 
   if (offsets->outgoing_args != offsets->saved_args + saved_regs)
     {
diff --git a/gcc-4.4.3/gcc/config/arm/arm.h b/gcc-4.4.3/gcc/config/arm/arm.h
index 1189914..d50525e 100644
--- a/gcc-4.4.3/gcc/config/arm/arm.h
+++ b/gcc-4.4.3/gcc/config/arm/arm.h
@@ -837,11 +837,12 @@ extern int arm_structure_size_boundary;
      is an easy way of ensuring that it remains valid for all \
      calls.  */ \
   if (TARGET_APCS_FRAME || TARGET_CALLER_INTERWORKING \
-      || TARGET_TPCS_FRAME || TARGET_TPCS_LEAF_FRAME) \
+      || TARGET_TPCS_FRAME || TARGET_TPCS_LEAF_FRAME \
+      || TARGET_THUMB2_FAKE_APCS_FRAME) \
     { \
       fixed_regs[ARM_HARD_FRAME_POINTER_REGNUM] = 1; \
       call_used_regs[ARM_HARD_FRAME_POINTER_REGNUM] = 1; \
-      if (TARGET_CALLER_INTERWORKING) \
+      if (TARGET_CALLER_INTERWORKING || TARGET_THUMB2_FAKE_APCS_FRAME) \
         global_regs[ARM_HARD_FRAME_POINTER_REGNUM] = 1; \
     } \
   SUBTARGET_CONDITIONAL_REGISTER_USAGE \
diff --git a/gcc-4.4.3/gcc/config/arm/arm.opt b/gcc-4.4.3/gcc/config/arm/arm.opt
index 6aca395..5c8c0c1 100644
--- a/gcc-4.4.3/gcc/config/arm/arm.opt
+++ b/gcc-4.4.3/gcc/config/arm/arm.opt
@@ -37,6 +37,10 @@ mapcs-frame
 Target Report Mask(APCS_FRAME)
 Generate APCS conformant stack frames
 
+mthumb2-fake-apcs-frame
+Target Report Mask(THUMB2_FAKE_APCS_FRAME)
+Emulate APCS conformant stack frames in Thumb2 code
+
 mapcs-reentrant
 Target Report Mask(APCS_REENT)
 Generate re-entrant, PIC code
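The change to arm_compute_save_reg_mask in the patch above boils down to a small bit-mask rule, sketched here as a standalone C function so it can be read in isolation. This is an illustration and not code from GCC: `toy_fake_apcs_save_mask` and the `TOY_*` constants are hypothetical names, and the register numbers (r11 for fp, r12 for ip, r14 for lr) follow the description in the earlier message. The rule is that whenever lr is going to be saved and the option is on, fp and ip are added to the set of saved registers.

```c
#include <assert.h>
#include <stdint.h>

/* ARM core register numbers used by this sketch.  */
enum
{
  TOY_FP_REGNUM = 11,
  TOY_IP_REGNUM = 12,
  TOY_LR_REGNUM = 14
};

/* Return SAVE_REG_MASK with fp and ip added when FAKE_APCS_FRAME is
   nonzero and lr is already in the mask; otherwise return the mask
   unchanged.  Bit N of the mask stands for register rN.  */
uint32_t
toy_fake_apcs_save_mask (uint32_t save_reg_mask, int fake_apcs_frame)
{
  assert ((save_reg_mask >> 16) == 0);   /* only r0-r15 are meaningful */
  if (fake_apcs_frame
      && (save_reg_mask & (UINT32_C (1) << TOY_LR_REGNUM)))
    save_reg_mask |= (UINT32_C (1) << TOY_FP_REGNUM)
                     | (UINT32_C (1) << TOY_IP_REGNUM);
  return save_reg_mask;
}
```

Leaf functions that never save lr keep their original mask under this rule, so they get no frame record at all.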