Re: How do I modify SSA and copy basic blocks?

2013-04-25 Thread Richard Biener
On Thu, Apr 25, 2013 at 5:03 AM, Jeff Law wrote:
> On 04/24/2013 04:54 PM, Steve Ellcey wrote:
>>
>>
>> I am still having trouble with this and with figuring out how to
>> straighten out my PHI nodes.  I have decided to try a slightly different
>> tack and see if I could create a routine that would do a generic basic
>> block copy, handling all the needed bookkeeping internally and fixing
>> all the PHI nodes after the copy.  I am trying to create a slow but
>> dependable and easy to use function and would consider completely
>> regenerating all the PHI information if that was the easiest thing to
>> do.  Here is what I have so far:
>
> Interesting you should mention this; one of the things I really want to get
> back to is a more generic mechanism to copy block regions.

We have gimple_duplicate_sese_region for this.  It may be not perfect though.
Eventually it should be changed to handle SEME regions as well and all
loop copying / versioning code should use it as well (though I don't think
any of the loop copying / versioning code handles multiple exits).

I've slowly started to move us in this direction by removing duplicate
functionality in the compiler as I come across it ...

Richard.

> Threading is really just path isolation by copying and some
> equivalency/redundancy elimination enabled by the path isolation.
>
> We're missing a lot of threading opportunities because the current method
> for copying blocks is so limited.  There's finally some good literature on
> this stuff, both in terms of finding the redundancies that lead to useful
> optimization and in terms of identifying regions of blocks that need to be
> copied.  All the nonsense we do needs to be reformulated using better known
> methods.
>
>
>>
>>
>> /* Copy the basic block that is the destination block of orig_edge, then
>> modify/replace the edge in orig_edge->src basic block with a new edge
>> that goes to the new block.  Fix up any PHI nodes that may need to be
>> updated.  Remove the dominance info since it may have been messed up.
>> */
>>
>> edge
>> duplicate_succ_block (edge orig_edge)
>> {
>>   edge new_edge;
>>   basic_block orig_block, new_block;
>>
>>   initialize_original_copy_tables ();
>>   orig_block = orig_edge->dest;
>>   fprintf (stderr, "Duplicating block %d\n", orig_block->index);
>>   new_block = duplicate_block (orig_block, NULL, NULL);
>>   update_destination_phis (orig_block, new_block);
>>   new_edge = redirect_edge_and_branch (orig_edge, new_block);
>>   remove_phi_args (orig_edge);
>>   free_dominance_info (CDI_DOMINATORS);
>>   free_original_copy_tables ();
>>   return new_edge;
>> }
>>
>> When I use this to copy a block I get a failure from verify_ssa.
>
> Well, with that structure you need to update PHIs at the destination of
> every outgoing edge from new_block.  That's one of the reasons you want to
> delete the control statement and dead edges in the copy -- that leaves you
> just updating the single successor of new_block.
>
> You don't mention why verify_ssa fails.  I'd hazard a guess you've got a use
> not dominated by its set.   It'll be important to know where the use occurs
> and where the dominating set is supposed to be.  Presumably you call into
> update_ssa or whatever it's called these days before trying to verify_ssa?
>
>
>
>>
>> The block I am trying to copy (based on my original example) is:
>>
>>:
>># s_1 = PHI 
>># t_3 = PHI <0(2), t_2(7)>
>># c_4 = PHI 
>>if (c_4 != 0)
>>  goto ;
>>else
>>  goto ;
>>
>> There are two edges leading here (from block 2 and block 7) and I want
>> to change the 2->8 edge to be a 2->8_prime edge where 8_prime is my new
>> basic block.  That obviously affects the PHI nodes in both block 8 and
>> the new 8_prime block.  I don't think any other PHI's are affected in
>> this case, but obviously, I would like my routine to work on any block I
>> want to copy even if it does affect PHI nodes in successor blocks.
>
> You have to update the PHIs in bb3 & bb9.  You want to copy the PHI arg
> associated with 8->3 for the 8'->3 edge, and similarly copy the PHI args
> associated with the 8->9 edge to the 8'->9 edge.  See copy_phi_args.
>
> jeff
>
>
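The successor-PHI update Jeff describes can be illustrated outside GCC with a toy model. This is only a sketch: `struct toy_phi` and the `toy_phi_*` helpers are hypothetical names invented here, not GCC's data structures or API, and SSA renaming of values defined inside the copied block is ignored entirely. It shows only the bookkeeping: a PHI in a successor gains an argument for the new 8'->succ edge equal to the one on the 8->succ edge, and a PHI loses the argument of an edge that has been redirected away.

```c
#include <assert.h>

enum { TOY_MAX_PHI_ARGS = 8 };

/* Toy PHI node: one (predecessor block, value id) pair per incoming
   edge.  Hypothetical model, not GCC's gimple PHI representation.  */
struct toy_phi
{
  int n_args;
  int pred[TOY_MAX_PHI_ARGS];   /* index of the predecessor block */
  int value[TOY_MAX_PHI_ARGS];  /* id of the value flowing in on that edge */
};

/* Return the argument slot for predecessor PRED, or -1 if none.  */
static int
toy_phi_find (const struct toy_phi *phi, int pred)
{
  for (int i = 0; i < phi->n_args; i++)
    if (phi->pred[i] == pred)
      return i;
  return -1;
}

/* Block ORIG was duplicated as COPY.  Give PHI (which lives in a
   successor of ORIG) an argument for the new edge from COPY, using
   the same value as the edge from ORIG.  Return 0 on success, -1 if
   ORIG is not a predecessor or the PHI is full.  */
static int
toy_phi_copy_arg (struct toy_phi *phi, int orig, int copy)
{
  int slot = toy_phi_find (phi, orig);
  if (slot < 0 || phi->n_args == TOY_MAX_PHI_ARGS)
    return -1;
  phi->pred[phi->n_args] = copy;
  phi->value[phi->n_args] = phi->value[slot];
  phi->n_args++;
  return 0;
}

/* The edge from PRED no longer reaches this PHI's block: drop its
   argument.  Return 0 on success, -1 if PRED had no argument.  */
static int
toy_phi_remove_arg (struct toy_phi *phi, int pred)
{
  int slot = toy_phi_find (phi, pred);
  if (slot < 0)
    return -1;
  for (int i = slot; i + 1 < phi->n_args; i++)
    {
      phi->pred[i] = phi->pred[i + 1];
      phi->value[i] = phi->value[i + 1];
    }
  phi->n_args--;
  return 0;
}
```

In the example from this thread, `toy_phi_copy_arg` would be applied to every PHI in bb3 and bb9 with `orig` = 8 and `copy` = the index of 8', and `toy_phi_remove_arg` to every PHI in bb8 for the redirected edge from block 2.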


Re: setjmp/longjmp: Wrong code generation

2013-04-25 Thread Andreas Krebbel
On 24/04/13 15:40, Richard Biener wrote:

> I expected we preserve edges across RTL expansion?  We cannot re-create
> them optimally from scratch, but yes, re-construction is possible.  Can you
> open a bugreport pointing out the missing RTL bits?

Done.

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57067

-Andreas-



[Testsuite] tree-ssa failures for targets with non 32 bit int size

2013-04-25 Thread Senthil Kumar Selvaraj
I noticed that there are a number of testcases in gcc.dg/tree-ssa
(slsr-27.c, for example) that assume that the size of an int is 4
bytes. For example, slsr-27.c has

struct x
{
  int a[16];
  int b[16];
  int c[16];
 };

and 

void
f (struct x *p, unsigned int n)
{
  foo (p->a[n], p->c[n], p->b[n]);
}

and expects a "* 4" to be present in the dump, assuming the size of an int to 
be 4 bytes (n * 4 gives the array offset).

What is the right way to fix these? I saw one testcase that did

 typedef int int32_t __attribute__ ((__mode__ (__SI__)));

and used int32_t everywhere where a 32 bit int is assumed. Is this the right
way to go? Or maybe some preprocessor magic that replaces the "int" token with
one that has the SI attribute? Or should the test assertion be branched for 
differing sizes of int?

Regards
Senthil
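A minimal sketch of the typedef approach asked about above, assuming a GCC-compatible compiler that accepts the `__mode__` attribute. The typedef name `si_int` and the helper `offset_of_c` are hypothetical, chosen here so the snippet cannot clash with `<stdint.h>`; the struct mirrors the one quoted from slsr-27.c. With the SI-mode type, the element size (and so the "* 4" scaling the test looks for) no longer depends on the target's plain `int`.

```c
#include <assert.h>
#include <stddef.h>

/* SI mode is a 4-byte integer on targets with 8-bit bytes, whatever
   size plain "int" has.  "si_int" is a hypothetical name.  */
typedef int si_int __attribute__ ((__mode__ (__SI__)));

struct x
{
  si_int a[16];
  si_int b[16];
  si_int c[16];
};

/* Byte offset of p->c[n] from p: the index is scaled by the element
   size, which is the multiplication the dump scan expects to see.  */
static size_t
offset_of_c (size_t n)
{
  return offsetof (struct x, c) + n * sizeof (si_int);
}
```

Whether the dump still shows exactly the expected pattern on a 16-bit-int target would need to be checked against the real testcase; this only demonstrates that the element size is pinned to 4 bytes.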



Re: How do I modify SSA and copy basic blocks?

2013-04-25 Thread Zdenek Dvorak
Hi,

> On Tue, 2013-04-23 at 15:24 -0600, Jeff Law wrote:
> 
> > Well, you have to copy the blocks, adjust the edges and rewrite the SSA 
> > graph.  I'd use duplicate_block to help.
> > 
> > You really want to look at tree-ssa-threadupdate.c.  There's a nice big 
> > block comment which gives the high level view of what needs to happen 
> > when you copy a block for this kind of optimization.  Feel free to 
> > ignore the implementation which has to be fairly efficient when there's 
> > a large number of edges to update.
> > 
> > Jeff
> 
> I think I understand the high-level work; it is mapping that high-level
> description to the low-level calls that I am having trouble with.

I'd suggest looking at gimple_duplicate_sese_region in tree-cfg.c.  It does
not do exactly what you need, but it deals with a somewhat similar situation.

Zdenek


Re: [Testsuite] tree-ssa failures for targets with non 32 bit int size

2013-04-25 Thread Mike Stump
On Apr 25, 2013, at 7:44 AM, Senthil Kumar Selvaraj wrote:
> What is the right way to fix these? I saw one testcase that did
> 
> typedef int int32_t __attribute__ ((__mode__ (__SI__)));
> 
> Is this the right way to go?

I like this.  Pre-approved.

Re: How do I modify SSA and copy basic blocks?

2013-04-25 Thread Steve Ellcey
On Thu, 2013-04-25 at 09:53 +0200, Richard Biener wrote:

> > Interesting you should mention this; one of the things I really want to get
> > back to is a more generic mechanism to copy block regions.
> 
> We have gimple_duplicate_sese_region for this.  It may be not perfect though.
> Eventually it should be changed to handle SEME regions as well and all
> loop copying / versioning code should use it as well (though I don't think
> any of the loop copying / versioning code handles multiple exits).
> 
> I've slowly started to move us in this direction by removing duplicate
> functionality in the compiler as I come across it ...
> 
> Richard.

This looks interesting.  If it handled SEME regions I think I could use
it because any single block by itself is going to be an SEME region,
right?  I notice the routine does not update the SSA web.  Is there a
reason for that?  It looks like copy_loop_headers calls update_ssa after
it calls gimple_duplicate_sese_region (the only use of
gimple_duplicate_sese_region that I see).  Unfortunately, at least some
of the blocks I want to copy have multiple exit edges where an SSA
variable defined in that block is needed on each of the exit edges from
the block.  Do you know what bad things will happen if I call this with
a block that is SEME instead of SESE?  Is there any way (even if it was a
hack) that I could compensate for it by regenerating some of the
information, i.e. freeing the dominator information so it gets
recalculated from scratch or something like that?

Steve Ellcey
sell...@imgtec.com




Re: How do I modify SSA and copy basic blocks?

2013-04-25 Thread Steve Ellcey
On Thu, 2013-04-25 at 09:53 +0200, Richard Biener wrote:

> We have gimple_duplicate_sese_region for this.  It may be not perfect though.
> Eventually it should be changed to handle SEME regions as well and all
> loop copying / versioning code should use it as well (though I don't think
> any of the loop copying / versioning code handles multiple exits).
> 
> I've slowly started to move us in this direction by removing duplicate
> functionality in the compiler as I come across it ...
> 
> Richard.

One thing I have noticed with this routine is that I am trying to call
gimple_duplicate_sese_region before the various loop optimizations and
before the loop information is all set up (not sure if that is good or
bad, right now it just is).  So I died when calling set_loop_copy.  I
put that call and the other loop uses in an 'if (loop)' block to make
that assertion stop and I was then able to copy one (SEME) block with
this routine.  When I tried to copy a second block with a second call,
it died in iterate_fix_dominators.  I tried removing all the dominator
information after creating my first new block hoping it would correctly
regenerate everything before doing the second block but that didn't seem
to work.

Steve Ellcey
sell...@imgtec.com




gcc-4.8-20130425 is now available

2013-04-25 Thread gccadmin
Snapshot gcc-4.8-20130425 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20130425/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch 
revision 198322

You'll find:

 gcc-4.8-20130425.tar.bz2 Complete GCC

  MD5=03690556f09991fbecac0467227c5d4e
  SHA1=10230732ddff38df20061d818a8bb53b0b99c3d4

Diffs from 4.8-20130418 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


ARM, stack unwinding, and Firefox OS

2013-04-25 Thread Jed Davis
I've been working on profiling tools for Firefox OS, and one of the
central problems is getting stack traces for sample-based profiling.
The old APCS frame pointer variant (where r11/fp heads a linked list
of {fp, sp, lr, pc} frames) is convenient -- it's compatible with the
Linux kernel profiler as-is, it's simple to work with in general, and
in particular it's relatively easy to inject dynamically generated
pseudo-frames into the profile.

Of course, this doesn't work on Thumb, where the full stmdb/ldmia don't
exist.  But it almost works on Thumb2 -- the sp and pc can't be stored,
and the sp can't be loaded, but it's possible to save {r11, r12, r14}
(i.e., {fp, ip, lr}) along with the other saved registers, and obtain a
frame with the fp and lr fields at the same offsets as for -mapcs-frame.

This is, conveniently, enough for the Linux kernel profiler's user stack
walker.  It makes it possible to lose the second-last stack frame if
sampled between a call and committing the new frame to r11 -- I assume
this is what the saved PC is for? -- but this is the same situation
that, e.g., x86 frame pointer walking is in; and this is for profiling,
so full correctness isn't an absolute requirement.

(At some point I should mention -mtpcs-frame, which as of GCC 4.4.3
emits a nontrivial number of instructions to put the entire {fp, sp,
lr, pc} in the expected places... on Thumb1, and is silently ignored on
Thumb2, and seems to have no test coverage, and seems to have bit-rotted
in more recent versions.)

I've attached a patch (against GCC 4.4.3, because that's what we're
currently using) for comment.  The option probably needs a more serious
name than -mthumb2-fake-apcs-frame, and how it interacts with related
options may not be ideal.  But, more importantly, I'm essentially
inventing a vendor-specific ABI here, and I don't know if that's the
kind of thing that would be accepted.

--Jed
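The frame-pointer walk described above can be sketched as a simple linked-list traversal. This is a toy model, not the Linux kernel's walker or the real APCS memory layout: `struct toy_frame` and `toy_walk_frames` are hypothetical names, and the frame pointer is assumed to point at the start of the record, which glosses over the actual offset between r11 and the saved words. The walker only needs the `fp` and `lr` fields, which is why a Thumb2 frame that keeps those two at the same offsets is enough for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy frame record in the {fp, sp, lr, pc} order described above.
   Hypothetical model: FP points at the start of the record here.  */
struct toy_frame
{
  const struct toy_frame *fp;  /* caller's record, NULL at the outermost frame */
  uintptr_t sp;                /* saved stack pointer (unused by the walker) */
  uintptr_t lr;                /* return address into the caller */
  uintptr_t pc;                /* saved pc (unused by the walker) */
};

/* Follow the chain starting at FP, storing each frame's return
   address in OUT.  Stops at a NULL link or after MAX_DEPTH frames,
   so a corrupt or cyclic chain cannot loop forever.  Returns the
   number of entries written.  */
static size_t
toy_walk_frames (const struct toy_frame *fp, uintptr_t *out, size_t max_depth)
{
  size_t depth = 0;
  while (fp != NULL && depth < max_depth)
    {
      out[depth++] = fp->lr;
      fp = fp->fp;
    }
  return depth;
}
```

A real sampling profiler would also have to validate each pointer before dereferencing it, since the walk runs against another context's stack; the depth limit here is only the simplest of those safeguards.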



Re: ARM, stack unwinding, and Firefox OS

2013-04-25 Thread Jed Davis
On Thu, Apr 25, 2013 at 07:25:42PM -0700, Jed Davis wrote:
> I've attached a patch

Let's try that again

--Jed

diff --git a/gcc-4.4.3/gcc/config/arm/arm.c b/gcc-4.4.3/gcc/config/arm/arm.c
index bef07e3..ce6acf1 100644
--- a/gcc-4.4.3/gcc/config/arm/arm.c
+++ b/gcc-4.4.3/gcc/config/arm/arm.c
@@ -1381,6 +1381,21 @@ arm_override_options (void)
   target_flags &= ~MASK_APCS_FRAME;
 }
 
+  if (TARGET_THUMB2_FAKE_APCS_FRAME && !(insn_flags & FL_THUMB2))
+{
+  warning (0, "ignoring -mthumb2-fake-apcs-frame for non-Thumb2 target");
+  target_flags &= ~MASK_THUMB2_FAKE_APCS_FRAME;
+}
+
+  if (TARGET_THUMB2_FAKE_APCS_FRAME && TARGET_ARM)
+{
+  target_flags &= ~MASK_THUMB2_FAKE_APCS_FRAME;
+  if (!TARGET_APCS_FRAME)
+	{
+	  warning (0, "-mthumb2-fake-apcs-frame but not -mapcs-frame specified when compiling for ARM");
+	}
+}
+
   /* Callee super interworking implies thumb interworking.  Adding
  this to the flags here simplifies the logic elsewhere.  */
   if (TARGET_THUMB && TARGET_CALLEE_INTERWORKING)
@@ -12696,6 +12711,11 @@ arm_compute_save_reg_mask (void)
   if (cfun->machine->lr_save_eliminated)
 save_reg_mask &= ~ (1 << LR_REGNUM);
 
+  if (TARGET_THUMB2_FAKE_APCS_FRAME && (save_reg_mask & (1 << LR_REGNUM)))
+save_reg_mask |=
+  (1 << ARM_HARD_FRAME_POINTER_REGNUM)
+  | (1 << IP_REGNUM);
+
   if (TARGET_REALLY_IWMMXT
   && ((bit_count (save_reg_mask)
 	   + ARM_NUM_INTS (crtl->args.pretend_args_size +
@@ -14506,6 +14526,15 @@ arm_expand_prologue (void)
 	  RTX_FRAME_RELATED_P (insn) = 1;
 	}
 }
+  else if (TARGET_THUMB2_FAKE_APCS_FRAME &&
+	   (offsets->saved_regs_mask & (1 << ARM_HARD_FRAME_POINTER_REGNUM))) {
+rtx arm_fp_rtx = gen_raw_REG (Pmode, ARM_HARD_FRAME_POINTER_REGNUM);
+
+insn = GEN_INT (saved_regs);
+insn = emit_insn (gen_addsi3 (arm_fp_rtx, stack_pointer_rtx, insn));
+/* This is not "frame-related", because it doesn't set the frame
+   pointer that a debugger would use to find things. */
+  }
 
   if (offsets->outgoing_args != offsets->saved_args + saved_regs)
 {
diff --git a/gcc-4.4.3/gcc/config/arm/arm.h b/gcc-4.4.3/gcc/config/arm/arm.h
index 1189914..d50525e 100644
--- a/gcc-4.4.3/gcc/config/arm/arm.h
+++ b/gcc-4.4.3/gcc/config/arm/arm.h
@@ -837,11 +837,12 @@ extern int arm_structure_size_boundary;
  is an easy way of ensuring that it remains valid for all	\
  calls.  */			\
   if (TARGET_APCS_FRAME || TARGET_CALLER_INTERWORKING		\
-  || TARGET_TPCS_FRAME || TARGET_TPCS_LEAF_FRAME)		\
+  || TARGET_TPCS_FRAME || TARGET_TPCS_LEAF_FRAME		\
+  || TARGET_THUMB2_FAKE_APCS_FRAME)\
 {\
   fixed_regs[ARM_HARD_FRAME_POINTER_REGNUM] = 1;		\
   call_used_regs[ARM_HARD_FRAME_POINTER_REGNUM] = 1;	\
-  if (TARGET_CALLER_INTERWORKING)\
+  if (TARGET_CALLER_INTERWORKING || TARGET_THUMB2_FAKE_APCS_FRAME) \
 	global_regs[ARM_HARD_FRAME_POINTER_REGNUM] = 1;		\
 }\
   SUBTARGET_CONDITIONAL_REGISTER_USAGE\
diff --git a/gcc-4.4.3/gcc/config/arm/arm.opt b/gcc-4.4.3/gcc/config/arm/arm.opt
index 6aca395..5c8c0c1 100644
--- a/gcc-4.4.3/gcc/config/arm/arm.opt
+++ b/gcc-4.4.3/gcc/config/arm/arm.opt
@@ -37,6 +37,10 @@ mapcs-frame
 Target Report Mask(APCS_FRAME)
 Generate APCS conformant stack frames
 
+mthumb2-fake-apcs-frame
+Target Report Mask(THUMB2_FAKE_APCS_FRAME)
+Emulate APCS conformant stack frames in Thumb2 code
+
 mapcs-reentrant
 Target Report Mask(APCS_REENT)
 Generate re-entrant, PIC code