GCC 5.2.0 Status Report (2015-07-07), branch frozen

2015-07-07 Thread Richard Biener

The GCC 5 branch is now frozen for the release of GCC 5.2; all changes
require release manager approval from now on.

I will shortly announce a first release candidate for GCC 5.2.


Previous Report
===

https://gcc.gnu.org/ml/gcc/2015-06/msg00202.html


[AD] UltraGDB, an alternative tool to debug GCC, GDB, LLDB, etc. on Windows and Linux

2015-07-07 Thread Xu,Chiheng
UltraGDB is a GDB GUI front-end, as well as a lightweight C/C++ IDE
based on industry standard Eclipse technology.

Visit http://www.ultragdb.com/ to learn more.

Visit https://www.youtube.com/channel/UCr7F3ZZ_hgpxYXiMOA27hhw for demos.

This is an ad. Sorry if you receive multiple copies of this message.


GCC 5.2 Release Candidate available from gcc.gnu.org

2015-07-07 Thread Richard Biener

The first release candidate for GCC 5.2 is available from

 ftp://gcc.gnu.org/pub/gcc/snapshots/5.2.0-RC-20150707

and shortly its mirrors.  It has been generated from SVN revision 225500.

I have so far bootstrapped the release candidate on
{i586,ia64,ppc,ppc64,x86_64,aarch64}-suse-linux-gnu.

Please test the release candidate and report any issues to bugzilla.

If all goes well I'd like to release GCC 5.2 in the middle of next week.


Re: Allocation of hotness of data structure with respect to the top of stack.

2015-07-07 Thread Oleg Endo

On 07 Jul 2015, at 04:49, Jeff Law  wrote:

> On 07/05/2015 05:11 AM, Ajit Kumar Agarwal wrote:
>> All:
>> 
>> I am wondering whether allocating hot data structures closer to the top
>> of the stack increases the performance of the application. The data
>> structures are classified as hot or cold, sorted in decreasing order of
>> hotness, and the hot data structures are allocated closer to the top of
>> the stack.
>> 
>> Loads and stores to stack-allocated data structures will be faster when
>> the hot data structures are allocated closer to the top of the stack.
>> 
>> Based on the above, the loads and stores are generated with the correct
>> stack offsets, with the data allocated in decreasing order of hotness.
> You might want to look at this paper from an old gcc summit conference.  
> Basically they were trying to reorder stack slots to minimize offsets in 
> reg+d addressing for the SH port.  It should touch on a number of common 
> issues/goals.
> 
> 
> ftp://gcc.gnu.org/pub/gcc/summit/2003/Optimal%20Stack%20Slot%20Assignment.pdf
> 
> 
> I can't recall if they ever tried to submit that work for inclusion.

Ah, inverse-AMS so to say :)
It might be interesting to combine forward and inverse AMS.  In the current AMS 
GSoC work we're hitting some cases which need mem access reordering in order to 
pick cheaper address modes.  It's not there yet, but if it knows how to reorder 
mem accesses in the insn stream it could probably be extended to try reordering 
memory layout of variables.
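
A minimal C sketch of the slot-ordering idea from the quoted message, under
stated assumptions: `struct stack_slot`, `assign_offsets` and the access
counts are hypothetical names and numbers for illustration, not anything
taken from GCC or from the summit paper.  Alignment and the real frame
layout are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical description of one stack-allocated object.  */
struct stack_slot
{
  const char *name;
  unsigned long accesses;  /* estimated access count ("hotness") */
  size_t size;             /* size in bytes */
  size_t offset;           /* assigned offset from the stack pointer */
};

/* qsort comparator: hotter slots sort first.  */
static int
hotter_first (const void *a, const void *b)
{
  const struct stack_slot *x = a;
  const struct stack_slot *y = b;

  if (x->accesses != y->accesses)
    return x->accesses > y->accesses ? -1 : 1;
  return 0;
}

/* Sort by decreasing hotness and hand out increasing offsets, so the
   hottest objects end up with the smallest displacements.  */
void
assign_offsets (struct stack_slot *slots, size_t n)
{
  size_t off = 0;

  qsort (slots, n, sizeof *slots, hotter_first);
  for (size_t i = 0; i < n; i++)
    {
      slots[i].offset = off;
      off += slots[i].size;
    }
}

/* Usage example with made-up access counts.  */
void
assign_offsets_example (void)
{
  struct stack_slot s[3] = {
    { "cold", 1, 64, 0 }, { "hot", 100, 8, 0 }, { "warm", 10, 16, 0 }
  };

  assign_offsets (s, 3);
  assert (s[0].accesses == 100 && s[0].offset == 0);
  assert (s[1].accesses == 10 && s[1].offset == 8);
  assert (s[2].accesses == 1 && s[2].offset == 24);
}
```

Whether small displacements are actually cheaper depends on the target's
address modes, which is the part AMS would have to model.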

Cheers,
Oleg 

Re: Does GCC generate LDRD/STRD (Register) forms?

2015-07-07 Thread Oleg Endo

On 07 Jul 2015, at 13:52, Bin.Cheng  wrote:

> On Tue, Jul 7, 2015 at 10:05 AM, Anmol Paralkar (anmparal)
>  wrote:
>> Hello,
>> 
>> Does GCC generate LDRD/STRD (Register) forms [A8.8.74/A8.8.211 per ARMv7-A
>> & ARMv7-R ARM]?
>> 
>> Based on various attempts to write code to get GCC to generate a sample
>> form, and subsequently inspecting the code I see in
>> config/arm/arm.c/output_move_double () & arm.md [GCC 4.9.2], I think that
>> these register based forms of LDRD/STRD are
>> not generated, but I thought it might be a good idea to ask on the list,
>> just in case.
> Register based LDRD is harder than immediate version.  ARM doesn't
> support [base + reg + offset] addressing mode, so address computation
> of the second memory reference is scattered both in and out of memory
> reference.  To identify such opportunities, one needs to trace the
> registers in the address expression of the memory access instruction and
> do some kind of value computation and re-association.

Basically, this is what we're trying to do with AMS.  For each mem access it 
tries to trace the reg values and figure out the effective address expression.  
For now we've limited it to the form 'base_reg + index_reg*scale + 
const_displacement'.  Then we try to see how to fit the address expressions to 
the available address modes.

It's still work in progress but already shows some improvements.
A classic SH4 example:

float fun (float* x)
{
  return x[0] + x[1] + x[2] + x[3];
}

no AMS:
mov     r4,r1
add     #4,r1
fmov.s  @r4,fr0
fmov.s  @r1,fr1
mov     r4,r1
add     #8,r1
fadd    fr1,fr0
fmov.s  @r1,fr1
add     #12,r4
fadd    fr1,fr0
fmov.s  @r4,fr1
rts
fadd    fr1,fr0

AMS:
fmov.s  @r4+,fr0
fmov.s  @r4+,fr1
fadd    fr1,fr0
fmov.s  @r4+,fr1
fadd    fr1,fr0
fmov.s  @r4,fr1
rts
fadd    fr1,fr0

If I understand correctly, ARM's LDRD/STRD are similar to SH's FPU 2x32 pair 
loads/stores.  It needs the mem access insns of adjacent addresses to be 
adjacent in the insn stream.  We'll try to do some mem access reordering in 
AMS, mainly to improve post/pre inc/dec address mode utilization.  Afterwards, 
adjacent mem accesses can be fused together in a separate RTL pass or AMS 
sub-pass to avoid re-discovering mem access sequence information, which AMS 
already has.
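
To make the address-mode fitting step a bit more concrete, here is a rough
C sketch.  It is not the AMS implementation: `struct mem_access`,
`enum addr_mode` and `pick_modes` are hypothetical names, and it assumes
that all accesses use the same base register and appear in program order.
It only decides when a post-increment mode fits, as in the x[0]..x[3]
example above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one access: address = base register + disp.  */
struct mem_access
{
  long disp;  /* constant displacement in bytes */
  int size;   /* access size in bytes */
};

enum addr_mode
{
  MODE_REG_INDIRECT,  /* @Rn      */
  MODE_POST_INC,      /* @Rn+     */
  MODE_REG_DISP       /* @(d,Rn) or a separate add, depending on the target */
};

/* Pick a mode for each access while tracking how far the base register
   has been advanced by earlier post-increments.  */
void
pick_modes (const struct mem_access *acc, size_t n, enum addr_mode *out)
{
  long cur = 0;  /* offset of the base register from its value on entry */

  for (size_t i = 0; i < n; i++)
    {
      if (acc[i].disp != cur)
        out[i] = MODE_REG_DISP;   /* base does not point at this access */
      else if (i + 1 < n && acc[i + 1].disp == cur + acc[i].size)
        {
          /* The next access starts right after this one.  */
          out[i] = MODE_POST_INC;
          cur += acc[i].size;
        }
      else
        out[i] = MODE_REG_INDIRECT;
    }
}

/* Usage example: four adjacent 4-byte loads, like x[0] + ... + x[3].  */
void
pick_modes_example (void)
{
  const struct mem_access acc[4] = { { 0, 4 }, { 4, 4 }, { 8, 4 }, { 12, 4 } };
  enum addr_mode m[4];

  pick_modes (acc, 4, m);
  assert (m[0] == MODE_POST_INC && m[1] == MODE_POST_INC);
  assert (m[2] == MODE_POST_INC && m[3] == MODE_REG_INDIRECT);
}
```

A real pass would also have to weigh costs and check whether the base
register value is still needed afterwards; the sketch leaves all of that
out.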

Cheers,
Oleg

Re: Uninitialized registers handling in the REE pass

2015-07-07 Thread Jeff Law

On 07/06/2015 09:42 AM, Pierre-Marie de Rodat wrote:

> Hello,
>
> The attached reproducer[1] seems to trigger a code generation issue at
> least on x86_64-linux:
>
>  $ gnatmake -q p -O3 -gnatn
>  $ ./p
>
>  raised PROGRAM_ERROR : p.adb:9 explicit raise

Can you please file this as a bug in bugzilla so that it can get tracked?

http://gcc.gnu.org/bugzilla

jeff


Re: Uninitialized registers handling in the REE pass

2015-07-07 Thread Pierre-Marie de Rodat

On 07/07/2015 05:02 PM, Jeff Law wrote:

> Can you please file this as a bug in bugzilla so that it can get tracked?
>
> http://gcc.gnu.org/bugzilla


Sure, it's there: .

--
Pierre-Marie de Rodat


Re: rl78 vs cse vs memory_address_addr_space

2015-07-07 Thread Segher Boessenkool
On Mon, Jul 06, 2015 at 04:58:36PM -0500, Segher Boessenkool wrote:
> On Mon, Jul 06, 2015 at 04:45:35PM -0400, DJ Delorie wrote:
> > Combine gets as far as this:
> > 
> > Trying 5 -> 9:
> > Failed to match this instruction:
> > (parallel [
> > (set (mem/v/j:QI (const_int 240 [0xf0]) [0 MEM[(volatile union 
> > un_per0 *)240B].BIT.no4+0 S1 A16])
> > (ior:QI (mem/v/j:QI (const_int 240 [0xf0]) [0 MEM[(volatile 
> > union un_per0 *)240B].BIT.no4+0 S1 A16])
> > (const_int 16 [0x10])))
> > (set (reg/f:HI 43)
> > (const_int 240 [0xf0]))
> > ])
> > 
> > (the set is left behind because it's used for the second assignment)
> > 
> > Both of those insns in the parallel are valid rl78 insns.  I tried
> > adding that parallel as a define-and-split but combine doesn't split
> > it at the point where it inserts it, so it doesn't work right.  If it
> > reduced those four instructions to the two in the parallel, but
> > without the parallel, it would probably work too.
> 
> Did you try just a define_split instead?  Ugly, but it should work I think.

So I built a rl78-elf cross and tried the example (s/union __BITS9/__BITS8/).
I couldn't get your exact code, I guess I need some special options?
But before combine it looks the same.

There is no combination of four instructions, the only combinations
combine makes here are 2->1 combinations, and they all succeed, except
that last one.  Since combine won't do 2->2 no split will work.  And,
as far as combine is concerned, a define_insn_and_split is a define_insn,
so that indeed won't help you either.

What you need is some form of uncse, like the attached hack (which doesn't
work in this case because the rtx costs prohibit it).


Segher


diff --git a/gcc/combine.c b/gcc/combine.c
index b97aa10..20d526f 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -3928,8 +3938,10 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
   && XVECLEN (newpat, 0) == 2
   && GET_CODE (XVECEXP (newpat, 0, 0)) == SET
   && GET_CODE (XVECEXP (newpat, 0, 1)) == SET
-  && (i1 || set_noop_p (XVECEXP (newpat, 0, 0))
- || set_noop_p (XVECEXP (newpat, 0, 1)))
+  && (1
+  || i1
+  || set_noop_p (XVECEXP (newpat, 0, 0))
+  || set_noop_p (XVECEXP (newpat, 0, 1)))
   && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
   && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
   && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT


Can shrink-wrapping ever move prologue past an ASM statement?

2015-07-07 Thread Martin Jambor
Hi,

I've been asked to look into the item one of
http://permalink.gmane.org/gmane.linux.kernel/1990397 and found out
that at least shrink-wrapping happily moves prologue past an asm
statement which can be bad if the asm statement contains a call
instruction.

Am I right concluding that this is a bug?  Looking into the manual and
at requires_stack_frame_p() in shrink-wrap.c, I do not see any obvious
way of marking the asm statement as requiring the stack frame (but I
will not mind being proven wrong).  Do we want to create one, such as
only disallowing moving prologue past volatile asm statements?  Any
other ideas?

Thanks,

Martin


This is an x86_64 testcase, compare output of gcc -O2 -S and
gcc -S -O2 -fno-shrink-wrap


enum machine_mode
{
  FAKE_0,
  FAKE_1,
  FAKE_2,
  FAKE_3,
  FAKE_4,
  FAKE_5,
  NUM_MACHINE_MODES,
};

typedef int *rtx;
typedef long unsigned int size_t;
extern unsigned char mode_size[NUM_MACHINE_MODES];

extern rtx c_readstr (const char *, enum machine_mode);
extern rtx convert_to_mode (enum machine_mode, rtx, int);
extern rtx expand_mult (enum machine_mode, rtx, rtx, rtx, int);
extern rtx force_reg (enum machine_mode, rtx);
extern unsigned char mode_size_inline (enum machine_mode);
extern void *memset (void *__s, int __c, size_t __n);

rtx
builtin_memset_gen_str (void *data, long offset __attribute__ ((__unused__)),
                        enum machine_mode mode)
{
  rtx target, coeff;
  size_t size;
  char *p;
  asm volatile ("#" : : :);

  size = ((unsigned short) (__builtin_constant_p (mode)
                            ? mode_size_inline (mode) : mode_size[mode]));
  if (size == 1)
    return (rtx) data;

  p = ((char *) __builtin_alloca(sizeof (char) * (size)));
  memset (p, 1, size);
  coeff = c_readstr (p, mode);

  target = convert_to_mode (mode, (rtx) data, 1);
  target = expand_mult (mode, target, coeff, (rtx) 0, 1);
  return force_reg (mode, target);
}


Re: Can shrink-wrapping ever move prologue past an ASM statement?

2015-07-07 Thread Segher Boessenkool
On Tue, Jul 07, 2015 at 07:53:49PM +0200, Martin Jambor wrote:
> I've been asked to look into the item one of
> http://permalink.gmane.org/gmane.linux.kernel/1990397 and found out
> that at least shrink-wrapping happily moves prologue past an asm
> statement which can be bad if the asm statement contains a call
> instruction.
> 
> Am I right concluding that this is a bug?  Looking into the manual and
> at requires_stack_frame_p() in shrink-wrap.c, I do not see any obvious
> way of marking the asm statement as requiring the stack frame (but I
> will not mind being proven wrong).  Do we want to create one, such as
> only disallowing moving prologue past volatile asm statements?  Any
> other ideas?

For architectures like PowerPC where all calls clobber some register,
you can write e.g.

asm("bl func" : : : "lr");

and all is well (better than saving/restoring LR manually, too).


For other archs, e.g. x86-64, you can do

register void *sp asm("%sp");
asm volatile("call func" : "+r"(sp));

and the result seems to be optimal as well.


Some special clobber, maybe "stack" (like "memory", which won't work)
could be nicer?  What should the *exact* semantics of that be?


Segher


Re: [AD] UltraGDB, an alternative tool to debug GCC, GDB, LLDB, etc. on Windows and Linux

2015-07-07 Thread Sergio Durigan Junior
On Tuesday, July 07 2015, Chiheng Xu wrote:

> UltraGDB is a GDB GUI front-end, as well as a lightweight C/C++ IDE
> based on industry standard Eclipse technology.
>
> Visit http://www.ultragdb.com/ to learn more.
>
> Visit https://www.youtube.com/channel/UCr7F3ZZ_hgpxYXiMOA27hhw for demos.

Hello Chiheng,

I tried to find the source code of this package, but I could not find
it.  Do you have a URL or something you can provide?

Thank you,

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
http://sergiodj.net/


Re: [AD] UltraGDB, an alternative tool to debug GCC, GDB, LLDB, etc. on Windows and Linux

2015-07-07 Thread Xu,Chiheng
On Wed, Jul 8, 2015 at 3:18 AM, Sergio Durigan Junior
 wrote:
>
> I tried to find the source code of this package, but I could not find
> it.  Do you have a URL or something you can provide?
>
Sorry, source code of UltraGDB is not available now.  The source code
is actually a trimmed down, supercharged, and re-branded Eclipse CDT.
All of our technology is built on open source, and for open source. It
may be weird for us not to provide the source code. But we were just
founded, and we have no idea yet how to continue developing this product.
In other words, we have no business model yet.  At present, we just want
to know whether or not our product is useful; if so, then we think it
is meaningful to work on it.

This product is actually a part of much bigger and more ambitious plan
that can't be disclosed right now.  In the future, we may decide to
provide the source code of UltraGDB.

Any comment or suggestion on the UltraGDB product, or "development
model", or "business model" is welcomed.


-- 
Xu,Chiheng(徐持恒)


Re: Can shrink-wrapping ever move prologue past an ASM statement?

2015-07-07 Thread Jeff Law

On 07/07/2015 11:53 AM, Martin Jambor wrote:

> Hi,
>
> I've been asked to look into the item one of
> http://permalink.gmane.org/gmane.linux.kernel/1990397 and found out
> that at least shrink-wrapping happily moves prologue past an asm
> statement which can be bad if the asm statement contains a call
> instruction.
>
> Am I right concluding that this is a bug?  Looking into the manual and
> at requires_stack_frame_p() in shrink-wrap.c, I do not see any obvious
> way of marking the asm statement as requiring the stack frame (but I
> will not mind being proven wrong).  Do we want to create one, such as
> only disallowing moving prologue past volatile asm statements?  Any
> other ideas?

Shouldn't this be driven by dataflow?

jeff



X18 on AArch64

2015-07-07 Thread André Hentschel
Hi all,
Note: I'm new to this mailing list...

On AArch64, X18 is used as a platform register on some platforms, so to
generate portable executables it should not be used by the compiler.
The use case I have for this is Wine.  Windows arm64 programs use X18 as the
TLS register, thus it shouldn't be changed; otherwise it leads to a crash.
I worked on a patch [1], but realized that X17 is a bad choice (it seems IP
registers should be avoided on arm64 for the chain register...),
so I wonder which register would be better. Something from X9-X15?

See also the bug report containing the patch:
[1] https://bugs.winehq.org/show_bug.cgi?id=38780


Re: making the new if-converter not mangle IR that is already vectorizer-friendly

2015-07-07 Thread Abe

[Alan wrote:]


My understanding is that any decision as to whether one or both of y or z is
evaluated (when 'evaluation' involves doing any work, e.g. a load), has
already been encoded into the gimple/tree IR. Thus, if we are to only
evaluate one of 'y' or 'z' in your example, the IR will (prior to
if-conversion), contain basic blocks and control flow, that means we jump
around the one that's not evaluated.



This appears to be the case in pr61194.c: prior to if-conversion, the IR for 
the loop in barX is



  :
   # i_16 = PHI 
   # ivtmp_21 = PHI 
   _5 = x[i_16];
   _6 = _5 > 0.0;
   _7 = w[i_16];
   _8 = _7 < 0.0;
   _9 = _6 & _8;
   if (_9 != 0)
 goto ;
   else
 goto ;

   :
   iftmp.0_10 = z[i_16];
   goto ;

   :
   iftmp.0_11 = y[i_16];


[snip]


[note: the following section, until the next quoted line, is mostly an
 explanation of if conversion and its trade-offs; please feel free to skip it
 or to read it later]


OK, but that's overly-conservative in this case for most/all modern
high-speed processors: "z[i]" and "y[i]" are both pure expressions and the
possible values of 'i' here [given that the hardware doesn't malfunction,
nobody alters the value in a debugger, etc.] can be proven to not overflow
the bounds of the arrays.  With purity established and bounds-validity known
to be OK, all we need to do is to evaluate the "cost" of evaluating both
expressions vs. that of inflicting a conditional branch on the hardware.
I believe this "cost" depends on the values in the arrays 'x' and 'w', but
when the probability of either case -- "((x[i]>0) & (w[i]<0))" is true or is
false -- is very close to 50% for a long time, the branch predictor is going
to find it difficult to make any sense of it, and speculative execution is
going to speculatively execute the incorrect branch about 50% of the time
until the data comes in that renders speculation no-longer-needed in the
given iteration.

On most/all modern high-speed processors, doing the loads unconditionally is
going to be better IMO for anything even remotely resembling a "normal" or
"average" workload.  In other words, so long as the result of
"((x[i]>0) & (w[i]<0))" changes frequently and unpredictably as 'i' is
incremented, it should be better to do both loads: the needed data are in
cache lines that are already in cache, possibly even in L1 data cache, so
it's pretty "cheap" to just load them.  Conditional branches based on
difficult-to-predict data that changes frequently [as the arrays are
traversed], OTOH, are likely to cause stalls and result in slower execution
than the fastest possible on that CPU for the source code in question.

In cases where the data upon which the condition depends lead to mostly-true
or mostly-false results, the conditional branch should perform well for
enough-iterations loops on any CPU with a decent branch predictor, but those
are rare conditions IMO.  I think we should assume random data in the arrays
unless/until we have analysis that tells us otherwise, and I don't expect
much compile-time analysis of the contents of arrays to become the norm in
compilers anytime soon.  These are probably extremely-infrequent corner
cases anyway, with the possible exception of the processing of sparse
matrices; does anybody reading this with a strong background in
numerical/scientific computing have anything to comment on this?

Predication of instructions can help to remove the burden of the conditional
branch, but is not available on all relevant architectures.  In some
architectures that are typically implemented in modern high-speed processors
-- i.e. with high core frequency, caches, speculative execution, etc. --
there is not full support for predication [as there is, for example, in
32-bit ARM] but there _is_ support for "conditional move" [hereinafter
"cmove"].  If/when the ISA in question supports cmove to/from main memory,
perhaps it would be profitable to use two cmoves back-to-back with opposite
conditions and the same register [destination for load, source for store] to
implement e.g. "temp = c ? X[x] : Y[y]" and/or "temp = C[c] ? X[x] : Y[y]"
etc.  Even without cmove to/from main memory, two cmoves back-to-back with
opposite conditions could still be used, e.g. [not for a real-world ISA]:
  load X[x] -> reg1
  load Y[y] -> reg2
  cmove   c  ? reg1 -> reg3
  cmove (!c) ? reg2 -> reg3

Or even better if the ISA can support something like:
  load X[x] -> reg1
  load Y[y] -> reg2
  cmove (c ? reg1 : reg2) -> reg3
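
For comparison, the same trade-off can be written at the C level.  This is
only a sketch under stated assumptions: the function names are made up, the
loop is modeled loosely on the condition "((x[i]>0) & (w[i]<0))" discussed
above rather than copied from the testcase, and nothing here guarantees
that a given compiler emits a conditional move for the second form.

```c
#include <assert.h>
#include <stddef.h>

/* Branchy form: only one of z[i] / y[i] is loaded per iteration.  */
void
select_branchy (float *out, const float *x, const float *w,
                const float *y, const float *z, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      if ((x[i] > 0.0f) & (w[i] < 0.0f))
        out[i] = z[i];
      else
        out[i] = y[i];
    }
}

/* If-converted form: both loads are unconditional and a select picks the
   result.  This is only valid because both z[i] and y[i] are known to be
   in bounds for every i < n.  */
void
select_both_loads (float *out, const float *x, const float *w,
                   const float *y, const float *z, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      float zi = z[i];
      float yi = y[i];
      int c = (x[i] > 0.0f) & (w[i] < 0.0f);

      out[i] = c ? zi : yi;
    }
}

/* Usage example: both forms must produce the same results.  */
void
select_example (void)
{
  const float x[4] = { 1.0f, -1.0f, 2.0f, 3.0f };
  const float w[4] = { -1.0f, -1.0f, 1.0f, -2.0f };
  const float y[4] = { 10.0f, 11.0f, 12.0f, 13.0f };
  const float z[4] = { 20.0f, 21.0f, 22.0f, 23.0f };
  float a[4], b[4];

  select_branchy (a, x, w, y, z, 4);
  select_both_loads (b, x, w, y, z, 4);
  for (size_t i = 0; i < 4; i++)
    assert (a[i] == b[i]);
}
```

The second form is the shape a vectorizer can work with, since the loop
body no longer contains control flow.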

However, this is a code-gen issue from my POV, and at the present time all
the if-conversion work is occurring at the GIMPLE level.  If anybody reading
this knows how I could make the if converter generate GIMPLE that leads to
code-gen that is better for at least one ISA and the change[s] do/does not
negatively impact upon code-gen for any other ISA, I'd be glad to read about
it.



Without -ftree-loop-if-convert-stores, if-conversion leaves this alone, and 
vectorization

Re: [AD] UltraGDB, an alternative tool to debug GCC, GDB, LLDB, etc. on Windows and Linux

2015-07-07 Thread Sergio Durigan Junior
On Tuesday, July 07 2015, Chiheng Xu wrote:

>> I tried to find the source code of this package, but I could not find
>> it.  Do you have a URL or something you can provide?
>>
> Sorry, source code of UltraGDB is not available now.  The source code
> is actually a trimmed down, supercharged, and re-branded Eclipse CDT.
> All of our technology is built on open source, and for open source. It
> maybe weird for us not to provide the source code. But we are just
> founded, we have no idea how to continue developing this product. In
> other word, we have no business model yet.  At present, we just want
> to know whether or not our product is useful, if so, then we think it
> is meaningful to work on it.
>
> This product is actually a part of much bigger and more ambitious plan
> that can't be disclosed right now.  In the future, we may decide to
> provide the source code of UltraGDB.
>
> Any comment or suggestion on the UltraGDB product, or "development
> model", or "business model" is welcomed.

[ Removing lldb-dev from the Cc list as requested. ]

Hi,

First of all, you are repackaging Eclipse CDT and are not distributing
the source code for it; in fact, you have relicensed the entire project
with a proprietary license.  Eclipse CDT is licensed under the EPL:

  

Which forbids this practice.  EPL still allows you to license your
plugin as proprietary, but you still have to release the source code of
everything else (and if you modified Eclipse CDT in some way, you also
have to release your modifications).

Secondly, what is the advantage of UltraGDB when you compare it with the
Eclipse CDT Standalone Debugger?

  

The Standalone Debugger is Free Software (released under the EPL as
well), and distributed along with the Eclipse CDT on some distributions
(Fedora GNU/Linux, for example).  It is nice to see people developing
other plugins and GUIs for GDB (assuming they are Free Software as
well, of course), but it looks to me that Standalone Debugger offers a
better user experience than UltraGDB.

Thank you,

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
http://sergiodj.net/


Re: X18 on AArch64

2015-07-07 Thread pinskia
What does the ELF ABI say about x18?  I thought it was just another temp if
the target does not use it as a platform reg.  Note that many assembly files
might also use x18 because of that.

So this means you need wrapper functions when moving between the different 
abis. Nothing much can be done in gcc really. Please push this back to wine 
instead. 

Also, I think Windows made a bad choice in using x18 when there was already
a system register for TLS.

Thanks,
Andrew



> On Jul 7, 2015, at 1:30 PM, André Hentschel  wrote:
> 
> Hi all,
> Note: I'm new to that mailinglist...
> 
> On AArch64 X18 is used as a platform register for some platforms, so to 
> generate portable executables it should not be used by the compiler.
> The use case i have for this is Wine. Windows arm64 programs use X18 as TLS 
> register, thus it shouldn't be changed, otherwise it leads to a crash.
> I worked on a patch [1], but realized that X17 is a bad choice (IP registers 
> should be avoided on arm64 for the chain register it seems...),
> so I wonder which register would be better. Something from X9-X15?
> 
> See also the bug report containing the patch:
> [1] https://bugs.winehq.org/show_bug.cgi?id=38780


Re: Question about DRAP register and reserving hard registers

2015-07-07 Thread Steve Ellcey
On Mon, 2015-06-29 at 11:10 +0100, Richard Henderson wrote:

> > I also need the drap pointer in the MIPS epilogue but I would like to
> > avoid having to get it from memory.  Ideally I would like to restore it
> > from the virtual register that the prologue code / get_drap_rtx code put
> > it into.  I tried just doing a move from the virtual drap register to
> > the real one in expand_epilogue but that didn't work because it looks
> > like you can't access virtual registers from expand_prologue or
> > expand_epilogue.  I guess that is why the code to copy the hard drap reg
> > to the virtual drap_reg is done in get_drap_reg and not in
> > expand_prologue.  I thought about putting code in get_drap_reg to do
> > this copying but I don't see how to access the end of a function.  The
> > hard drap reg to virtual drap reg copy is inserted into the beginning of
> > a function with:
> >
> > insn = emit_insn_before (seq, NEXT_INSN (entry_of_function ()));
> >
> > Is there an equivalent method to insert code to the end of a function?
> > I don't see an 'end_of_function ()' routine anywhere.
> 
> Because, while generating initial rtl for a function, the beginning of a 
> function has already been emitted, while the end of the function hasn't.
> 
> You'd need to hook into expand_function_end, right at the bottom, before the 
> call to use_return_register.
> 
> 
> r~

I ran into an interesting issue while doing this.  Right now the expand
pass calls construct_exit_block (which calls expand_function_end) before
it calls expand_stack_alignment.  That means that crtl->drap_reg, etc
are not yet set up when in expand_function_end.  I moved the
expand_stack_alignment call up before construct_exit_block to fix that.
I hope moving it up doesn't break anything.

Steve Ellcey
sell...@imgtec.com



Re: [lldb-dev] [AD] UltraGDB, an alternative tool to debug GCC, GDB, LLDB, etc. on Windows and Linux

2015-07-07 Thread Greg Clayton
Note that since you are using MI as the interface between your debugger GUI
and the debugging backend (gdb), you can try using the lldb-mi layer and
compare performance between the GDB MI and the LLDB MI. Then you can rename
your debugger to UltraMI to keep it agnostic to the backend debugger. :-)

I noticed in the YouTube video that it took 40 seconds to set a breakpoint at
lldb's main function, run to the breakpoint, and hit it. If you try LLDB
MI, I am guessing it will be faster than 40 seconds.

Greg Clayton


> On Jul 7, 2015, at 12:46 PM, Xu,Chiheng  wrote:
> 
> On Wed, Jul 8, 2015 at 3:18 AM, Sergio Durigan Junior
>  wrote:
>> 
>> I tried to find the source code of this package, but I could not find
>> it.  Do you have a URL or something you can provide?
>> 
> Sorry, source code of UltraGDB is not available now.  The source code
> is actually a trimmed down, supercharged, and re-branded Eclipse CDT.
> All of our technology is built on open source, and for open source. It
> maybe weird for us not to provide the source code. But we are just
> founded, we have no idea how to continue developing this product. In
> other word, we have no business model yet.  At present, we just want
> to know whether or not our product is useful, if so, then we think it
> is meaningful to work on it.
> 
> This product is actually a part of much bigger and more ambitious plan
> that can't be disclosed right now.  In the future, we may decide to
> provide the source code of UltraGDB.
> 
> Any comment or suggestion on the UltraGDB product, or "development
> model", or "business model" is welcomed.
> 
> 
> -- 
> Xu,Chiheng(徐持恒)
> 



Question about always executed info computed in tree-ssa-loop-im.c

2015-07-07 Thread Bin.Cheng
Hi,
Function fill_always_executed_in_1 computes basic blocks' always
executed information, and it has the following code and comment:

  /* In a loop that is always entered we may proceed anyway.
     But record that we entered it and stop once we leave it.  */
  inn_loop = bb->loop_father;

Then, in the following iterations, it breaks out of the loop if a basic block
not belonging to the inner loop is encountered.  This means basic blocks
after the inner loop won't have always-executed information computed, even
if they dominate the original loop's latch.
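
A small C loop nest showing the kind of block I mean.  This is a made-up
example (`row_sums` and its parameters are hypothetical, not a testcase from
the tree); as far as I understand, hoisting the possibly-trapping division
out of the outer loop needs the always-executed information for the block
that follows the inner loop.

```c
#include <assert.h>

/* Hypothetical loop nest with a block that follows an inner loop.  */
void
row_sums (int *sum, const int *a, int rows, int cols, int num, int den)
{
  for (int i = 0; i < rows; i++)
    {
      int acc = 0;

      /* Inner loop: its blocks have the inner loop as loop_father.  */
      for (int j = 0; j < cols; j++)
        acc += a[i * cols + j];

      /* Block after the inner loop.  It is reached on every outer
         iteration and dominates the outer loop's latch.  num / den is
         invariant in the outer loop, but the division may trap.  */
      sum[i] = acc + num / den;
    }
}

/* Usage example: two rows of three elements each.  */
void
row_sums_example (void)
{
  const int a[6] = { 1, 2, 3, 4, 5, 6 };
  int sum[2];

  row_sums (sum, a, 2, 3, 10, 2);
  assert (sum[0] == 11 && sum[1] == 20);
}
```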

Am I missing something?  Why is that?

Thanks,
bin