Re: RFC: Adding non-PIC executable support to MIPS

2008-07-27 Thread Richard Sandiford
Daniel Jacobowitz <[EMAIL PROTECTED]> writes:
> All comments welcome - Richard, especially from you.  How would you
> like to proceed?  I think the first step should be to get your other
> binutils/gcc patches merged, including MIPS16 PIC; I used those as a
> base.  But see a few of the notes for potential problems with those
> patches.

Yeah, Nick's approved most of the remaining binutils changes (thanks).
I haven't applied them yet because of the doubt over whether st_size
should be even or odd for ISA-encoded MIPS16 symbols.  I don't really
have an opinion, so I'll accept a maintainerly decision...

Anyway, the gcc patch looks good to me, thanks.  The only niggle
I could see was that you didn't update the comment for:

+/* True if the output file is marked as ".abicalls; .option pic0"
+   (-call_mixed).  This is a GNU extension.  */
+#define TARGET_ABICALLS_PIC0 \
+  (TARGET_ABSOLUTE_ABICALLS && TARGET_PLT)

(That kind of thing was inevitable given the amount of code you had
to wade through.  I'm impressed if there's really only one instance!)
I think the gcc side is good to go, modulo the _mcount thing.

As far as binutils goes, I saw a couple of potential problems:

(1) The patch adds the following code to
_bfd_mips_elf_create_dynamic_sections:

+  if (htab->use_plts_and_copy_relocs && htab->root.hplt == NULL)
+   {
+ h = _bfd_elf_define_linkage_sym (abfd, info, s,
+  "_PROCEDURE_LINKAGE_TABLE_");
+ htab->root.hplt = h;
+ if (h == NULL)
+   return FALSE;
+ h->type = STT_FUNC;
+   }

But use_plts_and_copy_relocs is only set after all input bfds have
been read in.

(2) The patch sets pointer_equality_needed as follows:

@@ -7432,9 +7484,18 @@ _bfd_mips_elf_check_relocs (bfd *abfd, s
elf_hash_table (info)->dynobj = dynobj = abfd;
  break;
}
- /* Fall through  */
+ /* Fall through.  */
 
default:
+ /* Most static relocations require pointer equality, except
+for branches.  */
+ if (h)
+   h->pointer_equality_needed = TRUE;
+ /* Fall through.  */
+
+   case R_MIPS_26:
+   case R_MIPS_PC16:
+   case R_MIPS16_26:
  if (h)
((struct mips_elf_link_hash_entry *) h)->has_static_relocs = TRUE;
  break;

But pointer equality is needed for non-call GOT relocations too.
I couldn't see anything that explicitly handled that case.

I think it would be more robust to set pointer_equality_needed in a
separate block, rather than combining it with the existing switch
statements.  It might then be clearer to set has_nonpic_branches
in the new block too, so that you don't need two copies of:

  if (h && !PIC_OBJECT_P (abfd))
((struct mips_elf_link_hash_entry *) h)->has_nonpic_branches = TRUE;
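
To illustrate the "separate block" idea, here is a minimal, self-contained sketch.  It does not use the real BFD types: `reloc_kind`, `sym_entry` and `note_symbol_use` are hypothetical stand-ins, and treating call-only GOT references as not needing pointer equality is my reading of the comment above, not something taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the relocation classes discussed above;
   these are NOT the real BFD definitions.  */
enum reloc_kind
{
  RELOC_BRANCH_OR_JUMP,   /* e.g. R_MIPS_26, R_MIPS_PC16, R_MIPS16_26 */
  RELOC_CALL_GOT,         /* GOT reference used only to make a call */
  RELOC_NONCALL_GOT,      /* GOT reference that exposes the address */
  RELOC_OTHER_STATIC      /* any other static relocation */
};

/* Hypothetical, cut-down symbol entry holding just the two flags.  */
struct sym_entry
{
  bool pointer_equality_needed;
  bool has_nonpic_branches;
};

/* One block that decides both flags, kept apart from the main
   relocation switch so that neither case list has to fall through
   into the other.  */
static void
note_symbol_use (struct sym_entry *h, enum reloc_kind kind, bool pic_object)
{
  if (h == NULL)
    return;

  switch (kind)
    {
    case RELOC_BRANCH_OR_JUMP:
      /* Branches never need pointer equality, but a branch from a
         non-PIC object is what may require an la25 stub.  */
      if (!pic_object)
        h->has_nonpic_branches = true;
      break;

    case RELOC_CALL_GOT:
      /* Assumed: a call through the GOT does not expose the address.  */
      break;

    case RELOC_NONCALL_GOT:
    case RELOC_OTHER_STATIC:
      h->pointer_equality_needed = true;
      break;
    }
}
```

With this shape, the single `has_nonpic_branches` assignment lives in one place, and the non-call GOT case is listed explicitly rather than being covered by a `default:`.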

Some minor nits too:

+  0x0399,  /* l[wd] $25, %lo(&GOTPLT[0])($28)  */
+  0x01d9,  /* l[wd] $25, %lo(&GOTPLT[0])($14)  */
+  0x01d9,  /* l[wd] $25, %lo(&GOTPLT[0])($14)  */

These are all fixed as either lw or ld.

@@ -1649,13 +1695,16 @@ mips_elf_check_symbols (struct mips_elf_
   /* H is a function that might need $25 to be valid on entry.
 If we're creating a non-PIC relocatable object, mark H as
 being PIC.  If we're creating a non-relocatable object with
-non-PIC references to H, make sure that H has an la25 stub.  */
+branches to H, make sure that H has an la25 stub.  Only
+use the stub for branches from non-PIC objects; GCC's
+-mno-shared uses branches from PIC objects to functions
+which do not require $25.  */
   if (hti->info->relocatable)
{
  if (!PIC_OBJECT_P (hti->output_bfd))
h->root.other = ELF_ST_SET_MIPS_PIC (h->root.other);
}
-  else if (h->non_pic_ref && !mips_elf_add_la25_stub (hti->info, h))
+  else if (h->has_nonpic_branches && !mips_elf_add_la25_stub (hti->info, h))
{
  hti->error = TRUE;
  return FALSE;

How about something like the following:

-non-PIC references to H, make sure that H has an la25 stub.  */
+non-PIC branches and jumps to H, make sure that H has an la25 stub.
+We specifically ignore branches and jumps from EF_PIC objects,
+where the onus is on the compiler or programmer to perform any
+necessary initialization of $25.  Sometimes such initialization
+is unnecessary; for example, -mno-shared functions do not use
+the incoming value of $25, and may therefore be called directly.  */

(Wordsmith as necessary.)  The original wording made it sound like we'd
created a stub if there were any branches at all, but that the stub
would only be used for branches from non-PIC objects.

@@ -2928,6 +2977,7 @@ mips_elf_gotplt_index (struct bfd_link_i
   struct m

Re: [tuples] New memory/time comparison vs trunk

2008-07-27 Thread Jan Hubicka
> 
> - The rest of the memory utilization difference is mostly in inlining
> (240Kb) and SSA update (50Kb).
> 
> I think the main focus points should be DSE and trying to get a good
> way of measuring the memory utilization differences.  Jan, any
> suggestion?

I've switched memory tester to tuples now.  The full report is in
gcc-regressions and graphs here
http://gcc.opensuse.org/memory/graphs/index.html
Overall the footprint improved.

For combine.c, the gimplified program now needs 3.6% less memory, and the
overall amount of GGC memory referenced at one time decreased by 5%
at pretty much all levels.

Insn-attrtab is 14% smaller before IPA; the overall footprint is 12%
smaller.

Gerald's testcase shows a 2% increase in overall allocation, but the memory
footprint is still 3% smaller.  This might be increased inlining, but
also the aliasing issue below.

The top memory allocations:

rtl.c:269 (copy_rtx)                               1594936: 2.6%
gimple-iterator.c:447 (gsi_insert_after_without_   1714000: 2.8%
stringpool.c:74 (alloc_node)                        761696: 1.2%
gimplify.c:521 (create_tmp_var_raw)                1852032: 3.0%
cselib.c:1155 (cselib_subst_to_values)             1914640: 3.1%
emit-rtl.c:3339 (make_insn_raw)                    1945592: 3.2%
tree-inline.c:3563 (copy_tree_r)                   2450104: 4.0%
tree-ssanames.c:141 (make_ssa_name_fn)             3374640: 5.5%
tree-phinodes.c:157 (allocate_phi_node)            3475264: 5.7%

(this is combine.c at O2)

gimple.c:2098 (gimple_copy)                         8735664: 2.5%
gimple-iterator.c:447 (gsi_insert_after_without_    9558400: 2.8%
gimplify.c:521 (create_tmp_var_raw)                 9737784: 2.8%
tree-phinodes.c:157 (allocate_phi_node)            15222104: 4.4%
tree-ssanames.c:141 (make_ssa_name_fn)             16784400: 4.8%
tree-inline.c:4062 (copy_decl_no_change)           18668784: 5.4%
tree-inline.c:3563 (copy_tree_r)                   19876608: 5.7%

(this is Gerald's testcase at O2).

PHI nodes and SSA names have now gone up a lot.  This is all from virtual
operands.  Without aliasing at combine.c I get 1.5MB (instead of 3.3MB)
for SSA names and 195KB (instead of 3.4MB) for PHI nodes.
Perhaps something went wrong with the aliasing heuristics?

It seems to show in your scores too:
>  ^ tree alias analys    4.47    4.53    1.34%   68751   71085    3.39%
>  ^ tree call clobber    0.42    0.59   40.48%     887    1084   22.21%

For space reasons the memory tester no longer saves older reports, so I
can't easily compare with mainline, but I can give it a try if this is
not an obvious problem somewhere (it requires me to build stuff by hand,
which is not hard).

Honza


Re: Recent warning regression: no return statement in function returning non-void

2008-07-27 Thread Richard Guenther
On Sun, Jul 27, 2008 at 1:18 PM, Gerald Pfeifer <[EMAIL PROTECTED]> wrote:
> I believe the following happened in the last 48 or so hours; I saw
> this triggered by my nightly Wine builds which in turn use my nightly
> GCC builds. ;-)
>
> For code like the following where we have an infinite loop in a
> non-void function, we now (incorrectly) issue a warning with all
> of -O0, -O1 and -O2 whereas previously we would not:
>
>  void g();
>
>  int f() {
>for(;;)
>  g();
>  }
>
>  % gccvs -c -Wall x.c
>  x.c: In function 'int f()':
>  x.c:6: warning: no return statement in function returning non-void

I think the warning is perfectly correct.  There is no return statement
in that function and it does return non-void.  The warning doesn't say
that the function does return without a value.

Richard.


Recent warning regression: no return statement in function returning non-void

2008-07-27 Thread Gerald Pfeifer
I believe the following happened in the last 48 or so hours; I saw
this triggered by my nightly Wine builds which in turn use my nightly
GCC builds. ;-)

For code like the following where we have an infinite loop in a
non-void function, we now (incorrectly) issue a warning with all
of -O0, -O1 and -O2 whereas previously we would not:

  void g();

  int f() {
for(;;)
  g();
  }

  % gccvs -c -Wall x.c
  x.c: In function 'int f()':
  x.c:6: warning: no return statement in function returning non-void

Looking at the ChangeLog, the changes that look most plausible are the
unit-at-a-time ones, though I'm not sure how that would apply since
this is independent of optimization level.

Tested on i386-unknown-freebsd6.  I verified that i586-suse-linux
does not warn with GCC 4.2.1 and am just building current trunk 
there as well.

Gerald


Build failure with Cygwin

2008-07-27 Thread Paul Richard Thomas
Dear All,

Perhaps this is old news/my fault but I am seeing the following on
Cygwin_NT/amd64:

/irun/bin/gcc  -g -O2 -DIN_GCC   -W -Wall -Wwrite-strings -Wstrict-prototypes
-Wmissing-prototypes -Wcast-qual -Wold-style-definition -Wc++-compat
-Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H  -o cc1-dummy.exe
c-lang.o stub-objc.o attribs.o c-errors.o c-lex.o c-pragma.o c-decl.o
c-typeck.o c-convert.o c-aux-info.o c-common.o c-opts.o c-format.o
c-semantics.o c-ppoutput.o c-cppbuiltin.o c-objc-common.o c-dump.o c-pch.o
c-parser.o i386-c.o cygwin2.o msformat-c.o c-gimplify.o tree-mudflap.o
c-pretty-print.o c-omp.o dummy-checksum.o \
  main.o tree-browser.o libbackend.a ../libcpp/libcpp.a
../libdecnumber/libdecnumber.a ../libcpp/libcpp.a -lintl -liconv
../libiberty/libiberty.a ../libdecnumber/libdecnumber.a -lmpfr -lgmp
libbackend.a(stringpool.o): In function `ggc_purge_stringpool':
../../trunk/gcc/stringpool.c:192: undefined reference to `_ht_purge'
collect2: ld returned 1 exit status
make[2]: *** [cc1-dummy.exe] Error 1
make[2]: Leaving directory `/svn/build/gcc'
make[1]: *** [install-gcc] Error 2
make[1]: Leaving directory `/svn/build'
make: *** [install] Error 2

Other than the obvious, any suggestions?

Paul


Re: Recent warning regression: no return statement in function returning non-void

2008-07-27 Thread Jan Hubicka
> On Sun, Jul 27, 2008 at 1:18 PM, Gerald Pfeifer <[EMAIL PROTECTED]> wrote:
> > I believe the following happened in the last 48 or so hours; I saw
> > this triggered by my nightly Wine builds which in turn use my nightly
> > GCC builds. ;-)
> >
> > For code like the following where we have an infinite loop in a
> > non-void function, we now (incorrectly) issue a warning with all
> > of -O0, -O1 and -O2 whereas previously we would not:
> >
> >  void g();
> >
> >  int f() {
> >for(;;)
> >  g();
> >  }
> >
> >  % gccvs -c -Wall x.c
> >  x.c: In function 'int f()':
> >  x.c:6: warning: no return statement in function returning non-void
> 
> I think the warning is perfectly correct.  There is no return statement
> in that function and it does return non-void.  The warning doesn't say
> that the function does return without a value.

Also, if you make the function static inline, older GCC versions will
trigger the same warning.
The problem here is that the original code relied on the fact that
extern inline and static inline functions were the only functions that
were ever removed and not compiled.  This has not been true since GCC 3.4,
when the callgraph code started to eliminate all static functions.

Honza
> 
> Richard.


GCC trunk frozen for the tuples merge

2008-07-27 Thread Richard Guenther

The trunk is frozen now until after the merge of the tuples branch
which will happen tomorrow, Monday Jul 28th.  Unfreezing of the
trunk will be announced after the fact.

Thanks,
Richard.

-- 
Richard Guenther <[EMAIL PROTECTED]>
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex


Re: lto gimple types and debug info

2008-07-27 Thread Mark Mitchell

David Edelsohn wrote:


I do not expect LTO (or WHOPR) to work on AIX -- at least not
without a lot of work on wrappers around the AIX linker.  However, I do
not understand why enhancing GCC to support LTO -- when GCC is run without
enabling LTO -- requires locking GCC completely into DWARF debugging.


I agree that, at least in principle, it should be possible to emit the 
debug info (whether the format is DWARF, Stabs, etc.) once.  So, I don't 
see a reason that this makes us a DWARF-only compiler either.


Others have raised the issue of types which are fundamentally 
transformed by the compiler (such as by removing fields).  I think that 
such opportunities are going to be relatively rare; the global "struct 
Window" object in a GUI library full of functions taking "struct Window 
*" parameters probably isn't optimizable in this way.  But there will be 
situations where this is possible and profitable of course.


In that case, I'm not sure that *type* ought to be modified at all, from 
the debug perspective.  To the extent there's still an object of type 
"struct X" around, its type is still what it was.  And other things you 
might do in a debugger, like ask "What member functions does class X 
have?", have the same answer no matter the layout chosen by the 
compiler, including throwing out half the fields and leaving the rest in 
random registers.  For that matter, "print sizeof(X)" should print the 
same value when debugging optimized code as when debugging unoptimized 
code, even if the compiler has optimized X away to an empty structure!


There are other things we could do, like mark the *variables* of type X 
(rather than the type) as having no location, so that you can't 
print/modify objects that have been optimized in this way.  That 
reflects more accurately the user's view of what has happened; it's not 
that the type itself is different as much as it is that objects of the 
type are hard to view.


You could also add a marker to the type that says "optimized madly; 
debugger should proceed with caution" -- and you could do that without 
reloading and rewriting the type information.  For example, when 
generating the original type debug info, emit a relocation against 
"X_optimized_madly" and then provide an appropriate value for the 
symbol at link time.


I'm curious what we do with SRA at the moment.  This is the same sort of 
problem; do we have any solutions at present?


--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: lto gimple types and debug info

2008-07-27 Thread Richard Guenther
On Sun, Jul 27, 2008 at 7:18 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote:
> David Edelsohn wrote:
>
>>I do not expect LTO (or WHOPR) to work on AIX -- at least not
>> without a lot of work on wrappers around the AIX linker.  However, I do
>> not understand why enhancing GCC to support LTO -- when GCC is run without
>> enabling LTO -- requires locking GCC completely into DWARF debugging.
>
> I agree that, at least in principle, it should be possible to emit the debug
> info (whether the format is DWARF, Stabs, etc.) once.  So, I don't see a
> reason that this makes us a DWARF-only compiler either.
>
> Others have raised the issue of types which are fundamentally transformed by
> the compiler (such as by removing fields).  I think that such opportunities
> are going to be relatively rare; the global "struct Window" object in a GUI
> library full of functions taking "struct Window *" parameters probably isn't
> optimizable in this way.  But there will be situations where this is
> possible and profitable of course.
>
> In that case, I'm not sure that *type* ought to be modified at all, from the
> debug perspective.  To the extent there's still an object of type "struct X"
> around, its type is still what it was.  And other things you might do in a
> debugger, like ask "What member functions does class X have?", have the same
> answer no matter the layout chosen by the compiler, including throwing out
> half the fields and leaving the rest in random registers.  For that matter,
> "print sizeof(X)" should print the same value when debugging optimized code
> as when debugging unoptimized code, even if the compiler has optimized X
> away to an empty structure!
>
> There are other things we could do, like mark the *variables* of type X
> (rather than the type) as having no location, so that you can't print/modify
> objects that have been optimized in this way.  That reflects more accurately
> the user's view of what has happened; it's not that the type itself is
> different as much as it is that objects of the type are hard to view.
>
> You could also add a marker to the type that says "optimized madly; debugger
> should proceed with caution" -- and you could do that without reloading and
> rewriting the type information.  For example, when generating the original
> type debug info, emit a relocation against "X_optimized_madly" and then
> provide an appropriate value for the symbol at link time.
>
> I'm curious what we do with SRA at the moment.  This is the same sort of
> problem; do we have any solutions at present?

We generate variables with names like x$y for struct { int y; } x; - in theory
the debugger could "magically" associate a print x.y with x$y.  But of course
there is no way to express this in the DWARF.
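
As a rough illustration of what scalar replacement does to such a struct, the two functions below compute the same value.  This is only a hand-written sketch: the struct has an extra field `z` added for the example, and the scalars are spelled `x_y` and `x_z` because `x$y` is not a portable C identifier.

```c
/* Aggregate form, as the programmer wrote it.  */
struct pair
{
  int y;
  int z;
};

static int
sum_aggregate (int a, int b)
{
  struct pair x;
  x.y = a;
  x.z = b;
  return x.y + x.z;
}

/* Roughly what is left after scalar replacement: each field has become
   an independent scalar, and no object of type struct pair exists any
   more.  A debugger asked for "print x.y" would have to be told that
   the value now lives in the scalar standing in for that field.  */
static int
sum_scalarized (int a, int b)
{
  int x_y = a;   /* stands in for GCC's x$y */
  int x_z = b;   /* stands in for GCC's x$z */
  return x_y + x_z;
}
```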

Richard.


Re: lto gimple types and debug info

2008-07-27 Thread Andrew Pinski
On Sun, Jul 27, 2008 at 11:09 AM, Richard Guenther
<[EMAIL PROTECTED]> wrote:

> We generate variables with names like x$y for struct { int y; } x; - in theory
> the debugger could "magically" associate a print x.y with x$y.  But of course
> there is no way to express this in the DWARF.

Actually there is a way to express this in Dwarf2, using DW_OP_piece.
See the thread at http://gcc.gnu.org/ml/gcc/2005-01/msg00080.html for
more information.

-- Pinski


Re: lto gimple types and debug info

2008-07-27 Thread Daniel Berlin
On Sun, Jul 27, 2008 at 1:18 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote:
> David Edelsohn wrote:
>
>>I do not expect LTO (or WHOPR) to work on AIX -- at least not
>> without a lot of work on wrappers around the AIX linker.  However, I do
>> not understand why enhancing GCC to support LTO -- when GCC is run without
>> enabling LTO -- requires locking GCC completely into DWARF debugging.
>
> I agree that, at least in principle, it should be possible to emit the debug
> info (whether the format is DWARF, Stabs, etc.) once.

No, you can't.
You would at least have to emit the variables separate from the types
(IE emit debug info twice).

>  So, I don't see a
> reason that this makes us a DWARF-only compiler either.
>
> Others have raised the issue of types which are fundamentally transformed by
> the compiler (such as by removing fields).  I think that such opportunities
> are going to be relatively rare; the global "struct Window" object in a GUI
> library full of functions taking "struct Window *" parameters probably isn't
> optimizable in this way.  But there will be situations where this is
> possible and profitable of course.
>
> In that case, I'm not sure that *type* ought to be modified at all, from the
> debug perspective.  To the extent there's still an object of type "struct X"
> around, its type is still what it was.

Uh, except that if you only write things out once, and have already
written out the variables, the variable no longer has the correct type
if you've rewritten the type, and if we've already emitted debug info,
it won't display properly anymore (since the locations of data members
the type specifies will now be incorrect).

So are you suggesting we emit debug info at multiple times


Re: lto gimple types and debug info

2008-07-27 Thread Mark Mitchell

Andrew Pinski wrote:


Actually there is a way to express this in Dwarf2, using DW_OP_piece.
See the thread at http://gcc.gnu.org/ml/gcc/2005-01/msg00080.html for
more information.


And if that's not sufficient, we can of course add extensions to DWARF 
that do represent it -- and that would be better than modifying the 
debugger to make assumptions about magic variable names.


--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: lto gimple types and debug info

2008-07-27 Thread Mark Mitchell

Daniel Berlin wrote:


I agree that, at least in principle, it should be possible to emit the debug
info (whether the format is DWARF, Stabs, etc.) once.


No, you can't.
You would at least have to emit the variables separate from the types
(IE emit debug info twice).


Yes, of course; that's what everyone is talking about, I think.  "Emit" 
here may also mean "cache in memory some place", rather than "write to a 
file".  It could mean, for example, fill in the data structures we 
already use for types in dwarf2out.c early, and then throw away the 
front-end type information.


--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: lto gimple types and debug info

2008-07-27 Thread Kenneth Zadeck

Daniel Berlin wrote:

On Sun, Jul 27, 2008 at 1:18 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote:
  

David Edelsohn wrote:



   I do not expect LTO (or WHOPR) to work on AIX -- at least not
without a lot of work on wrappers around the AIX linker.  However, I do
not understand why enhancing GCC to support LTO -- when GCC is run without
enabling LTO -- requires locking GCC completely into DWARF debugging.
  

I agree that, at least in principle, it should be possible to emit the debug
info (whether the format is DWARF, Stabs, etc.) once.



No, you can't.
You would at least have to emit the variables separate from the types
(IE emit debug info twice).

  

 So, I don't see a
reason that this makes us a DWARF-only compiler either.

Others have raised the issue of types which are fundamentally transformed by
the compiler (such as by removing fields).  I think that such opportunities
are going to be relatively rare; the global "struct Window" object in a GUI
library full of functions taking "struct Window *" parameters probably isn't
optimizable in this way.  But there will be situations where this is
possible and profitable of course.

In that case, I'm not sure that *type* ought to be modified at all, from the
debug perspective.  To the extent there's still an object of type "struct X"
around, its type is still what it was.



Uh, except that if you only write things out once, and have already
written out the variables, the variable no longer has the correct type
if you've rewritten the type, and if we've already emitted debug info,
it won't display properly anymore (since the locations of data members
the type specifies will now be incorrect).

So are you suggesting we emit debug info at multiple times
  
It is my guess that we are still going to have to generate the debugging 
info for the variables late, if for no other reason than that things like 
stack offsets are not set until then.  If that is true, we could 
possibly just generate a new type and abandon the first one.


kenny



Re: RFC: Adding non-PIC executable support to MIPS

2008-07-27 Thread Mark Mitchell

Richard Sandiford wrote:

Daniel Jacobowitz <[EMAIL PROTECTED]> writes:

All comments welcome - Richard, especially from you.  How would you
like to proceed?  I think the first step should be to get your other
binutils/gcc patches merged, including MIPS16 PIC; I used those as a
base.  But see a few of the notes for potential problems with those
patches.


Yeah, Nick's approved most of the remaining binutils changes (thanks).
I haven't applied them yet because of the doubt over whether st_size
should be even or odd for ISA-encoded MIPS16 symbols.  I don't really
have an opinion, so I'll accept a maintainerly decision...


[I'm not sure if this is a helpful suggestion or not, so feel free to 
ignore it if it's not.]


I would suggest that st_size be the actual size of the function, as it 
lives in memory.  A test of its start/end location is "could I stick a 
random data byte there and have it affect the function".  For example, 
for a Thumb function whose ISA address is "0x0001", I would consider 
for size purposes that it starts at "0x", since altering that 
byte at run-time would change the meaning of the function.
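
A small sketch of that test, under the assumption that bit 0 of the symbol value is the ISA mode bit and is not part of the address.  The helper names `code_start` and `byte_in_function` are made up for this example and the addresses used with them are arbitrary.

```c
#include <stdint.h>

/* Assumption: bit 0 of the symbol value only selects the ISA mode, so
   the code itself starts at the value with that bit cleared.  */
static uint32_t
code_start (uint32_t st_value)
{
  return st_value & ~(uint32_t) 1;
}

/* Nonzero if changing the byte at ADDR would change the function,
   given that st_size is the size of the function as it lives in
   memory.  */
static int
byte_in_function (uint32_t st_value, uint32_t st_size, uint32_t addr)
{
  uint32_t start = code_start (st_value);
  return addr >= start && addr - start < st_size;
}
```

For a symbol value of 0x1001 and a size of 8, this treats bytes 0x1000 through 0x1007 as belonging to the function, which matches the "could I stick a random data byte there" test.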


--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: lto gimple types and debug info

2008-07-27 Thread Daniel Berlin
On Sun, Jul 27, 2008 at 3:10 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote:
> Daniel Berlin wrote:
>
>>> I agree that, at least in principle, it should be possible to emit the
>>> debug
>>> info (whether the format is DWARF, Stabs, etc.) once.
>>
>> No, you can't.
>> You would at least have to emit the variables separate from the types
>> (IE emit debug info twice).
>
> Yes, of course; that's what everyone is talking about, I think.  "Emit" here
> may also mean "cache in memory some place", rather than "write to a file".
>  It could mean, for example, fill in the data structures we already use for
> types in dwarf2out.c early, and then throw away the front-end type
> information
Okay, then let us go through the options, and you tell me which you
are suggesting:

If you assume LTO does not have access to the front ends, your options
look something like this:

When you first compile each file:
  Emit type debug info
  Emit LTO

When you LTO them all together
  Do LTO
  Emit variable debug info

Under this option, "Emit variable info" requires being able to
reference the types.  If you've lowered the types,  this is quite
problematic.  So either you get to store label names for the already
output type debug info with the variables (so you can still reference
the type you output properly when you read it back in).  This is
fairly fragile, to be honest.
Another downside of this is that you can't eliminate duplicate types
between units because you don't know which types are really the same
in the debug info. You have to let the

Another option is:

When you first compile each file:
  Emit type debug info
  Emit partial variable debug info (IE add pointers to outputted types
but not to locations)
  Emit LTO

When you LTO them all together:
  Do LTO
  Parse and update variable debug info to have locations
  Emit variable debug info

This requires parsing the debug info (in some format, be it DWARF or
some generic format we've made up) so that you can update the variable
info's location.
As a plus, you can easily update the types where you need to.
Unlike the first option, because you understand the debug info, you
can now remove all the duplicate types between units without having to
have the linker do it for you.
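
A minimal sketch of the data flow in this second option, assuming a made-up in-memory representation (`type_die`, `var_die` and both helpers are hypothetical, not dwarf2out.c structures): the variable record is created at compile time with a reference to its type but no location, and the location is filled in after LTO.

```c
#include <stdbool.h>

/* Hypothetical, simplified debug records.  */
struct type_die
{
  const char *name;
};

struct var_die
{
  const char *name;
  const struct type_die *type;  /* known when the file is first compiled */
  bool has_location;            /* false until LTO has run */
  long frame_offset;            /* only meaningful if has_location */
};

/* Compile time: emit partial variable info that points at the type
   but has no location yet.  */
static struct var_die
emit_partial_var (const char *name, const struct type_die *type)
{
  struct var_die v = { name, type, false, 0 };
  return v;
}

/* Link time, after LTO: the record has been parsed back in and can
   now be given its final location.  */
static void
set_var_location (struct var_die *v, long frame_offset)
{
  v->frame_offset = frame_offset;
  v->has_location = true;
}
```

Because the link-time step works on records it understands, it could also repoint `type` at a merged or rewritten type before emitting the final variable info.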

Unless  you link in every single frontend to LTO1 (Or move a lot to
the middle end), there is no way to do the following:

When you first compile each file:
  Emit LTO

When you LTO them all together:
  Emit type debug info
  Do LTO
  Emit variable debug info

If you don't want to link the frontends, you could also get away with
moving a lot of junk to the middle end (everything from being able to
distinguish between class and struct to namespaces, the context of
lexical blocks) because debug info outputting uses language specific
nodes all over the place right now.

Unless I've missed something, our least fragile and, IMHO, best option
requires parsing back in debug info.
It is certainly *possible* to get debug info without parsing the debug
info back in.
Then again, I also don't see what the big deal about adding a debug
info parser is.

It's not like they are all that large.

[EMAIL PROTECTED]:/home/dannyb/util/debuginfo]> wc -l bytereader.*
bytereader-inl.h dwarf2enums.h dwarf2reader*
   40 bytereader.cc
  110 bytereader.h
  118 bytereader-inl.h
  465 dwarf2enums.h
  797 dwarf2reader.cc
  373 dwarf2reader.h
 1903 total

(This includes both a callback style reader that simply hands you
things you tell it to, as well as something that can read back into a
format much like we use during debug info output)


Re: lto gimple types and debug info

2008-07-27 Thread Kenneth Zadeck

Daniel Berlin wrote:

On Sun, Jul 27, 2008 at 3:10 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote:
  

Daniel Berlin wrote:



I agree that, at least in principle, it should be possible to emit the
debug
info (whether the format is DWARF, Stabs, etc.) once.


No, you can't.
You would at least have to emit the variables separate from the types
(IE emit debug info twice).
  

Yes, of course; that's what everyone is talking about, I think.  "Emit" here
may also mean "cache in memory some place", rather than "write to a file".
 It could mean, for example, fill in the data structures we already use for
types in dwarf2out.c early, and then throw away the front-end type
information


Okay, then let us go through the options, and you tell me which you
are suggesting:

If you assume LTO does not have access to the front ends, your options
look something like this:

When you first compile each file:
  Emit type debug info
  Emit LTO

When you LTO them all together
  Do LTO
  Emit variable debug info

Under this option, "Emit variable info" requires being able to
reference the types.  If you've lowered the types,  this is quite
problematic.  So either you get to store label names for the already
output type debug info with the variables (so you can still reference
the type you output properly when you read it back in).  This is
fairly fragile, to be honest.
Another downside of this is that you can't eliminate duplicate types
between units because you don't know which types are really the same
in the debug info. You have to let the

Another option is:

When you first compile each file:
  Emit type debug info
  Emit partial variable debug info (IE add pointers to outputted types
but not to locations)
  Emit LTO

When you LTO them all together:
  Do LTO
  Parse and update variable debug info to have locations
  Emit variable debug info

This requires parsing the debug info (in some format, be it DWARF or
some generic format we've made up) so that you can update the variable
info's location.
As a plus, you can easily update the types where you need to.
Unlike the first option, because you understand the debug info, you
can now remove all the duplicate types between units without having to
have the linker do it for you.

Unless  you link in every single frontend to LTO1 (Or move a lot to
the middle end), there is no way to do the following:

When you first compile each file:
  Emit LTO

When you LTO them all together:
  Emit type debug info
  Do LTO
  Emit variable debug info

If you don't want to link the frontends, you could also get away with
moving a lot of junk to the middle end (everything from being able to
distinguish between class and struct to namespaces, the context of
lexical blocks) because debug info outputting uses language specific
nodes all over the place right now.

Unless I've missed something, our least fragile and, IMHO, best option
requires parsing back in debug info.
It is certainly *possible* to get debug info without parsing the debug
info back in.
Then again, I also don't see what the big deal about adding a debug
info parser is.

It's not like they are all that large.

[EMAIL PROTECTED]:/home/dannyb/util/debuginfo]> wc -l bytereader.*
bytereader-inl.h dwarf2enums.h dwarf2reader*
   40 bytereader.cc
  110 bytereader.h
  118 bytereader-inl.h
  465 dwarf2enums.h
  797 dwarf2reader.cc
  373 dwarf2reader.h
 1903 total

(This includes both a callback style reader that simply hands you
things you tell it to, as well as something that can read back into a
format much like we use during debug info output)
  
You may of course be right, and this is what we will end up doing, but 
the implications for WHOPR are not good.  The parser is going to have 
to work in lockstep with the type merger, and all of the debug sections 
for all of the .o files are going to have to be parsed in lto1.  My 
prediction is that this is going to be a bottleneck.


kenny



Re: lto gimple types and debug info

2008-07-27 Thread Mark Mitchell

Daniel Berlin wrote:


> Then again, I also don't see what the big deal about adding a debug
> info parser is.


OK, yes, we may need to read debug info back in.

I don't see it as a big deal, either -- and I also don't see it as 
locking us into DWARF2.  We can presumably read in any formats we 
care about, so if we want to add a stabs reader, we can do that to support 
stabs platforms.  And, until we have a stabs reader, we can just drop 
debug info on those platforms when doing LTO.  So, we just have to 
design LTO with some abstraction over debug info in mind.


In fact, we could probably treat DWARF as canonical, and have a 
STABS->DWARF input filter and DWARF->STABS output filter, if we like.
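
One way to picture the abstraction: every supported format gets an input filter and an output filter around a single canonical in-memory form, and a format without a filter simply yields no debug info. The sketch below is only that, a sketch. The two "formats" are toy text encodings standing in for real ones, and DebugSymbol, DebugInfoFilter, SeparatedFilter, MakeFilter and Convert are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Format-neutral record that LTO would operate on (hypothetical).
struct DebugSymbol {
  std::string name;
  std::string type_name;
};

// One filter per debug format: raw bytes <-> canonical records.
class DebugInfoFilter {
 public:
  virtual ~DebugInfoFilter() = default;
  virtual std::vector<DebugSymbol> Read(const std::string& raw) const = 0;
  virtual std::string Write(const std::vector<DebugSymbol>& symbols) const = 0;
};

// Toy stand-in for a real format: "name<sep>type;" repeated.
class SeparatedFilter : public DebugInfoFilter {
 public:
  explicit SeparatedFilter(char sep) : sep_(sep) { assert(sep != ';'); }

  std::vector<DebugSymbol> Read(const std::string& raw) const override {
    std::vector<DebugSymbol> out;
    std::size_t pos = 0;
    while (pos < raw.size()) {
      std::size_t end = raw.find(';', pos);
      if (end == std::string::npos) end = raw.size();
      const std::string item = raw.substr(pos, end - pos);
      const std::size_t split = item.find(sep_);
      if (split != std::string::npos)
        out.push_back(DebugSymbol{item.substr(0, split), item.substr(split + 1)});
      pos = end + 1;
    }
    return out;
  }

  std::string Write(const std::vector<DebugSymbol>& symbols) const override {
    std::string out;
    for (const DebugSymbol& s : symbols) out += s.name + sep_ + s.type_name + ';';
    return out;
  }

 private:
  char sep_;
};

// Returns a null pointer for formats that have no filter yet.
std::unique_ptr<DebugInfoFilter> MakeFilter(const std::string& format) {
  if (format == "toy-colon")
    return std::unique_ptr<DebugInfoFilter>(new SeparatedFilter(':'));
  if (format == "toy-equals")
    return std::unique_ptr<DebugInfoFilter>(new SeparatedFilter('='));
  return nullptr;
}

// Input filter -> canonical form -> output filter.
// If either side is unsupported, the debug info is dropped.
std::string Convert(const std::string& raw, const std::string& from,
                    const std::string& to) {
  std::unique_ptr<DebugInfoFilter> reader = MakeFilter(from);
  std::unique_ptr<DebugInfoFilter> writer = MakeFilter(to);
  if (!reader || !writer) return std::string();
  return writer->Write(reader->Read(raw));
}
```

The empty result for an unknown format corresponds to dropping debug info on platforms that have no reader yet.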


I'm not hung up on the exact implementation; all I'm trying to do is 
address the idea that somehow we're going to make it impossible for LTO 
to work with non-DWARF debug info.  As long as we design it 
carefully, there's no reason we should have that limitation.


--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: lto gimple types and debug info

2008-07-27 Thread Daniel Berlin
On Sun, Jul 27, 2008 at 7:41 PM, Daniel Berlin <[EMAIL PROTECTED]> wrote:
> On Sun, Jul 27, 2008 at 3:10 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote:
>> Daniel Berlin wrote:
>>
>>>> I agree that, at least in principle, it should be possible to emit the
>>>> debug info (whether the format is DWARF, Stabs, etc.) once.
>>>
>>> No, you can't.
>>> You would at least have to emit the variables separate from the types
>>> (IE emit debug info twice).
>>
>> Yes, of course; that's what everyone is talking about, I think.  "Emit" here
>> may also mean "cache in memory some place", rather than "write to a file".
>>  It could mean, for example, fill in the data structures we already use for
>> types in dwarf2out.c early, and then throw away the front-end type
>> information
> Okay, then let us go through the options, and you tell me which you
> are suggesting:
>
> If you assume LTO does not have access to the front ends, your options
> look something like this:
>
> When you first compile each file:
>  Emit type debug info
>  Emit LTO
>
> When you LTO them all together
>  Do LTO
>  Emit variable debug info
>
> Under this option, "Emit variable info" requires being able to
> reference the types.  If you've lowered the types,  this is quite
> problematic.  So either you get to store label names for the already
> output type debug info with the variables (so you can still reference
> the type you output properly when you read it back in).  This is
> fairly fragile, to be honest.
> Another downside of this is that you can't eliminate duplicate types
> between units because you don't know which types are really the same
> in the debug info. You have to let the
>
> Another option is:
>
> When you first compile each file:
>  Emit type debug info
>  Emit partial variable debug info (IE add pointers to outputted types
> but not to locations)
>  Emit LTO
>
> When you LTO them all together:
>  Do LTO
>  Parse and update variable debug info to have locations
>  Emit variable debug info
>
> This requires parsing the debug info (in some format, be it DWARF or
> some generic format we've made up) so that you can update the variable
> info's location.
> As a plus, you can easily update the types where you need to.
> Unlike the first option, because you understand the debug info, you
> can now remove all the duplicate types between units without having to
> have the linker do it for you.
>
> Unless  you link in every single frontend to LTO1 (Or move a lot to
> the middle end), there is no way to do the following:
>
> When you first compile each file:
>  Emit LTO
>
> When you LTO them all together:
>  Emit type debug info
>  Do LTO
>  Emit variable debug info
>
> If you don't want to link the frontends, you could also get away with
> moving a lot of junk to the middle end (everything from being able to
> distinguish between class and struct to namespaces, the context of
> lexical blocks) because debug info outputting uses language specific
> nodes all over the place right now.

Sorry, hit send a little too early.

This option also requires being able to serialize language specific
nodes (or again, you move things like namespaces and other language
specific contexts to the middle end), and to stop throwing this stuff
out at the point we do right now.

I'm not sure what most LTO compilers do.
At least when i was at IBM, XLC simply output the debug info in a
generic format (it was part of the definition of wcode), parsed it
back in, updated it, and transformed it into DWARF/etc at the backend.

This is a variant of the second option above.  Again, i'm not saying
it's the best option, and in fact i'm very curious what most compilers
do.


Re: lto gimple types and debug info

2008-07-27 Thread Daniel Berlin
On Sun, Jul 27, 2008 at 7:48 PM, Kenneth Zadeck
<[EMAIL PROTECTED]> wrote:
> Daniel Berlin wrote:
> you may of course be right and this is what we will end up doing, but the
> implications for whopr are not good.   The parser is going to have to work
> in lockstep with the type merger

Why?

You don't want to merge the types in the debuginfo.

You only have to parse the debuginfo types that correspond to types
you've changed in some fashion
(and if you don't want to do that you only have to parse to update the
variable info, which means you don't even have to parse or follow the
DW_AT_type references)
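
To make the "only parse what you need" point concrete, here is a minimal sketch of a callback-style walk in which the client asks only for variable entries, so type entries are skipped and DW_AT_type values are carried along without ever being followed. The DW_TAG_* and DW_AT_* values are the standard DWARF constants; Entry, Attribute, EntryHandler, WalkEntries and LocationUpdater are hypothetical, the entries are assumed to be already decoded, and a location is simplified to a plain address rather than a DWARF location expression.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Standard DWARF tag and attribute codes (a small subset).
enum : std::uint16_t {
  DW_TAG_structure_type = 0x13,
  DW_TAG_base_type = 0x24,
  DW_TAG_variable = 0x34
};
enum : std::uint16_t { DW_AT_location = 0x02, DW_AT_type = 0x49 };

// Hypothetical pre-decoded entry; a real reader would decode these lazily.
struct Attribute {
  std::uint16_t name;
  std::uint64_t value;
};
struct Entry {
  std::uint16_t tag;
  std::string name;
  std::vector<Attribute> attrs;
};

// Callback interface: the client says which tags it wants to see.
class EntryHandler {
 public:
  virtual ~EntryHandler() = default;
  virtual bool WantsTag(std::uint16_t tag) const = 0;
  virtual void OnEntry(Entry& entry) = 0;
};

// Hands over only the requested entries; returns how many were skipped.
std::size_t WalkEntries(std::vector<Entry>& entries, EntryHandler& handler) {
  std::size_t skipped = 0;
  for (Entry& e : entries) {
    if (handler.WantsTag(e.tag)) {
      handler.OnEntry(e);
    } else {
      ++skipped;
    }
  }
  return skipped;
}

// Rewrites DW_AT_location on variables and never looks at DW_AT_type.
class LocationUpdater : public EntryHandler {
 public:
  explicit LocationUpdater(const std::map<std::string, std::uint64_t>& addrs)
      : addrs_(addrs) {}

  bool WantsTag(std::uint16_t tag) const override {
    return tag == DW_TAG_variable;
  }

  void OnEntry(Entry& entry) override {
    assert(entry.tag == DW_TAG_variable);
    auto it = addrs_.find(entry.name);
    if (it == addrs_.end()) return;
    for (Attribute& a : entry.attrs) {
      if (a.name == DW_AT_location) a.value = it->second;
    }
  }

 private:
  std::map<std::string, std::uint64_t> addrs_;
};
```

Handling the types you have changed would be a second handler that asks for the relevant type tags, without touching this one.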


Re: lto gimple types and debug info

2008-07-27 Thread Daniel Berlin
On Sun, Jul 27, 2008 at 7:50 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote:
> Daniel Berlin wrote:
>
>> Then again, I also don't see what the big deal about adding a debug
>> info parser is.
>
> OK, yes, we may need to read debug info back in.
>
> I don't see it as a big deal, either -- and I also don't see it as locking
> us into DWARF2.  We can presumably read in any formats we care about, so if
> we want to add a stabs reader, we can do that to support stabs platforms.
>  And, until we have a stabs reader, we can just drop debug info on those
> platforms when doing LTO.  So, we just have to design LTO with some
> abstraction over debug info in mind.

Yes, this is what i would suggest.

I'll also note that GDB already contains such an abstraction, which
was based on STABS, rather than DWARF.

>
> In fact, we could probably treat DWARF as canonical, and have a STABS->DWARF
> input filter and DWARF->STABS output filter, if we like.

Sure. Again, this input filter is basically what GDB does, converting
DWARF -> internal debuginfo abstraction.