Re: RFC: Adding non-PIC executable support to MIPS
Daniel Jacobowitz <[EMAIL PROTECTED]> writes: > All comments welcome - Richard, especially from you. How would you > like to proceed? I think the first step should be to get your other > binutils/gcc patches merged, including MIPS16 PIC; I used those as a > base. But see a few of the notes for potential problems with those > patches. Yeah, Nick's approved most of the remaining binutils changes (thanks). I haven't applied them yet because of the doubt over whether st_size should be even or odd for ISA-encoded MIPS16 symbols. I don't really have an opinion, so I'll accept a maintainerly decision... Anyway, the gcc patch looks good to me, thanks. The only niggle I could see was that you didn't update the comment for: +/* True if the output file is marked as ".abicalls; .option pic0" + (-call_mixed). This is a GNU extension. */ +#define TARGET_ABICALLS_PIC0 \ + (TARGET_ABSOLUTE_ABICALLS && TARGET_PLT) (That kind of thing was inevitable given the amount of code you had to wade through. I'm impressed if there's really only one instance!) I think the gcc side is good to go, modulo the _mcount thing. As far as binutils goes, I saw a couple of potential problems: (1) The patch adds the following code to _bfd_mips_elf_create_dynamic_sections: + if (htab->use_plts_and_copy_relocs && htab->root.hplt == NULL) + { + h = _bfd_elf_define_linkage_sym (abfd, info, s, + "_PROCEDURE_LINKAGE_TABLE_"); + htab->root.hplt = h; + if (h == NULL) + return FALSE; + h->type = STT_FUNC; + } But use_plts_and_copy_relocs is only set after all input bfds have been read in. (2) The patch sets pointer_equality_needed as follows: @@ -7432,9 +7484,18 @@ _bfd_mips_elf_check_relocs (bfd *abfd, s elf_hash_table (info)->dynobj = dynobj = abfd; break; } - /* Fall through */ + /* Fall through. */ default: + /* Most static relocations require pointer equality, except +for branches. */ + if (h) + h->pointer_equality_needed = TRUE; + /* Fall through. 
*/ + + case R_MIPS_26: + case R_MIPS_PC16: + case R_MIPS16_26: if (h) ((struct mips_elf_link_hash_entry *) h)->has_static_relocs = TRUE; break; But pointer equality is needed for non-call GOT relocations too. I couldn't see anything that explicitly handled that case. I think it would be more robust to set pointer_equality_needed in a separate block, rather than combining it with the existing switch statements. It might then be clearer to set has_nonpic_branches in the new block too, so that you don't need two copies of: if (h && !PIC_OBJECT_P (abfd)) ((struct mips_elf_link_hash_entry *) h)->has_nonpic_branches = TRUE; Some minor nits too: + 0x0399, /* l[wd] $25, %lo(&GOTPLT[0])($28) */ + 0x01d9, /* l[wd] $25, %lo(&GOTPLT[0])($14) */ + 0x01d9, /* l[wd] $25, %lo(&GOTPLT[0])($14) */ These are all fixed as either lw or ld. @@ -1649,13 +1695,16 @@ mips_elf_check_symbols (struct mips_elf_ /* H is a function that might need $25 to be valid on entry. If we're creating a non-PIC relocatable object, mark H as being PIC. If we're creating a non-relocatable object with -non-PIC references to H, make sure that H has an la25 stub. */ +branches to H, make sure that H has an la25 stub. Only +use the stub for branches from non-PIC objects; GCC's +-mno-shared uses branches from PIC objects to functions +which do not require $25. */ if (hti->info->relocatable) { if (!PIC_OBJECT_P (hti->output_bfd)) h->root.other = ELF_ST_SET_MIPS_PIC (h->root.other); } - else if (h->non_pic_ref && !mips_elf_add_la25_stub (hti->info, h)) + else if (h->has_nonpic_branches && !mips_elf_add_la25_stub (hti->info, h)) { hti->error = TRUE; return FALSE; How about something like the following: -non-PIC references to H, make sure that H has an la25 stub. */ +non-PIC branches and jumps to H, make sure that H has an la25 stub. +We specifically ignore branches and jumps from EF_PIC objects, +where the onus is on the compiler or programmer to perform any +necessary initialization of $25. 
Sometimes such initialization +is unnecessary; for example, -mno-shared functions do not use +the incoming value of $25, and may therefore be called directly. */ (Wordsmith as necessary.) The original wording made it sound like we'd created a stub if there were any branches at all, but that the stub would only be used for branches from non-PIC objects. @@ -2928,6 +2977,7 @@ mips_elf_gotplt_index (struct bfd_link_i struct m
Re: [tuples] New memory/time comparison vs trunk
> - The rest of the memory utilization difference is mostly in inlining
>   (240Kb) and SSA update (50Kb).
>
> I think the main focus points should be DSE and trying to get a good
> way of measuring the memory utilization differences. Jan, any
> suggestion?

I've switched the memory tester to tuples now. The full report is in gcc-regressions and the graphs are here:
http://gcc.opensuse.org/memory/graphs/index.html

Overall the footprint improved. For combine.c, the gimplified program now needs 3.6% less memory, and the overall amount of GGC memory referenced at one time decreased by 5% at pretty much all levels. Insn-attrtab is 14% smaller before IPA, and its overall footprint is 12% smaller. Gerald's testcase shows a 2% increase in overall allocation, but the memory footprint is still 3% smaller. This might be increased inlining, but also the aliasing issue below.

The top memory allocations:

rtl.c:269 (copy_rtx)                               1594936:  2.6%
gimple-iterator.c:447 (gsi_insert_after_without_   1714000:  2.8%
stringpool.c:74 (alloc_node)                        761696:  1.2%
gimplify.c:521 (create_tmp_var_raw)                1852032:  3.0%
cselib.c:1155 (cselib_subst_to_values)             1914640:  3.1%
emit-rtl.c:3339 (make_insn_raw)                    1945592:  3.2%
tree-inline.c:3563 (copy_tree_r)                   2450104:  4.0%
tree-ssanames.c:141 (make_ssa_name_fn)             3374640:  5.5%
tree-phinodes.c:157 (allocate_phi_node)            3475264:  5.7%

(this is combine.c at O2)

gimple.c:2098 (gimple_copy)                        8735664:  2.5%
gimple-iterator.c:447 (gsi_insert_after_without_   9558400:  2.8%
gimplify.c:521 (create_tmp_var_raw)                9737784:  2.8%
tree-phinodes.c:157 (allocate_phi_node)           15222104:  4.4%
tree-ssanames.c:141 (make_ssa_name_fn)            16784400:  4.8%
tree-inline.c:4062 (copy_decl_no_change)          18668784:  5.4%
tree-inline.c:3563 (copy_tree_r)                  19876608:  5.7%

(this is Gerald's testcase at O2).

PHI nodes and SSA names have gone up a lot. This is all from virtual operands. Without aliasing at combine.c I get 1.5MB (instead of 3.3MB) for SSA names and 195KB (instead of 3.4MB) for PHI nodes. Perhaps something went wrong with the aliasing heuristics? It seems to show in your scores too:

> ^ tree alias analys   4.47   4.53    1.34%    68751    71085    3.39%
> ^ tree call clobber   0.42   0.59   40.48%      887     1084   22.21%

For space reasons the memory tester no longer saves older reports, so I can't easily compare with mainline, but I can give it a try if this is not an obvious problem somewhere. (It requires me to build stuff by hand, which is not hard.)

Honza
Re: Recent warning regression: no return statement in function returning non-void
On Sun, Jul 27, 2008 at 1:18 PM, Gerald Pfeifer <[EMAIL PROTECTED]> wrote:
> I believe the following happened in the last 48 or so hours; I saw
> this triggered by my nightly Wine builds which in turn use my nightly
> GCC builds. ;-)
>
> For code like the following where we have an infinite loop in a
> non-void function, we now (incorrectly) issue a warning with all
> of -O0, -O1 and -O2 whereas previously we would not:
>
>   void g();
>
>   int f() {
>     for(;;)
>       g();
>   }
>
>   % gccvs -c -Wall x.c
>   x.c: In function 'int f()':
>   x.c:6: warning: no return statement in function returning non-void

I think the warning is perfectly correct. There is no return statement in that function and it does return non-void. The warning doesn't say that the function does return without a value.

Richard.
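To make the distinction in the reply above concrete, here is a minimal sketch in C. It is not from the thread: `next_sample`, `count_until_zero` and `spin` are hypothetical names, and whether a particular compiler version warns on `spin` is an assumption that depends on the front end and options in use. The point is only that a check phrased as "is there any return statement?" treats the two functions differently, even though both contain an infinite loop.

```c
#include <assert.h>

/* Hypothetical input source: counts *remaining down to 0, then keeps
   returning 0.  */
static int next_sample(int *remaining)
{
    assert(remaining != 0);
    return *remaining > 0 ? (*remaining)-- : 0;
}

/* An infinite loop whose only exit is a return statement inside the body.
   The function does contain a return statement, so a purely syntactic
   "no return statement" check has nothing to report here.  */
int count_until_zero(int *remaining)
{
    int n = 0;
    for (;;) {
        if (next_sample(remaining) == 0)
            return n;
        n++;
    }
}

/* Same shape as f() in the report: declared to return int, loops forever,
   and has no return statement at all.  Control never reaches the closing
   brace, so nothing is ever returned without a value.  */
int spin(void (*g)(void))
{
    for (;;)
        g();
}
```

Calling `count_until_zero` with a counter of 3 consumes three non-zero samples and returns 3; `spin` is deliberately never called.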
Recent warning regression: no return statement in function returning non-void
I believe the following happened in the last 48 or so hours; I saw this triggered by my nightly Wine builds which in turn use my nightly GCC builds. ;-)

For code like the following where we have an infinite loop in a non-void function, we now (incorrectly) issue a warning with all of -O0, -O1 and -O2 whereas previously we would not:

  void g();

  int f() {
    for(;;)
      g();
  }

  % gccvs -c -Wall x.c
  x.c: In function 'int f()':
  x.c:6: warning: no return statement in function returning non-void

Looking at the ChangeLog, the changes that look most plausible are the unit-at-a-time ones, though I'm not sure how that would apply since this is independent of optimization level.

Tested on i386-unknown-freebsd6. I verified that i586-suse-linux does not warn with GCC 4.2.1 and am just building current trunk there as well.

Gerald
Build failure with Cygwin
Dear All,

Perhaps this is old news/my fault but I am seeing the following on Cygwin_NT/amd64:

/irun/bin/gcc -g -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wcast-qual -Wold-style-definition -Wc++-compat -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -o cc1-dummy.exe c-lang.o stub-objc.o attribs.o c-errors.o c-lex.o c-pragma.o c-decl.o c-typeck.o c-convert.o c-aux-info.o c-common.o c-opts.o c-format.o c-semantics.o c-ppoutput.o c-cppbuiltin.o c-objc-common.o c-dump.o c-pch.o c-parser.o i386-c.o cygwin2.o msformat-c.o c-gimplify.o tree-mudflap.o c-pretty-print.o c-omp.o dummy-checksum.o \
  main.o tree-browser.o libbackend.a ../libcpp/libcpp.a ../libdecnumber/libdecnumber.a ../libcpp/libcpp.a -lintl -liconv ../libiberty/libiberty.a ../libdecnumber/libdecnumber.a -lmpfr -lgmp
libbackend.a(stringpool.o): In function `ggc_purge_stringpool':
../../trunk/gcc/stringpool.c:192: undefined reference to `_ht_purge'
collect2: ld returned 1 exit status
make[2]: *** [cc1-dummy.exe] Error 1
make[2]: Leaving directory `/svn/build/gcc'
make[1]: *** [install-gcc] Error 2
make[1]: Leaving directory `/svn/build'
make: *** [install] Error 2

Other than the obvious, any suggestions?

Paul
Re: Recent warning regression: no return statement in function returning non-void
> On Sun, Jul 27, 2008 at 1:18 PM, Gerald Pfeifer <[EMAIL PROTECTED]> wrote:
> > I believe the following happened in the last 48 or so hours; I saw
> > this triggered by my nightly Wine builds which in turn use my nightly
> > GCC builds. ;-)
> >
> > For code like the following where we have an infinite loop in a
> > non-void function, we now (incorrectly) issue a warning with all
> > of -O0, -O1 and -O2 whereas previously we would not:
> >
> >   void g();
> >
> >   int f() {
> >     for(;;)
> >       g();
> >   }
> >
> >   % gccvs -c -Wall x.c
> >   x.c: In function 'int f()':
> >   x.c:6: warning: no return statement in function returning non-void
>
> I think the warning is perfectly correct. There is no return statement
> in that function and it does return non-void. The warning doesn't say
> that the function does return without a value.

Also, if you make the function static inline, older GCC versions will trigger the same warning. The problem here is that the original code was relying on the fact that extern inline and static inline functions were the only functions that were ever removed and not compiled. This has not been true since GCC 3.4, when the callgraph code started to eliminate all unused static functions.

Honza
>
> Richard.
GCC trunk frozen for the tuples merge
The trunk is frozen now until after the merge of the tuples branch which will happen tomorrow, Monday Jul 28th. Unfreezing of the trunk will be announced after the fact. Thanks, Richard. -- Richard Guenther <[EMAIL PROTECTED]> Novell / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex
Re: lto gimple types and debug info
David Edelsohn wrote:
> I do not expect LTO (or WHOPR) to work on AIX -- at least not
> without a lot of work on wrappers around the AIX linker. However, I do
> not understand why enhancing GCC to support LTO -- when GCC is run without
> enabling LTO -- requires locking GCC completely into DWARF debugging.

I agree that, at least in principle, it should be possible to emit the debug info (whether the format is DWARF, Stabs, etc.) once. So, I don't see a reason that this makes us a DWARF-only compiler either.

Others have raised the issue of types which are fundamentally transformed by the compiler (such as by removing fields). I think that such opportunities are going to be relatively rare; the global "struct Window" object in a GUI library full of functions taking "struct Window *" parameters probably isn't optimizable in this way. But there will be situations where this is possible and profitable of course.

In that case, I'm not sure that *type* ought to be modified at all, from the debug perspective. To the extent there's still an object of type "struct X" around, its type is still what it was. And other things you might do in a debugger, like ask "What member functions does class X have?", have the same answer no matter the layout chosen by the compiler, including throwing out half the fields and leaving the rest in random registers. For that matter, "print sizeof(X)" should print the same value when debugging optimized code as when debugging unoptimized code, even if the compiler has optimized X away to an empty structure!

There are other things we could do, like mark the *variables* of type X (rather than the type) as having no location, so that you can't print/modify objects that have been optimized in this way. That reflects more accurately the user's view of what has happened; it's not that the type itself is different as much as it is that objects of the type are hard to view.

You could also add a marker to the type that says "optimized madly; debugger should proceed with caution" -- and you could do that without reloading and rewriting the type information. For example, when generating the original type debug info, emit a relocation against "X_optimized_madly" and then provide an appropriate value for the symbol at link time.

I'm curious what we do with SRA at the moment. This is the same sort of problem; do we have any solutions at present?

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
Re: lto gimple types and debug info
On Sun, Jul 27, 2008 at 7:18 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote: > David Edelsohn wrote: > >>I do not expect LTO (or WHOPR) to work on AIX -- at least not >> without a lot of work on wrappers around the AIX linker. However, I do >> not understand why enhancing GCC to support LTO -- when GCC is run without >> enabling LTO -- requires locking GCC completely into DWARF debugging. > > I agree that, at least in principle, it should be possible to emit the debug > info (whether the format is DWARF, Stabs, etc.) once. So, I don't see a > reason that this makes us a DWARF-only compiler either. > > Others have raised the issue of types which are fundamentally transformed by > the compiler (such as by removing fields). I think that such opportunities > are going to be relatively rare; the global "struct Window" object in a GUI > library full of functions taking "struct Window *" parameters probably isn't > optimizable in this way. But there will be situations where this is > possible and profitable of course. > > In that case, I'm not sure that *type* ought to be modified at all, from the > debug perspective. To the extent there's still an object of type "struct X" > around, it's type is still what it was. And other things you might do in a > debugger, like ask "What member functions does class X have?", have the same > answer no matter the layout chosen by the compiler, including throwing out > half the fields and leaving the rest in random registers. For that matter, > "print sizeof(X)" should print the same value when debugging optimized code > as when debugging unoptimized code, even if the compiler has optimized X > away to an empty structure! > > There are other things we could do, like mark the *variables* of type X > (rather than the type) as having no location, so that you can't print/modify > objects that have been optimized in this way. 
That reflects more accurately > the user's view of what has happened; it's not that the type itself is > different as much as it is that objects of the type are hard to view. > > You could also add a marker to the type that says "optimized madly; debugger > should proceed with caution" -- and you could do that without reloading and > rewriting the type information. For example, when generating the original > type debug info, emit a relocation against "X_optimized_madly" and then > providing an approprivate value for the symbol at link time. > > I'm curious what we do with SRA at the moment. This is the same sort of > problem; do we have any solutions at present? We generate variables with names like x$y for struct { int y; } x; - in theory the debugger could "magically" associate a print x.y with x$y. But of course there is no way to express this in the DWARF. Richard.
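For readers who want something concrete to picture, the following is a small sketch in C of the kind of aggregate the SRA remark is about. It is illustrative only: `struct pair`, `scaled_sum` and `pair_size` are hypothetical names, and whether any given compiler actually splits `x` into separate scalars is an assumption that this code cannot observe.

```c
#include <assert.h>
#include <stddef.h>

struct pair { int y; int z; };

/* x never has its address taken and is only accessed field by field, so a
   scalar-replacement pass is free to keep x.y and x.z as two independent
   scalars (the "x$y"-style variables mentioned above).  */
int scaled_sum(int a, int b)
{
    struct pair x;
    x.y = a * 2;
    x.z = b * 3;
    return x.y + x.z;
}

/* The size of the type is fixed by its declaration, not by how individual
   objects of that type end up being optimized.  */
size_t pair_size(void)
{
    assert(sizeof(struct pair) >= 2 * sizeof(int));
    return sizeof(struct pair);
}
```

A debugger asked to `print x.y` inside `scaled_sum` would need some way to map the field back to wherever the scalar lives, which is the gap being discussed.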
Re: lto gimple types and debug info
On Sun, Jul 27, 2008 at 11:09 AM, Richard Guenther <[EMAIL PROTECTED]> wrote: > We generate variables with names like x$y for struct { int y; } x; - in theory > the debugger could "magically" associate a print x.y with x$y. But of course > there is no way to express this in the DWARF. Actually there is a way to express this in Dwarf2, using DW_OP_piece. See the thread at http://gcc.gnu.org/ml/gcc/2005-01/msg00080.html for more information. -- Pinski
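As a rough sketch of what such a location description looks like at the byte level, the C code below builds a DWARF expression of the form "register, piece, register, piece". The opcode values `DW_OP_reg0` (0x50) and `DW_OP_piece` (0x93) are the ones assigned by the DWARF standard; everything else (the helper names, the choice of registers 3 and 4, and the two 4-byte fields) is a hypothetical example and not what GCC emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode values as assigned by the DWARF standard.  */
enum { DW_OP_reg0 = 0x50, DW_OP_piece = 0x93 };

/* Append VALUE in unsigned LEB128 form; returns the new length.  */
static size_t put_uleb128(uint8_t *buf, size_t len, uint64_t value)
{
    do {
        uint8_t byte = value & 0x7f;
        value >>= 7;
        if (value != 0)
            byte |= 0x80;           /* more bytes follow */
        buf[len++] = byte;
    } while (value != 0);
    return len;
}

/* Append "the next SIZE bytes of the object live in register REGNO".  */
static size_t put_reg_piece(uint8_t *buf, size_t len,
                            unsigned regno, uint64_t size)
{
    assert(regno < 32);             /* DW_OP_reg0 .. DW_OP_reg31 only */
    buf[len++] = (uint8_t)(DW_OP_reg0 + regno);
    buf[len++] = DW_OP_piece;
    return put_uleb128(buf, len, size);
}

/* Hypothetical layout for struct { int y; int z; } x after scalarization:
   x.y in register 3, x.z in register 4, 4 bytes each.  BUF must have room
   for at least 6 bytes.  */
size_t describe_split_struct(uint8_t *buf)
{
    size_t len = 0;
    len = put_reg_piece(buf, len, 3, 4);
    len = put_reg_piece(buf, len, 4, 4);
    return len;
}
```

With this encoding the variable keeps its original struct type and only its location is split, which fits the earlier suggestion in the thread to leave the type alone.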
Re: lto gimple types and debug info
On Sun, Jul 27, 2008 at 1:18 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote: > David Edelsohn wrote: > >>I do not expect LTO (or WHOPR) to work on AIX -- at least not >> without a lot of work on wrappers around the AIX linker. However, I do >> not understand why enhancing GCC to support LTO -- when GCC is run without >> enabling LTO -- requires locking GCC completely into DWARF debugging. > > I agree that, at least in principle, it should be possible to emit the debug > info (whether the format is DWARF, Stabs, etc.) once. No, you can't. You would at least have to emit the variables separate from the types (IE emit debug info twice). > So, I don't see a > reason that this makes us a DWARF-only compiler either. > > Others have raised the issue of types which are fundamentally transformed by > the compiler (such as by removing fields). I think that such opportunities > are going to be relatively rare; the global "struct Window" object in a GUI > library full of functions taking "struct Window *" parameters probably isn't > optimizable in this way. But there will be situations where this is > possible and profitable of course. > > In that case, I'm not sure that *type* ought to be modified at all, from the > debug perspective. To the extent there's still an object of type "struct X" > around, it's type is still what it was. Uh, except that if you only write things out once, and have already written out the variables, the variable no longer has the correct type if you've rewritten the type, and if we've already emitted debug info, it won't display properly anymore (since the locations of data members the type specifies will now be incorrect). So are you suggesting we emit debug info at multiple times
Re: lto gimple types and debug info
Andrew Pinski wrote:
> Actually there is a way to express this in Dwarf2, using DW_OP_piece.
> See the thread at http://gcc.gnu.org/ml/gcc/2005-01/msg00080.html for
> more information.

And if that's not sufficient, we can of course add extensions to DWARF that do represent it -- and that would be better than modifying the debugger to make assumptions about magic variable names.

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
Re: lto gimple types and debug info
Daniel Berlin wrote:
>> I agree that, at least in principle, it should be possible to emit the debug
>> info (whether the format is DWARF, Stabs, etc.) once.
>
> No, you can't.
> You would at least have to emit the variables separate from the types
> (IE emit debug info twice).

Yes, of course; that's what everyone is talking about, I think. "Emit" here may also mean "cache in memory some place", rather than "write to a file". It could mean, for example, fill in the data structures we already use for types in dwarf2out.c early, and then throw away the front-end type information.

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
Re: lto gimple types and debug info
Daniel Berlin wrote:
> On Sun, Jul 27, 2008 at 1:18 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote:
>> David Edelsohn wrote:
>>
>>> I do not expect LTO (or WHOPR) to work on AIX -- at least not
>>> without a lot of work on wrappers around the AIX linker. However, I do
>>> not understand why enhancing GCC to support LTO -- when GCC is run without
>>> enabling LTO -- requires locking GCC completely into DWARF debugging.
>>
>> I agree that, at least in principle, it should be possible to emit the debug
>> info (whether the format is DWARF, Stabs, etc.) once.
>
> No, you can't.
> You would at least have to emit the variables separate from the types
> (IE emit debug info twice).
>
>> So, I don't see a reason that this makes us a DWARF-only compiler either.
>>
>> Others have raised the issue of types which are fundamentally transformed by
>> the compiler (such as by removing fields). I think that such opportunities
>> are going to be relatively rare; the global "struct Window" object in a GUI
>> library full of functions taking "struct Window *" parameters probably isn't
>> optimizable in this way. But there will be situations where this is
>> possible and profitable of course.
>>
>> In that case, I'm not sure that *type* ought to be modified at all, from the
>> debug perspective. To the extent there's still an object of type "struct X"
>> around, it's type is still what it was.
>
> Uh, except that if you only write things out once, and have already
> written out the variables, the variable no longer has the correct type
> if you've rewritten the type, and if we've already emitted debug info,
> it won't display properly anymore (since the locations of data members
> the type specifies will now be incorrect).
>
> So are you suggesting we emit debug info at multiple times

It is my guess that we are still going to have to generate the debugging info for the variables late, if for no other reason than that things like stack offsets are not set until then. If that is true, we could possibly just generate a new type and abandon the first one.

kenny
Re: RFC: Adding non-PIC executable support to MIPS
Richard Sandiford wrote:
> Daniel Jacobowitz <[EMAIL PROTECTED]> writes:
>> All comments welcome - Richard, especially from you. How would you
>> like to proceed? I think the first step should be to get your other
>> binutils/gcc patches merged, including MIPS16 PIC; I used those as a
>> base. But see a few of the notes for potential problems with those
>> patches.
>
> Yeah, Nick's approved most of the remaining binutils changes (thanks).
> I haven't applied them yet because of the doubt over whether st_size
> should be even or odd for ISA-encoded MIPS16 symbols. I don't really
> have an opinion, so I'll accept a maintainerly decision...

[I'm not sure if this is a helpful suggestion or not, so feel free to ignore it if it's not.]

I would suggest that st_size be the actual size of the function, as it lives in memory. A test of its start/end location is "could I stick a random data byte there and have it affect the function". For example, for a Thumb function whose ISA address is "0x0001", I would consider for size purposes that it starts at "0x", since altering that byte at run-time would change the meaning of the function.

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
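The suggestion above can be sketched in a few lines of C. This is only an illustration under stated assumptions: it assumes the ISA mode is carried in bit 0 of the symbol value (as for Thumb and MIPS16 symbols), that `size` is the number of bytes the function occupies in memory, and it uses made-up helper names and made-up addresses rather than anything from binutils.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the first byte of code: the ISA-encoded address with the
   mode bit (bit 0) cleared.  */
uint32_t code_start(uint32_t isa_addr)
{
    return isa_addr & ~(uint32_t)1;
}

/* One-past-the-end address, taking SIZE as the in-memory size.  */
uint32_t code_end(uint32_t isa_addr, uint32_t size)
{
    uint32_t start = code_start(isa_addr);
    assert(size <= UINT32_MAX - start);   /* no wrap-around */
    return start + size;
}

/* The test proposed above: would overwriting the byte at ADDR change the
   function?  */
int byte_belongs_to_function(uint32_t isa_addr, uint32_t size, uint32_t addr)
{
    return addr >= code_start(isa_addr) && addr < code_end(isa_addr, size);
}
```

For a hypothetical 8-byte function with ISA address 0x1001, the bytes at 0x1000 through 0x1007 belong to it and the byte at 0x1008 does not, so under this reading the size stays even while only the symbol value is odd.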
Re: lto gimple types and debug info
On Sun, Jul 27, 2008 at 3:10 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote:
> Daniel Berlin wrote:
>
>>> I agree that, at least in principle, it should be possible to emit the
>>> debug info (whether the format is DWARF, Stabs, etc.) once.
>>
>> No, you can't.
>> You would at least have to emit the variables separate from the types
>> (IE emit debug info twice).
>
> Yes, of course; that's what everyone is talking about, I think. "Emit" here
> may also mean "cache in memory some place", rather than "write to a file".
> It could mean, for example, fill in the data structures we already use for
> types in dwarf2out.c early, and then throw away the front-end type
> information

Okay, then let us go through the options, and you tell me which you are suggesting:

If you assume LTO does not have access to the front ends, your options look something like this:

When you first compile each file:
Emit type debug info
Emit LTO

When you LTO them all together:
Do LTO
Emit variable debug info

Under this option, "Emit variable info" requires being able to reference the types. If you've lowered the types, this is quite problematic. So either you get to store label names for the already output type debug info with the variables (so you can still reference the type you output properly when you read it back in). This is fairly fragile, to be honest.
Another downside of this is that you can't eliminate duplicate types between units because you don't know which types are really the same in the debug info. You have to let the

Another option is:

When you first compile each file:
Emit type debug info
Emit partial variable debug info (IE add pointers to outputted types but not to locations)
Emit LTO

When you LTO them all together:
Do LTO
Parse and update variable debug info to have locations
Emit variable debug info

This requires parsing the debug info (in some format, be it DWARF or some generic format we've made up) so that you can update the variable info's location.
As a plus, you can easily update the types where you need to.
Unlike the first option, because you understand the debug info, you can now remove all the duplicate types between units without having to have the linker do it for you.

Unless you link in every single frontend to LTO1 (or move a lot to the middle end), there is no way to do the following:

When you first compile each file:
Emit LTO

When you LTO them all together:
Emit type debug info
Do LTO
Emit variable debug info

If you don't want to link the frontends, you could also get away with moving a lot of junk to the middle end (everything from being able to distinguish between class and struct to namespaces, the context of lexical blocks) because debug info outputting uses language specific nodes all over the place right now.

Unless I've missed something, our least fragile and, IMHO, best option requires parsing back in debug info.
It is certainly *possible* to get debug info without parsing the debug info back in.
Then again, I also don't see what the big deal about adding a debug info parser is. It's not like they are all that large.

[EMAIL PROTECTED]:/home/dannyb/util/debuginfo]> wc -l bytereader.* bytereader-inl.h dwarf2enums.h dwarf2reader*
   40 bytereader.cc
  110 bytereader.h
  118 bytereader-inl.h
  465 dwarf2enums.h
  797 dwarf2reader.cc
  373 dwarf2reader.h
 1903 total

(This includes both a callback style reader that simply hands you things you tell it to, as well as something that can read back into a format much like we use during debug info output)
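To give a feel for why such a reader can stay small, here is a minimal sketch in C of the lowest-level primitive a DWARF byte reader needs: decoding an unsigned LEB128 value from a bounded buffer. It is not taken from the bytereader files listed above; `struct byte_cursor` and `read_uleb128` are hypothetical names, and a real reader would add signed LEB128, fixed-width reads and endianness handling on top.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over a debug section held in memory.  */
struct byte_cursor {
    const uint8_t *p;     /* next byte to read */
    const uint8_t *end;   /* one past the last valid byte */
};

/* Decode one unsigned LEB128 value and advance the cursor.
   Returns 1 on success, 0 if the input is truncated or the encoding is
   longer than a 64-bit value can hold.  */
int read_uleb128(struct byte_cursor *c, uint64_t *out)
{
    uint64_t result = 0;
    unsigned shift = 0;

    assert(c != NULL && out != NULL);
    while (c->p < c->end) {
        uint8_t byte = *c->p++;
        if (shift >= 64)
            return 0;                         /* over-long encoding */
        result |= (uint64_t)(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0) {             /* high bit clear: last byte */
            *out = result;
            return 1;
        }
        shift += 7;
    }
    return 0;                                 /* ran off the end */
}
```

The byte sequence 0xE5 0x8E 0x26 decodes to 624485, which is the worked example commonly used for this encoding. Attribute forms, abbreviation codes and location operands are all built from a handful of primitives like this one.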
Re: lto gimple types and debug info
Daniel Berlin wrote: On Sun, Jul 27, 2008 at 3:10 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote: Daniel Berlin wrote: I agree that, at least in principle, it should be possible to emit the debug info (whether the format is DWARF, Stabs, etc.) once. No, you can't. You would at least have to emit the variables separate from the types (IE emit debug info twice). Yes, of course; that's what everyone is talking about, I think. "Emit" here may also mean "cache in memory some place", rather than "write to a file". It could mean, for example, fill in the data structures we already use for types in dwarf2out.c early, and then throw away the front-end type information Okay, then let us go through the options, and you tell me which you are suggesting: If you assume LTO does not have access to the front ends, your options look something like this: When you first compile each file: Emit type debug info Emit LTO When you LTO them all together Do LTO Emit variable debug info Under this option, "Emit variable info" requires being able to reference the types. If you've lowered the types, this is quite problematic. So either you get to store label names for the already output type debug info with the variables (so you can still reference the type you output properly when you read it back in). This is fairly fragile, to be honest. Another downside of this is that you can't eliminate duplicate types between units because you don't know which types are really the same in the debug info. You have to let the Another option is: When you first compile each file: Emit type debug info Emit partial variable debug info (IE add pointers to outputted types but not to locations) Emit LTO When you LTO them all together: Do LTO Parse and update variable debug info to have locations Emit variable debug info This requires parsing the debug info (in some format, be it DWARF or some generic format we've made up) so that you can update the variable info's location. 
As a plus, you can easily update the types where you need to. Unlike the first option, because you understand the debug info, you can now remove all the duplicate types between units without having to have the linker do it for you. Unless you link in every single frontend to LTO1 (Or move a lot to the middle end), there is no way to do the following: When you first compile each file: Emit LTO When you LTO them all together: Emit type debug info Do LTO Emit variable debug info If you don't want to link the frontends, you could also get away with moving a lot of junk to the middle end (everything from being able to distinguish between class and struct to namespaces, the context of lexical blocks) because debug info outputting uses language specific nodes all over the place right now. Unless i've missed something, our least fragile and IMHO, best option requires parsing back in debug info. It is certainly *possible* to get debug info without parsing the debug info back in. Then again, I also don't see what the big deal about adding a debug info parser is. It's not like they are all that large. [EMAIL PROTECTED]:/home/dannyb/util/debuginfo]> wc -l bytereader.* bytereader-inl.h dwarf2enums.h dwarf2reader* 40 bytereader.cc 110 bytereader.h 118 bytereader-inl.h 465 dwarf2enums.h 797 dwarf2reader.cc 373 dwarf2reader.h 1903 total (This includes both a callback style reader that simply hands you thinks you tell it to, as well as something that can read back into a format much like we use during debug info output)

You may of course be right and this is what we will end up doing, but the implications for whopr are not good. The parser is going to have to work in lockstep with the type merger, and all of the debug sections for all of the .o files are going to have to be parsed in lto1. My prediction is that this is going to be a bottleneck.

kenny
Re: lto gimple types and debug info
Daniel Berlin wrote:
> Then again, I also don't see what the big deal about adding a debug info
> parser is.

OK, yes, we may need to read debug info back in. I don't see it as a big deal, either -- and I also don't see it as locking us into DWARF2. We can presumably read in any formats we care about, so if we want to add a stabs reader, we can do that to support stabs platforms. And, until we have a stabs reader, we can just drop debug info on those platforms when doing LTO.

So, we just have to design LTO with some abstraction over debug info in mind. In fact, we could probably treat DWARF as canonical, and have a STABS->DWARF input filter and DWARF->STABS output filter, if we like.

I'm not hung up on the exact implementation; all I'm trying to do is address the idea that somehow we're going to make it impossible for LTO to work with non-DWARF debug info. As long as we design it carefully, there's no reason we should have that limitation.

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
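One way to picture the abstraction described above is a small table of per-format readers and writers around a single canonical in-memory form. The C sketch below is purely hypothetical: none of these names exist in GCC, the reader and writer bodies are stubs, and the "stabs has a writer but no reader yet" entry simply mirrors the fallback of dropping debug info under LTO until a reader exists.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical canonical in-memory form; the fields are placeholders.  */
struct debug_unit { size_t n_types; size_t n_vars; };

/* One entry per debug format.  A NULL reader means "no reader yet".  */
struct debug_format {
    const char *name;
    int (*read)(const unsigned char *bytes, size_t len,
                struct debug_unit *out);
    int (*write)(const struct debug_unit *in);
};

/* Stub reader: real parsing is out of scope for this sketch.  */
static int dwarf_read(const unsigned char *bytes, size_t len,
                      struct debug_unit *out)
{
    (void)bytes; (void)len;
    out->n_types = 0;
    out->n_vars = 0;
    return 1;
}

/* Stub writers: pretend output always succeeds.  */
static int dwarf_write(const struct debug_unit *in) { (void)in; return 1; }
static int stabs_write(const struct debug_unit *in) { (void)in; return 1; }

static const struct debug_format formats[] = {
    { "dwarf", dwarf_read, dwarf_write },
    { "stabs", NULL,       stabs_write },
};

/* Look a format up by name; returns NULL if it is unknown.  */
const struct debug_format *find_format(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof formats / sizeof formats[0]; i++)
        if (strcmp(formats[i].name, name) == 0)
            return &formats[i];
    return NULL;
}

/* Returns 1 if debug info was loaded into OUT, 0 if it must be dropped
   because the format is unknown or has no reader.  */
int load_debug_info(const char *format, const unsigned char *bytes,
                    size_t len, struct debug_unit *out)
{
    const struct debug_format *f = find_format(format);
    assert(out != NULL);
    if (f == NULL || f->read == NULL)
        return 0;
    return f->read(bytes, len, out);
}
```

In this arrangement the link-time optimizer itself only ever sees `struct debug_unit`, so adding a format means adding a table entry rather than touching the LTO code.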
Re: lto gimple types and debug info
On Sun, Jul 27, 2008 at 7:41 PM, Daniel Berlin <[EMAIL PROTECTED]> wrote:
> On Sun, Jul 27, 2008 at 3:10 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote:
>> Daniel Berlin wrote:
>>
>>>> I agree that, at least in principle, it should be possible to emit
>>>> the debug info (whether the format is DWARF, Stabs, etc.) once.
>>>
>>> No, you can't.
>>> You would at least have to emit the variables separate from the types
>>> (IE emit debug info twice).
>>
>> Yes, of course; that's what everyone is talking about, I think. "Emit" here
>> may also mean "cache in memory some place", rather than "write to a file".
>> It could mean, for example, fill in the data structures we already use for
>> types in dwarf2out.c early, and then throw away the front-end type
>> information
>
> Okay, then let us go through the options, and you tell me which you
> are suggesting:
>
> If you assume LTO does not have access to the front ends, your options
> look something like this:
>
> When you first compile each file:
>   Emit type debug info
>   Emit LTO
>
> When you LTO them all together:
>   Do LTO
>   Emit variable debug info
>
> Under this option, "Emit variable info" requires being able to
> reference the types. If you've lowered the types, this is quite
> problematic. So either you get to store label names for the already
> output type debug info with the variables (so you can still reference
> the type you output properly when you read it back in). This is
> fairly fragile, to be honest.
> Another downside of this is that you can't eliminate duplicate types
> between units because you don't know which types are really the same
> in the debug info. You have to let the
>
> Another option is:
>
> When you first compile each file:
>   Emit type debug info
>   Emit partial variable debug info (IE add pointers to outputted types
>   but not to locations)
>   Emit LTO
>
> When you LTO them all together:
>   Do LTO
>   Parse and update variable debug info to have locations
>   Emit variable debug info
>
> This requires parsing the debug info (in some format, be it DWARF or
> some generic format we've made up) so that you can update the variable
> info's location.
> As a plus, you can easily update the types where you need to.
> Unlike the first option, because you understand the debug info, you
> can now remove all the duplicate types between units without having to
> have the linker do it for you.
>
> Unless you link in every single frontend to LTO1 (Or move a lot to
> the middle end), there is no way to do the following:
>
> When you first compile each file:
>   Emit LTO
>
> When you LTO them all together:
>   Emit type debug info
>   Do LTO
>   Emit variable debug info
>
> If you don't want to link the frontends, you could also get away with
> moving a lot of junk to the middle end (everything from being able to
> distinguish between class and struct to namespaces, the context of
> lexical blocks) because debug info outputting uses language specific
> nodes all over the place right now.

Sorry, hit send a little too early.

This option also requires being able to serialize language-specific nodes (or again, you move things like namespaces and other language-specific contexts to the middle end), and to stop throwing this stuff out at the point we do right now.

I'm not sure what most LTO compilers do. At least when I was at IBM, XLC simply output the debug info in a generic format (it was part of the definition of wcode), parsed it back in, updated it, and transformed it into DWARF/etc at the backend. This is a variant of the second option above.
Again, I'm not saying it's the best option, and in fact I'm very curious what most compilers do.
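
The duplicate-type point in the second option can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: `TypeRecord`, `VarRecord`, and `merge_units` are hypothetical names, not anything in GCC, and two types are treated as the same simply when their records compare equal, which is far cruder than real type equivalence.

```python
from dataclasses import dataclass


# Hypothetical, format-neutral records; none of these names come from GCC.
@dataclass(frozen=True)
class TypeRecord:
    name: str
    fields: tuple  # tuple of (field_name, field_type_name) pairs


@dataclass
class VarRecord:
    name: str
    type_id: int  # index into the owning unit's type list


def merge_units(units):
    """Merge per-unit (types, variables) pairs, keeping one copy of each type.

    Assumption: two types are identical when their records compare equal.
    Variables are re-pointed at the merged type list.
    """
    merged_types = []
    index_of = {}  # TypeRecord -> index in merged_types
    merged_vars = []
    for types, variables in units:
        remap = {}  # unit-local type index -> merged type index
        for old_id, t in enumerate(types):
            if t not in index_of:
                index_of[t] = len(merged_types)
                merged_types.append(t)
            remap[old_id] = index_of[t]
        for v in variables:
            merged_vars.append(VarRecord(v.name, remap[v.type_id]))
    return merged_types, merged_vars
```

The remapping step is only possible because the merger understands the records it reads; with opaque, already-emitted type info (the first option) there would be nothing to compare.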
Re: lto gimple types and debug info
On Sun, Jul 27, 2008 at 7:48 PM, Kenneth Zadeck <[EMAIL PROTECTED]> wrote:
> Daniel Berlin wrote:
>>
> you may of course be right and this is what we will end up doing, but the
> implications for whopr are not good. The parser is going to have to work
> in lockstep with the type merger

Why?
You don't want to merge the types in the debuginfo.
You only have to parse the debuginfo types that correspond to types you've changed in some fashion (and if you don't want to do that, you only have to parse to update the variable info, which means you don't even have to parse or follow the DW_AT_type references).
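
A minimal Python sketch of the cheaper path described here, where only variable entries are touched and type references stay opaque. The `Record` layout, the `fill_locations` helper, and the location strings are all assumptions made for illustration; they do not model real DWARF encoding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    kind: str                       # "type" or "variable"
    name: str
    type_ref: Optional[int] = None  # opaque reference, never followed here
    location: Optional[str] = None  # unknown until after LTO


def fill_locations(records, final_locations):
    """Set locations on variable records; return how many were updated."""
    updated = 0
    for rec in records:
        if rec.kind != "variable":
            continue  # type records are skipped without being interpreted
        if rec.name in final_locations:
            rec.location = final_locations[rec.name]
            updated += 1
    return updated
```

Variables that LTO removed simply have no entry in `final_locations` and keep an empty location in this sketch; what a real implementation should emit for them is a separate question.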
Re: lto gimple types and debug info
On Sun, Jul 27, 2008 at 7:50 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote:
> Daniel Berlin wrote:
>
>> Then again, I also don't see what the big deal about adding a debug
>> info parser is.
>
> OK, yes, we may need to read debug info back in.
>
> I don't see it as a big deal, either -- and I also don't see it as locking
> us into DWARF2. We can presumably read in any formats we care about, so if
> we want to add a stabs reader, we can do that to support stabs platforms.
> And, until we have a stabs reader, we can just drop debug info on those
> platforms when doing LTO. So, we just have to design LTO with some
> abstraction over debug info in mind.

Yes, this is what I would suggest.
I'll also note that GDB already contains such an abstraction, which was based on STABS, rather than DWARF.

> In fact, we could probably treat DWARF as canonical, and have a STABS->DWARF
> input filter and DWARF->STABS output filter, if we like.

Sure. Again, this input filter is basically what GDB does, converting DWARF -> internal debuginfo abstraction.
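
The input-filter/output-filter idea might be sketched in Python as below. Everything here is an assumption for illustration: the canonical form is just a list of dicts, and the "stabs-like" and "dwarf-like" text syntaxes are invented toys that bear no relation to the real STABS or DWARF encodings, or to GDB's internal representation.

```python
def read_stabs_like(text):
    """Parse toy lines of the form 'name:type' into canonical entries."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, type_name = line.partition(":")
        entries.append({"name": name, "type": type_name})
    return entries


def read_dwarf_like(text):
    """Parse toy lines of the form 'variable name=N type=T'."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        fields = dict(p.split("=", 1) for p in parts[1:])
        entries.append({"name": fields["name"], "type": fields["type"]})
    return entries


def write_stabs_like(entries):
    return "\n".join(f"{e['name']}:{e['type']}" for e in entries)


def write_dwarf_like(entries):
    return "\n".join(
        f"variable name={e['name']} type={e['type']}" for e in entries
    )


# Each format only needs a reader and a writer for the canonical form.
READERS = {"stabs": read_stabs_like, "dwarf": read_dwarf_like}
WRITERS = {"stabs": write_stabs_like, "dwarf": write_dwarf_like}


def convert(text, src, dst):
    """Input filter followed by output filter, via the canonical form."""
    return WRITERS[dst](READERS[src](text))
```

With this shape, the LTO-side code would only ever see the canonical entries, and supporting another format would mean adding one reader and one writer rather than touching the core.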