https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87362

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
So I tried debugging using an LTO-bootstrapped cc1.  Profiling gdb for a simple

gdb ./cc1
(gdb) b do_rpo_vn
(gdb) q

yields

Samples: 2K of event 'instructions', Event count (approx.): 45695722362         
Overhead  Command  Shared Object        Symbol                                  
   8.32%  gdb      gdb                  [.] read_attribute_value
   5.78%  gdb      gdb                  [.] dwarf2_attr
   5.10%  gdb      gdb                  [.] load_partial_dies
   4.23%  gdb      gdb                  [.] cp_find_first_component_aux
   4.10%  gdb      gdb                  [.] partial_die_info::read
   3.55%  gdb      gdb                  [.] htab_find_slot_with_hash
   3.11%  gdb      gdb                  [.] get_objfile_arch
   2.98%  gdb      gdb                  [.] peek_die_abbrev
   2.88%  gdb      gdb                  [.] cp_canonicalize_string

or with a callgraph

Samples: 1K of event 'instructions', Event count (approx.): 37206022209
  Children      Self  Command  Shared Object        Symbol
+   91.92%     0.00%  gdb      gdb                  [.] gdb_main
+   91.47%     0.00%  gdb      gdb                  [.] main
+   91.42%     0.00%  gdb      libc-2.22.so         [.] __libc_start_main
+   91.35%     0.00%  gdb      gdb                  [.] catch_command_errors
+   91.30%     0.00%  gdb      gdb                  [.] _start
+   85.40%     0.00%  gdb      gdb                  [.] symbol_file_add_main_ad
+   85.40%     0.00%  gdb      gdb                  [.] symbol_file_add_main
+   55.17%     0.00%  gdb      gdb                  [.] psym_lookup_symbol
+   55.13%     0.00%  gdb      gdb                  [.] psymtab_to_symtab
+   55.13%     0.00%  gdb      gdb                  [.] dwarf2_read_symtab
+   55.13%     0.00%  gdb      gdb                  [.] dw2_do_instantiate_symt
+   55.06%     0.00%  gdb      gdb                  [.] lookup_symbol_in_objfil
+   55.02%     0.00%  gdb      gdb                  [.] lookup_global_symbol
+   55.02%     0.00%  gdb      gdb                  [.] default_iterate_over_ob
+   55.02%     0.00%  gdb      gdb                  [.] lookup_symbol_global_it
+   55.00%     0.00%  gdb      gdb                  [.] lookup_symbol_aux
+   54.99%     0.00%  gdb      gdb                  [.] basic_lookup_symbol_non
+   54.94%     0.00%  gdb      gdb                  [.] lookup_symbol_in_langua
+   54.83%     0.00%  gdb      gdb                  [.] lookup_symbol
+   54.77%     0.00%  gdb      gdb                  [.] set_initial_language
+   43.75%     0.49%  gdb      gdb                  [.] process_die

but that doesn't look too useful.

Note that startup / breakpointing isn't as fast as non-LTOed cc1 but it's
still usable.  I notice that while .debug_ranges is quite large the
.debug_aranges section is small.  I wonder what hoops gdb needs to jump
through to get at the entry address for main() - I can imagine that because
the late LTO debug only contains the ranges attribute but not DW_AT_name
gdb has to follow all LTO debug DIE abstract origins.  Since those
abstract origins are in DW_TAG_imported_unit imported CUs it may (hopefully
lazily!) need to parse those when an abstract origin refers to a DIE
within them.
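
To make the lazy variant of that idea concrete, here is a toy model in C.
It is only a sketch under stated assumptions: struct toy_unit, struct toy_die,
toy_load_unit and toy_die_name are hypothetical names made up for this
example and are not gdb or GCC data structures.  It shows a name lookup that
follows abstract origins and "parses" a unit only when a DIE inside it is
actually visited.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a compilation unit (hypothetical, not gdb's type).  */
struct toy_unit {
    int loaded;                       /* 0 until the unit has been "parsed" */
};

/* Toy stand-in for a DIE (hypothetical, not gdb's type).  */
struct toy_die {
    const char *name;                 /* DW_AT_name, or NULL if absent */
    struct toy_die *abstract_origin;  /* DW_AT_abstract_origin target, or NULL */
    struct toy_unit *unit;            /* unit that contains this DIE */
};

/* Number of units that had to be parsed so far.  */
static int toy_units_loaded = 0;

/* Parse a unit on first use only.  */
static void toy_load_unit(struct toy_unit *u)
{
    if (!u->loaded) {
        u->loaded = 1;
        toy_units_loaded++;
    }
}

/* Return the name of DIE, following abstract origins until a DIE with
   a name is found.  Only units that are actually visited get loaded.
   Returns NULL if no DIE in the chain carries a name.  */
const char *toy_die_name(struct toy_die *die)
{
    assert(die != NULL);
    while (die != NULL) {
        toy_load_unit(die->unit);
        if (die->name != NULL)
            return die->name;
        die = die->abstract_origin;
    }
    return NULL;
}
```

With a nameless concrete DIE in a late unit whose abstract origin is a DIE
named "main" in an early unit, toy_die_name loads exactly those two units
and leaves any unrelated unit untouched.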

At least I don't see something like a "symbol table" referring to the late LTO
DIEs in DWARF.

Maybe if we're lucky and main() is the very first DIE we run into startup
would be faster.

Of course looking at the startup / breakpoint differences between LTO
and non-LTO might lead to a better understanding of things here.  For
example it might be possible to optimize the poking at DW_AT_name
via an abstract origin _without_ needing to pull in all of the imported
unit when the lookup comes from this kind of search.

When using callgrind it seems that the whole complication comes in via
symbol_file_add_main -> ... -> read_symbols -> ... -> read_psyms ->
dwarf2_build_psymtabs as expected.  So somehow avoiding pulling in all
the early LTO CUs would be the thing to do(?) - maybe we can add
DW_AT_linkage_name to the late generated DIEs to help gdb (we seem
to not do that).  In fact we seem to add them to the early DIEs
(probably needed for TYPE_DECLs).
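
As a rough illustration of why a linkage name on the late DIEs could help,
here is a toy lookup in C.  It is a sketch only: struct toy_subprogram and
toy_find_entry are hypothetical names invented for this example, and I am
not claiming gdb's partial symbol reader works this way.  The lookup matches
by linkage name when one is present and counts how many DIEs would instead
need their abstract origin followed.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a late (concrete) DW_TAG_subprogram DIE
   (hypothetical, not a GCC or gdb type).  */
struct toy_subprogram {
    const char *linkage_name;  /* DW_AT_linkage_name, or NULL if absent */
    unsigned long low_pc;      /* entry address taken from the concrete DIE */
};

/* Search DIES[0..N-1] for NAME using only linkage names.
   On a match, store the entry address in *PC and return 1.
   *ORIGIN_LOOKUPS is set to the number of visited DIEs that carried
   no linkage name, i.e. DIEs whose abstract origin would have to be
   followed (possibly into an imported unit) to learn their name.
   Returns 0 if no DIE matched by linkage name.  */
int toy_find_entry(const struct toy_subprogram *dies, size_t n,
                   const char *name, unsigned long *pc,
                   size_t *origin_lookups)
{
    *origin_lookups = 0;
    for (size_t i = 0; i < n; i++) {
        if (dies[i].linkage_name == NULL) {
            /* No name here: a real reader would need the origin DIE.  */
            (*origin_lookups)++;
            continue;
        }
        if (strcmp(dies[i].linkage_name, name) == 0) {
            *pc = dies[i].low_pc;
            return 1;
        }
    }
    return 0;
}
```

In this model, when every DIE has a linkage name the search for "main"
succeeds with zero origin lookups; when none has one, every DIE visited
counts as an origin lookup and the search by linkage name alone fails.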

I'm trying a hack like

Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c     (revision 264418)
+++ gcc/dwarf2out.c     (working copy)
@@ -6018,6 +6018,9 @@ dwarf2out_register_external_die (tree de
       break;
     case FUNCTION_DECL:
       die = new_die (DW_TAG_subprogram, parent, decl);
+      /* This helps debuggers to build a symbol table.  */
+      if (! flag_wpa && flag_incremental_link != INCREMENTAL_LINK_LTO)
+       add_linkage_name (die, decl);
       break;
     case VAR_DECL:
       die = new_die (DW_TAG_variable, parent, decl);
