Re: [debug-early] LTO streaming of on-the-side dwarf data structures

Aldy Hernandez Tue, 14 Oct 2014 13:08:26 -0700

On 10/14/14 06:21, Richard Biener wrote:

On Tue, Oct 14, 2014 at 2:48 AM, Aldy Hernandez <al...@redhat.com> wrote:

Gentlemen, your feedback would be greatly appreciated!


I was investigating why locals were not being early dumped, and realized
Michael's patch was skipping decls_for_scope() unless
DECL_STRUCT_FUNCTION->gimple_df was set.  I assume this was to wait until
location information was available.  That design caused locals to be dumped
LATE, which defeats the whole purpose of this exercise.

Since we want the local DECL DIEs to be generated early as well, we'd want
the location information to be amended in the second dwarf pass. This got me
thinking about deferred_locations, and all these on-the-side data structures
that a second dwarf pass would depend on.  Unless I'm misunderstanding
something, we need a plan...

Basically, any early collected data that dwarf2out_finish() and
dwarf2out_function_decl() need, would need to be LTO streamed out after
early dwarf generation and streamed back before the second dwarf pass. For
instance, I see at least the following that need to be streamed in/out:

         file_table
         deferred_locations
         limbo_die_list (and deferred_asm_name)
         decl_die_table
         pubname_table
         pubtype_table


I think that all but decl_die_table should not be needed (which may
need implementation changes in dwarf2out.c of course).  Maybe you
can explain why you think they are needed late.

I see what you mean. With some minor surgery I was able to remove all references to deferred_locations. I assume limbo_die_list and deferred_asm_name can be submitted to similar surgery.

How about file_table? In dwarf2out_finish() we need file_table to emit DW_AT_comp_dir if no relative file names are used. I suppose we could determine that information by traversing the DIE table and scanning all DW_AT_decl_file, albeit slower. Would this be acceptable?
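To make the traversal idea concrete, here is a minimal sketch of scanning a DIE tree for relative file names instead of consulting file_table. Everything in it is an assumption for illustration: struct toy_file_die and tree_uses_relative_file() are hypothetical stand-ins, not GCC's die_struct or any dwarf2out.c function, the decl_file string only models what DW_AT_decl_file refers to, and the leading '/' test is a POSIX-only simplification of an absolute-path check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a DIE (not GCC's die_struct).  */
struct toy_file_die {
  const char *decl_file;          /* models DW_AT_decl_file; NULL if absent */
  struct toy_file_die *child;     /* first child, or NULL */
  struct toy_file_die *sibling;   /* next sibling, or NULL */
};

/* Return true if DIE, any of its following siblings, or any of their
   descendants names a file with a relative path.  */
static bool
tree_uses_relative_file (const struct toy_file_die *die)
{
  for (; die != NULL; die = die->sibling)
    {
      /* An empty file name is treated as relative here.  */
      if (die->decl_file != NULL && die->decl_file[0] != '/')
        return true;
      if (tree_uses_relative_file (die->child))
        return true;
    }
  return false;
}
```

The walk visits every DIE once, so the cost is linear in the size of the DIE tree, which is the "albeit slower" trade-off compared with walking the much smaller file table.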

How about pubname_table (and pubtype_table??)? It looks like we need a list of all publicly accessible names, but output_pubnames() ends up writing directly to the assembly file, and this can only happen at the very end of dwarf generation. I suppose we could also traverse the DIE table and pick publicly accessible names (direct children of DW_TAG_compile_unit DIEs, and/or some other static/extern flag in the DIE?). Am I missing something?
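The "pick public names late" alternative could look roughly like the sketch below. It is only an illustration under stated assumptions: struct toy_pub_die and collect_pubnames() are hypothetical, the external flag models something like DW_AT_external, and only direct children of the compile unit DIE are considered, as suggested above. It does not reflect how output_pubnames() actually works.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a DIE (not GCC's die_struct).  */
struct toy_pub_die {
  const char *name;               /* models DW_AT_name; NULL if absent */
  bool external;                  /* models an "externally visible" flag */
  struct toy_pub_die *child;      /* first child, or NULL */
  struct toy_pub_die *sibling;    /* next sibling, or NULL */
};

/* Store the names of the externally visible direct children of CU in OUT
   (at most CAP entries) and return how many were stored.  */
static size_t
collect_pubnames (const struct toy_pub_die *cu, const char **out, size_t cap)
{
  size_t n = 0;

  assert (cu != NULL);
  for (const struct toy_pub_die *d = cu->child; d != NULL; d = d->sibling)
    if (d->external && d->name != NULL && n < cap)
      out[n++] = d->name;
  return n;
}
```

With a pass like this, no pubname_table would have to survive between the early and late dwarf passes; the list is recomputed from the DIEs right before it is written out.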

For decl_die_table the idea was to be able to create references to
the early output DIEs via decl->die_offset (to be added and LTO streamed)
and the translation unit decls symbol of the dwarf tree root.

How so? Do you mean by storing the DECL's DECL_UID in the corresponding die_offset, since die_offset will be zero (and unused) after early dwarf dumping? If so, that's kinda neat. We could recreate the hash from that.

Another similar issue I've seen is handling DW_TAG_lexical_block (gen_lexical_block_die). Ideally we should generate the DW_TAG_lexical_block and the corresponding locals in early dumping, and then fill in the high/low attributes of the lexical block the second time around. We would need a hash similar to decl_die_table to get from BLOCK->DW_TAG_lexical_block, similar to die_table_offset. For that matter, we could store the relationship in die_table_offset, or in the die_offset if I understood things correctly.
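As a rough illustration of the two-pass scheme for lexical blocks, the sketch below creates a block DIE early without a PC range, remembers it in a side table keyed by a block id, and fills in the low/high values on the second pass. All names here (struct toy_block_die, block_die_table, early_register_block, late_set_block_range) are hypothetical, and a fixed-size array indexed by a small integer stands in for the BLOCK->DIE hash; nothing here is GCC's actual interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BLOCKS 16

/* Hypothetical stand-in for a DW_TAG_lexical_block DIE.  */
struct toy_block_die {
  bool has_pc_range;    /* false until the late pass fills the range in */
  uint64_t low_pc;      /* models DW_AT_low_pc */
  uint64_t high_pc;     /* models DW_AT_high_pc */
};

/* Side table from block id to DIE, playing the role of the hash.  */
static struct toy_block_die *block_die_table[MAX_BLOCKS];

/* Early pass: record the DIE for BLOCK_ID with no location info yet.  */
static void
early_register_block (unsigned block_id, struct toy_block_die *die)
{
  assert (block_id < MAX_BLOCKS && die != NULL);
  die->has_pc_range = false;
  die->low_pc = 0;
  die->high_pc = 0;
  block_die_table[block_id] = die;
}

/* Late pass: look the DIE up again and add its PC range.  Returns false
   if no DIE was registered for BLOCK_ID during the early pass.  */
static bool
late_set_block_range (unsigned block_id, uint64_t low, uint64_t high)
{
  if (block_id >= MAX_BLOCKS || block_die_table[block_id] == NULL)
    return false;

  struct toy_block_die *die = block_die_table[block_id];
  die->low_pc = low;
  die->high_pc = high;
  die->has_pc_range = true;
  return true;
}
```

The open question for LTO is what plays the role of block_id once the in-memory table is gone between the passes, which is where reusing die_offset (or a similar field) would come in.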

We could either stream the hash tables and/or data structures above (and
merge them from different compilation units upon stream-in), or we could
come up with some way of annotating existing dwarf (to be read/merged back
in and annotated).  For instance, deferred_locations, decl_die_table, and
limbo_die_lists need to associate a DIE with a TREE.  We could tag the DIE
with an artificial DW_AT_* that has some byte representation of the TREE and
recreate the hash at LTO stream-in time.  For other data structures (perhaps
file_table and pub*_table), perhaps we could come up with yet another way of
representing the data in Dwarf.


Why do we end up with any deferred stuff after early dwarf?  Similarly
nothing should be on the limbo list (instead we should properly construct
early dwarf!).

However...I don't know if this is worth the trouble, or if we should just
stream the individual hash tables and data structures, and not bother with
this dwarf gymnastics.

Did anybody have a plan for this?  Am I misunderstanding something, or do we
need to stream a lot of these on-the-side structures?


I indeed hope we don't need to stream all this, but it may need more
"structured" generation of early dwarf (so we always have an origin
and thus do not need the limbo list for example).


What's this "so we always have an origin..." bit?  I'm not following.

Thanks for the great insight. The code looks much cleaner now without deferred_locations (and soon with many more deletions :)).


Aldy
