On Thu, Mar 11, 2021 at 4:29 PM Greg Clayton <clayb...@gmail.com> wrote:
> On Mar 11, 2021, at 1:12 PM, Paul Robinson via Dwarf-Discuss <dwarf-discuss@lists.dwarfstd.org> wrote:
>
> Tom Russell could perhaps speak to this better, but my understanding is
> that our debugger guys like having .debug_aranges, because parsing the CU
> DIE does take that extra effort. I am unfamiliar with their code so I have
> to take their word on it. But I can certainly imagine that probing
> hundreds to thousands of CUs in order to collect range information with
> lengthy range lists would be more expensive than running through a
> comparatively compact .debug_aranges list. If Tom tells me I’m wrong,
> well, wouldn’t be the first time.
>
> We will use them if they are there, but one interesting issue that we ran
> into with LLDB is some compile units might be in .debug_aranges because the
> compiler made a .debug_aranges section in the .o file, but others might
> not. So we had to add code to LLDB to figure out which compile units have
> any entries in the .debug_aranges section, and read the DW_AT_ranges from
> the DW_TAG_compile_unit if it exists, and if it doesn't, manually index the
> DWARF to create one on the fly each time.
>
> One thing we have encountered (see issue 210113.1) is that when we’ve done
> dead-stripping, .debug_aranges entries (one per function, typically,
> because -ffunction-sections) can end up pointing to nothing. In our
> proprietary linker I believe we compress/rewrite .debug_aranges to minimize
> the number of entries, which by coincidence ends up producing a conforming
> aranges list; LLD doesn’t do that, which means it produces a non-conforming
> list (with zero-length entries), hence the issue.
>
> I’ll have to think about what a “modern” .debug_aranges might want to look
> like.
>
> A big issue with any of the DWARF sections is we are subject to making the
> contents work with linkers that just want to concatenate + relocate.
> This often leads to information being kept around when dead stripping occurs
> because anything that is dead stripped will just have its address zero'ed
> out or -1'ed out, but this bogus info is still in the data.

Yeah, we talked some last year about formalizing this more into the -1
tombstone - I thought maybe Paul had proposed that for standardization,
though at a glance I don't see the proposal. It's probably somewhere there.

> If we don't need a format that can simply be concatenated and relocated,
> the GSYM format, which is open sourced in llvm.org already, might be good
> inspiration for a .debug_aranges successor section that has very efficient
> lookups. The GSYM format could actually be used as is by adding only a new
> DIE offset InfoType.
>
> Besides ".debug_names", all other DWARF accelerator tables are really just
> random indexes that must be linearly scanned or pre-indexed prior to being
> used because of the concatenate + relocate style that is used for these
> DWARF sections. It would be great if any future accelerator tables are "map
> into memory and use as is" kind of tables like ".debug_names" and the
> ".apple_XXX" name accelerator tables.

Ah, fair point - could come up with a rather different structure if it were
designed for fast on-disk query (though then, like .debug_names (which I
don't think we have any linkers that can link today, for instance), you'd
probably /really/ want it to be linked in a content-aware manner, because
probing separate lookup tables (even if they're more designed for that)
per-CU probably doesn't gain you a lot).
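
To make the "concatenate + relocate" versus "map and use as is" contrast concrete, here is a minimal sketch, not taken from LLDB, LLD, or GSYM. It assumes per-CU entries have already been read as (start, length, CU offset) tuples, drops zero-length and tombstoned entries, and builds a sorted table that can be probed by binary search. The names `TOMBSTONE_64`, `build_index`, and `lookup` are hypothetical, and treating address 0 as dead is an assumption that only makes sense for a linked image on a target where 0 is not a valid code address.

```python
import bisect

# Assumed 64-bit "-1" tombstone value for dead-stripped addresses.
TOMBSTONE_64 = 0xFFFFFFFFFFFFFFFF


def build_index(entries, dead_starts=(0, TOMBSTONE_64)):
    """Pre-index concatenated (start, length, cu_offset) entries.

    Zero-length entries and entries whose start is a tombstone are
    dropped; the rest are sorted by start address.
    """
    live = [
        (start, start + length, cu_offset)
        for start, length, cu_offset in entries
        if length != 0 and start not in dead_starts
    ]
    live.sort()
    starts = [start for start, _, _ in live]
    return starts, live


def lookup(index, addr):
    """Return the CU offset covering addr, or None if nothing covers it."""
    starts, live = index
    # Find the last range whose start is <= addr.
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0:
        start, end, cu_offset = live[i]
        if start <= addr < end:
            return cu_offset
    return None
```

The sort is the "pre-indexed prior to being used" cost described above; a table emitted already sorted by a content-aware linker would let a consumer skip `build_index` and search the mapped data directly.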
- Dave

>
> Thanks,
> --paulr
>
> *From:* David Blaikie <dblai...@gmail.com>
> *Sent:* Thursday, March 11, 2021 3:48 PM
> *To:* Robinson, Paul <paul.robin...@sony.com>
> *Cc:* Cary Coutant <ccout...@gmail.com>; DWARF Discuss <dwarf-discuss@lists.dwarfstd.org>
> *Subject:* debug_aranges use and overhead
>
> On Thu, Mar 11, 2021 at 5:48 AM <paul.robin...@sony.com> wrote:
>
> Hopefully not to side-track things too much... maybe wants its own
> thread, if there's more to debate here.
>
> Yeah, how about we spin it off into another thread (done here)
>
> >> For the case you suggested where it would be useful to keep the range
> >> list for the CU in the .o file, I think .debug_aranges is what you're
> >> looking for.
> >
> > aranges has been off by default in LLVM for a while - it adds a lot of
> > overhead (doesn't have all the nice rnglist encodings for instance -
> > nor can it use debug_addr, and if it did it'd still be duplicate with
> > the CU ranges wherever they were).
>
> Did you want to file an issue to improve how .debug_aranges works?
>
> I don't currently understand the value it provides, and I at least don't
> have a use case for it, so I'm not sure I'd be the best person to
> advocate/drive that work.
>
> Complaining that it duplicates CU ranges is missing the point, though;
> it's an index, like .debug_names, of course it duplicates other info.
> If you want to suggest an improved index, like we did with .debug_names,
> that would be great too.
>
> .debug_names is quite different though - it collects information from
> across the DIE tree - information that is expensive to otherwise gather
> (walking the whole DIE tree).
>
> .debug_aranges is not like that for most producers (producers that do
> include the address ranges on the CU DIE) - the data is readily available
> immediately on the CU.
> That does involve reading some of .debug_abbrev, and
> interpreting a handful of attributes - but at least for the use cases I'm
> aware of, that overhead isn't worth the size increase.
>
> Do you have numbers on the benefits of .debug_aranges compared to parsing
> the ranges from CU DIEs?
>
> (one possible issue: the CU doesn't /have/ to contain low/high/ranges if
> its children DIEs contain addresses - having that as a guarantee, or some
> preferred way of encoding zero length (high/low of 0 would be acceptable, I
> guess) would be nice & make it cheap to skip over CUs that don't have any
> address ranges)
>
> Roughly, a modern debug_aranges to me would look something like:
>
> <length>
> <version>
> <CU sec_offset>
> <addr_base>
> <rnglist sec_offset>
>
> So it could fully re-use the rnglist encoding. If this was going to be as
> compact as possible, it'd need to be configurable which encodings it uses -
> ranges vs. high/low, addrx vs. addr - at which point it'd probably look like a
> small DIE with an inline abbrev (similar to the way DWARFv5 encodes the
> file and directory entries now, and how debug_names is self-describing) -
> at which point it looks to me a lot like parsing the CU DIEs.
>
> _______________________________________________
> Dwarf-Discuss mailing list
> Dwarf-Discuss@lists.dwarfstd.org
> http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
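
The five-field layout floated in the quoted message (length, version, CU sec_offset, addr_base, rnglist sec_offset) can be sketched as a fixed-size record. This is only an illustration of that rough idea, not a proposed or standardized encoding: the field widths (32-bit DWARF offsets, 16-bit version), little-endian byte order, the convention that the length excludes its own four bytes, and the names `RangeIndexEntry`, `encode`, and `decode` are all assumptions made for the example.

```python
import struct
from typing import NamedTuple

# Assumed layout: u32 length, u16 version, then three u32 section offsets.
# "<" selects little-endian with no padding, so the record is 18 bytes.
_RECORD = struct.Struct("<IHIII")


class RangeIndexEntry(NamedTuple):
    version: int
    cu_offset: int        # offset of the CU in .debug_info
    addr_base: int        # base into .debug_addr for addrx-style encodings
    rnglist_offset: int   # offset of the CU's range list in .debug_rnglists


def encode(entry: RangeIndexEntry) -> bytes:
    """Serialize one entry; length counts the bytes after the length field."""
    length = _RECORD.size - 4
    return _RECORD.pack(length, *entry)


def decode(data: bytes, offset: int = 0):
    """Parse one entry at offset; return (entry, offset of the next entry)."""
    length, version, cu_offset, addr_base, rnglist_offset = _RECORD.unpack_from(
        data, offset
    )
    entry = RangeIndexEntry(version, cu_offset, addr_base, rnglist_offset)
    return entry, offset + 4 + length
```

A reader still has to follow `rnglist_offset` and decode the range list itself, which is the point made in the message: once the encodings become configurable, the work starts to resemble parsing the CU DIE.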