Responses inline.

On Fri, Mar 19, 2021 at 9:59 PM David Blaikie <dblai...@gmail.com> wrote:
> On Fri, Mar 19, 2021 at 9:34 AM Samy Al Bahra <sba...@repnop.org> wrote:
>
> [...]
>
>> This is quite old (excuse the formatting) but numbers are here:
>> https://engineering.backtrace.io/2014-09-15-bt-lightweight-backtrace-tool/
>> , search for "Chromium". This is something other debuggers can take
>> advantage of if they run in a non-interactive / batch mode (think bulk
>> processing of millions - billions of dumps a month)
>>
>
> "This is something... " - what is "this" you're referring to there? Lazy
> loading? Yeah, for sure. Why do you restrict/suggest that a highly lazy
> approach would only be suitable for non-interactive/batch execution?
>

In "This is quite old", this = the blog post. In "This is something other
debuggers can take advantage of", this = lazy loading. Lazy loading is more
effective for automated analysis tools than for interactive debuggers, which
more often than not don't benefit from lazy evaluation if folks are expecting
auto-complete for types, variables, etc. Of course, it is still useful for
non-blocking loads of debug data, especially if you implement job
cancellation (allow commands to be executed concurrently while loading is
being completed).

[...]

>
>
>> I'm also happy to run benchmarks for you with and without .debug_aranges
>> on top of our debugger if it'll be useful.
>>
>
> Yeah, I'd certainly be curious if you have a chance! Though it may depend
> a bit on what your implementation does in the absence of .debug_aranges.
>

I'll get back to you on this shortly!

>
>
>> One of the crucial optimizations we made is incremental indexing on top
>> of .debug_aranges based on PC values
>>
>
> Could you explain that in more detail - and why that approach can't be
> used with CU ranges?
>

.debug_aranges is significantly smaller and faster to load than scanning all
of .debug_info.

>
>
>> (+ complexities Greg mentions later in the thread). In cases where we
>> lack this, we use our own persistent cache which introduces unnecessary
>> complexity.
>> Now I am considering going as far as adding a multi-threaded
>> indexer for cases where a persistent cache / build system modifications
>> aren't an option (work to begin in the next week or two).
>>
>> .debug_aranges would provide a lot of value to our users.
>>
>> On Thu, Mar 11, 2021 at 3:48 PM David Blaikie via Dwarf-Discuss <
>> dwarf-discuss@lists.dwarfstd.org> wrote:
>>
>>> On Thu, Mar 11, 2021 at 5:48 AM <paul.robin...@sony.com> wrote:
>>>
>>>> Hopefully not to side-track things too much... maybe wants its own
>>>> thread, if there's more to debate here.
>>>>
>>>
>>> Yeah, how about we spin it off into another thread (done here)
>>>
>>>
>>>> >> For the case you suggested where it would be useful to keep the range
>>>> >> list for the CU in the .o file, I think .debug_aranges is what you're
>>>> >> looking for.
>>>> >
>>>> > aranges has been off by default in LLVM for a while - it adds a lot of
>>>> > overhead (doesn't have all the nice rnglist encodings for instance -
>>>> > nor can it use debug_addr, and if it did it'd still be duplicate with
>>>> > the CU ranges wherever they were).
>>>>
>>>> Did you want to file an issue to improve how .debug_aranges works?
>>>>
>>>
>>> I don't currently understand the value it provides, and I at least don't
>>> have a use case for it, so I'm not sure I'd be the best person to
>>> advocate/drive that work.
>>>
>>>> Complaining that it duplicates CU ranges is missing the point, though;
>>>> it's an index, like .debug_names, of course it duplicates other info.
>>>> If you want to suggest an improved index, like we did with .debug_names,
>>>> that would be great too.
>>>>
>>>
>>> .debug_names is quite different though - it collects information from
>>> across the DIE tree - information that is expensive to otherwise gather
>>> (walking the whole DIE tree).
>>>
>>> .debug_aranges is not like that for most producers (producers that do
>>> include the address ranges on the CU DIE) - the data is readily available
>>> immediately on the CU. That does involve reading some of .debug_abbrev, and
>>> interpreting a handful of attributes - but at least for the use cases I'm
>>> aware of, that overhead isn't worth the size increase.
>>>
>>> Do you have numbers on the benefits of .debug_aranges compared to
>>> parsing the ranges from CU DIEs?
>>>
>>> (one possible issue: the CU doesn't /have/ to contain low/high/ranges if
>>> its children DIEs contain addresses - having that as a guarantee, or some
>>> preferred way of encoding zero length (high/low of 0 would be acceptable, I
>>> guess) would be nice & make it cheap to skip over CUs that don't have any
>>> address ranges)
>>>
>>> Roughly, a modern debug_aranges to me would look something like:
>>>
>>>   <length>
>>>   <version>
>>>   <CU sec_offset>
>>>   <addr_base>
>>>   <rnglist sec_offset>
>>>
>>> So it could fully re-use the rnglist encoding. If this was going to be
>>> as compact as possible, it'd need to be configurable which encodings it
>>> uses - ranges V high/low, addrx V addr - at which point it'd probably look
>>> like a small DIE with an inline abbrev (similar to the way DWARFv5 encodes
>>> the file and directory entries now, and how debug_names is self-describing)
>>> - at which point it looks to me a lot like parsing the CU DIEs.
>>>
>>> _______________________________________________
>>> Dwarf-Discuss mailing list
>>> Dwarf-Discuss@lists.dwarfstd.org
>>> http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
>>>
>>
>>
>> --
>> Samy Al Bahra [http://repnop.org]
>>
>

--
Samy Al Bahra [http://repnop.org]
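
To make the PC-based lookup discussed above a bit more concrete, here is a
minimal sketch in Python. It is only an illustration, not the debugger's
actual implementation: the class name LazyCUIndex, the load_cu callback and
the (low_pc, length, cu_offset) tuple shape are all hypothetical, the ranges
are assumed not to overlap, and no real .debug_aranges parsing is done. The
idea is that a small address table is enough to pick the CU for a PC, and
the expensive .debug_info work happens only for CUs that are actually hit.

```python
import bisect
from typing import Callable, Dict, List, Optional, Tuple

# (low_pc, length, cu_offset), the shape a .debug_aranges reader might yield.
# Hypothetical input: this sketch does not parse any ELF or DWARF section.
ARange = Tuple[int, int, int]


class LazyCUIndex:
    """Map a PC to its compilation unit, loading each CU only on first use."""

    def __init__(self, aranges: List[ARange], load_cu: Callable[[int], object]):
        # Sort by start address so lookups can binary-search.
        # Assumes the ranges do not overlap.
        self._ranges = sorted(aranges)
        self._starts = [low for low, _, _ in self._ranges]
        self._load_cu = load_cu  # hypothetical, stands in for a .debug_info parse
        self._loaded: Dict[int, object] = {}

    def cu_offset_for(self, pc: int) -> Optional[int]:
        # Find the rightmost range whose start address is <= pc.
        i = bisect.bisect_right(self._starts, pc) - 1
        if i < 0:
            return None
        low, length, cu_offset = self._ranges[i]
        # The range covers [low, low + length).
        return cu_offset if pc < low + length else None

    def cu_for(self, pc: int) -> Optional[object]:
        offset = self.cu_offset_for(pc)
        if offset is None:
            return None
        # Parse the CU the first time a PC lands in it, then reuse the result.
        if offset not in self._loaded:
            self._loaded[offset] = self._load_cu(offset)
        return self._loaded[offset]


if __name__ == "__main__":
    index = LazyCUIndex(
        [(0x1000, 0x200, 0x0), (0x2000, 0x80, 0x4A), (0x1200, 0x100, 0x4A)],
        load_cu=lambda off: "CU at offset %#x" % off,
    )
    print(index.cu_for(0x2010))
    print(index.cu_for(0x3000))
```

In a batch tool that only needs the CUs touched by a handful of frames, most
CUs are never parsed at all, which is where the laziness pays off. The same
table could in principle be built from CU DIE ranges instead, at the cost of
reading .debug_abbrev and the CU headers first.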