Hi Aaron, On Mon, 2023-12-11 at 18:18 -0500, Aaron Merey wrote: > No longer use .debug_aranges to build the aranges list since it could be > absent or incomplete. > > Instead build the aranges list by iterating over each CU and recording > each address range. > > https://sourceware.org/bugzilla/show_bug.cgi?id=22288 > https://sourceware.org/bugzilla/show_bug.cgi?id=30948 > > Signed-off-by: Aaron Merey <ame...@redhat.com> > --- > > v2 adds a test for generating aranges from a binary with no > .debug_aranges. > > This patch's method of building the aranges list is slower than simply > reading .debug_aranges. On my machine, running eu-stack on a 2.9G > firefox core file takes about 8.7 seconds with this patch applied, > compared to about 3.3 seconds without this patch.
That is significant. 2.5 times slower. Did you check with perf or some other profiler where exactly the extra time goes. Does the new method find more aranges (and so produces "better" backtraces)? > Ideally we could assume that .debug_aranges is complete if it is present > and build the aranges list via CU iteration only when .debug_aranges > is absent. This would let us save time on gcc-compiled binaries, which > include complete .debug_aranges by default. Right. This why the question is if the firefox case sees more/less aranges. If I remember correctly it is build with gcc and rustc, and rustc might not produce .debug_aranges. > However the DWARF spec appears to permit partially complete > .debug_aranges [1]. We could improve performance by starting with a > potentially incomplete list built from .debug_aranges. If a lookup > fails then search the CUs for missing aranges and add to the list > when found. > > This approach would complicate the dwarf_get_aranges interface. The > list it initially provides could no longer be assumed to be complete. > The number of elements in the list could change during calls to > dwarf_getarange{info, _addr}. This would invalidate the naranges value > set by dwarf_getaranges. The current API doesn't include a way to > communicate to the caller when narages changes and by how much. > > Due to these complications I think it's better to simply ignore > .debug_aranges altogether and build the aranges table via CU iteration, > as is done in this patch. Might it be an idea to leave dwarf_getaranges as it is and introduce a new (internal) function to get "dynamic" ranges? It looks like what programs (like eu-stack and eu-addr2line) really use is dwarf_addrdie and dwfl_module_addrdie. These are currently build on dwarf_getaranges, but could maybe use a new interface? > [1] https://sourceware.org/bugzilla/show_bug.cgi?id=22288#c5 So this comment says that "parsing CUs lightly (just enough to get their CU ranges) should be fairly cheap", but it seems that isn't really true. Or at least parsing .debug_aranges is a lot (2.5 times) faster (measured in seconds). It would be good to better understand why this is. Cheers, Mark