Hi Milian,

On Mon, 2020-06-22 at 10:29 +0200, Milian Wolff wrote:
> On Montag, 15. Juni 2020 18:54:41 CEST Josh Stone wrote:
> > On 6/13/20 10:40 AM, Milian Wolff wrote:
> > > Has anyone an idea on how to post-process the DWARF data to optimize
> > > the lookup of inlined frames?
> >
> > SystemTap implements its own cache for repeated lookups -- see
> > dwflpp::get_die_parents().
>
> Thanks, I've come up with something similar over the weekend before reading
> your mail. The performance boost is huge (5x and more).
>
> Looking at your code, I think that I'm not yet handling a few corner cases
> (such as imported units). That, paired with the fact that at least three
> users of this API have apparently by now come up with a similar solution
> clearly makes a case for upstreaming this into a common API.

Yes, I think having an elfutils/libdw API for this would be very useful.
And it would also be useful for eu-addr2line and eu-stack when looking
up inlined functions.

The imported (partial) units are a little tricky because they cross CUs
(and sometimes even Dwarfs, for example when dealing with
dwz/multi-files). It also means that a Die inside the partial unit can
have multiple parents, because they might have been imported through
different imports. But I guess that if we associate a parent cache with
one CU, then this is clean (unless the CU imports the same partial unit
multiple times...).

> I believe that there is a lot of data that potentially needs to be cached.
> Additionally, doing it behind the scenes may raise questions regarding multi
> threaded usage of the API (see my other mail relating to that).
>
> Which means: an explicit API to opt-in to this behavior is safest and best I
> believe. Maybe something as simple as the following would be sufficient?
>
> ```
> /* Cache parent DIEs chain to speed up repeated dwarf_getscopes calls.
>
>    Returns -1 for errors or 0 if the parent chain was cached already.  */
> extern int dwarf_cache_parent_dies(Dwarf_Die *cudie);
> ```
>
> Alternatively a function that returns the cache could be considered, which
> would then require new versions of dwarf_getscopes* that take the cache as an
> argument.

I think an API that makes the "caching" explicit might be best. Maybe we
can call it a DieTree? We would then have a function to create a DieTree
and (new) functions that take a DieTree (and a Dwarf_Die) to operate on
it. The user can then also destroy the DieTree again when done.

Cheers,

Mark
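
P.S. To make the create/query/destroy shape concrete, here is a rough,
self-contained sketch. It is only an illustration under assumptions:
ToyDie is a stand-in for Dwarf_Die, and die_tree_create, die_tree_scopes
and die_tree_destroy are hypothetical names, not existing libdw
functions. It also assumes DIE offsets increase in pre-order within one
CU, and it ignores imported units entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a DIE (hypothetical; real code would use Dwarf_Die).  */
typedef struct ToyDie {
  size_t offset;                 /* unique id, like a DIE offset */
  const struct ToyDie *child;    /* first child, or NULL */
  const struct ToyDie *sibling;  /* next sibling, or NULL */
} ToyDie;

typedef struct { const ToyDie *die; const ToyDie *parent; } DieTreeEntry;

/* Explicit, user-owned parent cache for one CU (hypothetical type).  */
typedef struct { DieTreeEntry *entries; size_t count, cap; } DieTree;

void die_tree_destroy(DieTree *t)
{
  if (t != NULL) { free(t->entries); free(t); }
}

static int add_entry(DieTree *t, const ToyDie *die, const ToyDie *parent)
{
  if (t->count == t->cap) {
    size_t cap = t->cap ? t->cap * 2 : 16;
    DieTreeEntry *e = realloc(t->entries, cap * sizeof *e);
    if (e == NULL) return -1;
    t->entries = e;
    t->cap = cap;
  }
  /* Entries must stay sorted by offset so lookups can binary search.  */
  assert(t->count == 0 || t->entries[t->count - 1].die->offset < die->offset);
  t->entries[t->count].die = die;
  t->entries[t->count].parent = parent;
  t->count++;
  return 0;
}

/* Pre-order walk over a sibling list, recording each DIE's parent.  */
static int walk(DieTree *t, const ToyDie *die, const ToyDie *parent)
{
  for (; die != NULL; die = die->sibling)
    if (add_entry(t, die, parent) != 0 || walk(t, die->child, die) != 0)
      return -1;
  return 0;
}

/* Build the cache with one full traversal.  Returns NULL on failure.  */
DieTree *die_tree_create(const ToyDie *cudie)
{
  DieTree *t = calloc(1, sizeof *t);
  if (t == NULL) return NULL;
  if (add_entry(t, cudie, NULL) != 0 || walk(t, cudie->child, cudie) != 0) {
    die_tree_destroy(t);
    return NULL;
  }
  return t;
}

static const DieTreeEntry *find(const DieTree *t, size_t offset)
{
  size_t lo = 0, hi = t->count;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    size_t cur = t->entries[mid].die->offset;
    if (cur == offset) return &t->entries[mid];
    if (cur < offset) lo = mid + 1; else hi = mid;
  }
  return NULL;
}

/* Fill SCOPES with DIE and its ancestors, innermost first.  Returns the
   number of scopes stored (at most MAX), or -1 if DIE is not in the tree.  */
int die_tree_scopes(const DieTree *t, const ToyDie *die,
                    const ToyDie **scopes, int max)
{
  const DieTreeEntry *e = find(t, die->offset);
  int n = 0;
  if (e == NULL) return -1;
  while (e != NULL && n < max) {
    scopes[n++] = e->die;
    e = e->parent != NULL ? find(t, e->parent->offset) : NULL;
  }
  return n;
}
```

With real DWARF data the walk would presumably go through dwarf_child
and dwarf_siblingof and key the entries on dwarf_dieoffset. A Die
reached through an imported partial unit would need extra handling,
since it can have more than one parent.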