On Thu, Mar 20, 2025, at 10:28 AM, Serhei Makarov wrote:
> On Tue, Dec 10, 2024, at 4:42 PM, Serhei Makarov wrote:
>> This email sketches an 'unwinder cache' interface for libdwfl, derived 
>> from recent eu-stacktrace code [1] and based on Christian Hergert's 
>> adaptation of eu-stacktrace as sysprof-live-unwinder [2]. The intent is 
>> to remove the need for a lot of boilerplate code that would be 
>> identical among profiling tools that use libdwfl to unwind perf_events 
>> stack samples. But since this becomes a library design, it's in need of 
>> feedback and bikeshedding.
> In advance of finishing up the Dwfl_Process_Tracker patchset
> (initial version currently under review),
> wanted to post an update summarizing the current api design.

I redid the performance analysis based on the released code. I'm seeing a
reduction of sysprof-live-unwinder overhead from 7~8% to 3~6% (with the
framepointer version of sysprof providing a baseline of about 1.5%). There is
some variance (apparently, the lower the overhead gets, the harder it is to
keep conditions identical), but the performance is improving; the next
bottleneck to look at is dwfl_linux_proc_report. I may want to automate the
performance testing procedure to get numbers precise enough that I'm
comfortable reporting them.

There is an issue where sysprofd exits spontaneously (cleanly, with a "Stopping
RAPL monitor" message) that I'm still trying to understand. My hypotheses: it
may be a polkit issue, or sysprof-live-unwinder may not be handling some error
result gracefully. (The eu-stacktrace tool + prototype sysprof patches don't
exhibit this behaviour.) I expect I'll need to make another revision of the
sysprof-live-unwinder patches at
https://git.sr.ht/~serhei/sysprof-experiments/log/serhei/live-unwinder

With the Elf caching in Dwfl_Process_Tracker, I counted (via the attached
simple patch) how many times an Elf struct was retrieved from the cache vs. how
many times one was newly created. On a quick test with a swaywm system, the
'created' number stabilized at ~186 Elf structs, with the 'cached' number at
~400 and rising as I kept running the profiler. On gnome3, 'created' stabilizes
at ~282 structs, with the 'cached' number at ~500 and rising. Obviously, the
resulting hit ratio is not super meaningful, as it can be made arbitrarily good
by running the profiler for longer periods of time :p but it's worth verifying
that the caching works.
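
For anyone who wants to repeat the count: the debug patch uses plain static
ints, and at least the increment in cache_elf happens after elftab_lock is
released, so the numbers may be slightly off with a multithreaded caller. A
minimal sketch of the same bookkeeping with C11 atomics follows. The names
cache_stats, cache_stats_record, cache_stats_hit_ratio and cache_stats_report
are hypothetical and not part of libdwfl; this is an illustration, not what the
patchset does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical helper, not part of libdwfl: counts how often an Elf was
   newly created (cache miss) vs. retrieved from the cache (cache hit).  */
struct cache_stats
{
  atomic_ulong created;	/* Misses: a new Elf struct had to be created.  */
  atomic_ulong cached;	/* Hits: an existing Elf struct was reused.  */
};

/* Record one lookup.  Relaxed ordering is enough because the counters
   are only used for statistics, never for synchronization.  */
static inline void
cache_stats_record (struct cache_stats *s, int hit)
{
  assert (s != NULL);
  atomic_fetch_add_explicit (hit ? &s->cached : &s->created, 1,
			     memory_order_relaxed);
}

/* Fraction of lookups served from the cache; 0.0 if nothing was recorded.  */
static inline double
cache_stats_hit_ratio (struct cache_stats *s)
{
  unsigned long created
    = atomic_load_explicit (&s->created, memory_order_relaxed);
  unsigned long cached
    = atomic_load_explicit (&s->cached, memory_order_relaxed);
  unsigned long total = created + cached;
  return total == 0 ? 0.0 : (double) cached / (double) total;
}

/* Print a one-line summary in the spirit of the debug patch.  */
static inline void
cache_stats_report (struct cache_stats *s, FILE *out)
{
  fprintf (out, "%lu created / %lu cached (hit ratio %.2f)\n",
	   atomic_load_explicit (&s->created, memory_order_relaxed),
	   atomic_load_explicit (&s->cached, memory_order_relaxed),
	   cache_stats_hit_ratio (s));
}
```

A caller would invoke cache_stats_record (&stats, 1) where the patch increments
cached_elf and cache_stats_record (&stats, 0) where it increments created_elf.
As noted above, the ratio only grows with run time, so the 'created' count
levelling off is the more interesting signal.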

All the best,
     Serhei
diff --git a/libdwfl/dwfl_process_tracker_find_elf.c b/libdwfl/dwfl_process_tracker_find_elf.c
index 72621bb1..d966346f 100644
--- a/libdwfl/dwfl_process_tracker_find_elf.c
+++ b/libdwfl/dwfl_process_tracker_find_elf.c
@@ -38,6 +38,9 @@
 
 #include "libdwflP.h"
 
+static int created_elf = 0;
+static int cached_elf = 0;
+
 /* TODO: Consider making this a public api, dwfl_process_tracker_find_cached_elf. */
 bool
 find_cached_elf (Dwfl_Process_Tracker *tracker,
@@ -68,6 +71,8 @@ find_cached_elf (Dwfl_Process_Tracker *tracker,
   *elfp = ent->elf;
   *file_name = strdup(ent->module_name);
   *fdp = ent->fd;
+  cached_elf ++;
+  fprintf(stderr, "= dwfl_process_tracker_find_elf retrieves CACHED (%d created / %d cached) name=%s fd=%d elfp=%p ref_count=%d\n", created_elf, cached_elf, ent->module_name, ent->fd, ent->elf, ent->elf->ref_count); /* DEBUG */
   return true;
 }
 
@@ -118,6 +123,8 @@ cache_elf (Dwfl_Process_Tracker *tracker,
       ent->last_mtime = sb.st_mtime;
     }
   rwlock_unlock(tracker->elftab_lock);
+  created_elf ++;
+  fprintf(stderr, "+ dwfl_process_tracker_find_elf CREATES (%d created / %d cached) new name=%s file_name=%s fd=%d elfp=%p ref_count=%d\n", created_elf, cached_elf, ent->module_name, file_name, fd, elf, elf ? elf->ref_count : 0); /* DEBUG */
   return true;
 }
 
