Initial response from what I know; I still need to double-check some of these answers.

On Wed, Sep 3, 2025, at 9:51 AM, Mark Wielaard wrote:
> Could you explain again why we have the first set of arguments?
>   Dwfl *dwfl, Elf *elf, pid_t pid, pid_t tid
> Some of this seems a little repetitive. Ideally I think you would just
> provide a Dwflst_Process_Tracker and a tid. Assuming you can easily get
> the pid from a tid (maybe not?). At least giving both Dwfl and pid
> seems redundant. Also which Elf would you provide? A Dwfl can contain
> multiple Dwfl_Modules each with their own associated Elf.

I'm not sure if these args should be reduced. I do know that pid is
wanted by dwfl_attach_state whereas tid is wanted by
dwfl_getthread_frames. Pretty sure pid is not derivable from tid, so we
end up wanting both.

For now the questions reduce to:

* Can we drop pid and derive it from dwfl? Pretty sure no -- the Dwfl
  would not have that info pre-dwfl_attach_state, surely?
* Does this make sense for Dwfl with multiple Dwfl_Module/Elf? Has
  worked perfectly well in my testing, but I'm willing to pretzel my
  brain to come up with the explanation why.

What I really need to think about, I guess, is whether the design with
dwflst_sample_getframes calling dwfl_attach_state() dynamically can be
improved on, since it's quite different from the ordinary Dwfl workflow.

> Side note, there is dwfl_getthreads, but I assume that isn't how you
> iterate through threads with an Dwflst_Process_Tracker derived Dwfl?

Correct, we're dealing with one sample at a time, and the sample
corresponds to one thread at a time. So there isn't really a need to use
dwfl_getthreads() iteration at any point.

> Then you get the stack (and size) and registers (as Dwarf_Words and
> number of regs). So we just assume this is provided by some external
> mechanism.
>
> And finally the n_regs_mapping which for each DWARF register number
> maps it to an index in the regs array. So that the callback can get a
> register value from the given Dwfl_Frame.
>
> I think this works, but do note that some arches have gaps in the DWARF
> register numbers and so might have > 100 nregs.

I think it is not a problem: regs_mapping is meant to map from the small
contiguous register sample to the (large, noncontiguous) DWARF register
numbers, so it is a compact array of potentially large DWARF register
numbers, rather than a large sparse array of small numbers.

I do need to make ebl_set_initial_registers_sample work efficiently for
the architectures that have a large/noncontiguous DWARF layout, but the
ebl/backends setup makes this trivial to handle.

> Should the regmapping maybe be stored or set with the Dwfl or
> Dwflst_Process_Tracker once?

That's an option. The current design allows handling a stream of data
with a mix of different register files, which is a fairly theoretical
need as of now.
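
To make the regs_mapping direction concrete, here is a minimal sketch of
the layout described above, where regs_mapping[i] holds the DWARF
register number for regs[i]. The helper name scatter_sample_regs, the
int element type for the mapping, and the flat dwarf_regs /
dwarf_regs_set arrays are all hypothetical and only for illustration;
this is not the elfutils API and not necessarily how
ebl_set_initial_registers_sample is implemented. uint64_t stands in for
Dwarf_Word so the snippet builds without elfutils headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only (not part of elfutils).

   regs          compact sample: n_regs register values
   regs_mapping  compact array: regs_mapping[i] is the DWARF register
                 number that regs[i] belongs to (may be large, with gaps)
   dwarf_regs    destination indexed by DWARF register number
   dwarf_regs_set  marks which DWARF registers received a value

   Returns false if a mapped DWARF register number does not fit in the
   destination; entries before the offending one have already been
   copied at that point.  */
static bool
scatter_sample_regs (const uint64_t *regs, const int *regs_mapping,
                     size_t n_regs, uint64_t *dwarf_regs,
                     bool *dwarf_regs_set, size_t n_dwarf_regs)
{
  assert (regs != NULL && regs_mapping != NULL);
  assert (dwarf_regs != NULL && dwarf_regs_set != NULL);

  /* The loop is bounded by the size of the sample, not by the largest
     DWARF register number, so gaps in the numbering cost nothing.  */
  for (size_t i = 0; i < n_regs; i++)
    {
      int regno = regs_mapping[i];
      if (regno < 0 || (size_t) regno >= n_dwarf_regs)
        return false;
      dwarf_regs[regno] = regs[i];
      dwarf_regs_set[regno] = true;
    }
  return true;
}
```

With a three-entry sample and a mapping such as { 7, 6, 16 }, only
three slots of the destination are touched, however sparse the DWARF
numbering is; the work scales with n_regs rather than with the highest
register number.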