[Lldb-commits] [lldb] Add a python JIT loader class. (PR #142514)

Greg Clayton via lldb-commits Tue, 17 Jun 2025 13:57:14 -0700

clayborg wrote:

> It also seems architecturally wrong to try to guess and influence what 
> BreakpointResolvers do behind their backs. After all, the resolver might be 
> just some Python Code you know nothing about. How would you instrument that? 
> If I set a regular expression name breakpoint, will you know to compare that 
> regex against what the JIT produces? What about source regular expression 
> breakpoints? Do you figure out what the containing source file is and observe 
> that?


We currently only handle source file + line breakpoints and breakpoints by 
name. For source file + line breakpoints the JIT keeps metadata that says "this 
function contains these source file + line ranges". When we get notified that a 
breakpoint was set, we just need to know the source file + line. If the 
function has already been JIT'ed, we load the debug info for it immediately, 
unless it is already loaded, and the breakpoint resolves itself naturally as 
soon as the debug info is loaded. If the function hasn't been JIT'ed yet, the 
JIT modifies its metadata to mark the function as needed by the debugger; if 
and only if the function later gets JIT'ed, we load the module for it and the 
breakpoint resolves itself naturally. The same applies to breakpoints by name.
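
To make that flow concrete, here is a rough sketch of the bookkeeping. It is 
not code from this PR or from our JIT: every class, field, and method name 
below is hypothetical, and setting `debug_info_loaded` stands in for actually 
loading a module.

```python
# Hypothetical sketch: none of these names are LLDB or JIT APIs.
from dataclasses import dataclass, field


@dataclass
class JITFunction:
    name: str
    # (source file, first line, last line) ranges covered by this function.
    line_ranges: list
    jitted: bool = False
    needed_by_debugger: bool = False
    debug_info_loaded: bool = False


@dataclass
class LazyJITMetadata:
    functions: dict = field(default_factory=dict)

    def add(self, func):
        self.functions[func.name] = func

    def _request(self, func):
        # Mark the function as needed; load now only if it is already JIT'ed.
        func.needed_by_debugger = True
        if func.jitted:
            func.debug_info_loaded = True

    def on_file_line_breakpoint(self, path, line):
        # Called when the debugger reports a source file + line breakpoint.
        for func in self.functions.values():
            if any(f == path and lo <= line <= hi
                   for f, lo, hi in func.line_ranges):
                self._request(func)

    def on_name_breakpoint(self, name):
        # Called when the debugger reports a breakpoint by function name.
        func = self.functions.get(name)
        if func is not None:
            self._request(func)

    def on_function_jitted(self, name):
        # Called by the JIT; debug info is loaded only if someone asked for it.
        func = self.functions[name]
        func.jitted = True
        if func.needed_by_debugger:
            func.debug_info_loaded = True
```

A function that never has a breakpoint set in it never gets 
`needed_by_debugger` set, so JIT'ing it costs nothing on the debug info side.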

> Having a system where "if you set these kinds of breakpoints we'll be able to 
> intervene, but other breakpoint types just won't work" seems awkward. If you 
> are going to only support certain breakpoint types for JIT debugging, it 
> seems much better to make that an explicit JIT breakpoint type writing a 
> custom resolver that cooperates with your JIT engine to register interest and 
> get called back when JIT events occur that are relevant to it.

We have a system that is already working just by being notified about 
breakpoints in the JIT loader. Yes, it doesn't handle all breakpoint types 
right now, but we are getting this to work as a proof of concept with a JIT 
loader that does everything lazily.

So right now only functions that have breakpoints set in them need debug info 
to be generated, and only if they ever get JIT'ed. It works quite well, albeit 
we only support two kinds of breakpoints currently. And if a backtrace goes 
through a JIT'ed frame that we don't have debug info for yet, we can lazily 
load the debug info for that function on the fly.
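
The backtrace case can be sketched the same way. Again, this is only an 
illustration under assumed names (`JITRegion` and 
`load_debug_info_for_backtrace` are hypothetical), with a flag standing in for 
the real debug info load.

```python
# Hypothetical sketch: these names are illustrative, not LLDB APIs.
class JITRegion:
    """Address range of one JIT'ed function."""

    def __init__(self, name, start, size):
        self.name = name
        self.start = start
        self.size = size
        self.debug_info_loaded = False

    def contains(self, pc):
        return self.start <= pc < self.start + self.size


def load_debug_info_for_backtrace(frame_pcs, regions):
    """Load debug info only for JIT'ed regions the backtrace passes through."""
    newly_loaded = []
    for pc in frame_pcs:
        for region in regions:
            if region.contains(pc) and not region.debug_info_loaded:
                region.debug_info_loaded = True  # stand-in for the real load
                newly_loaded.append(region.name)
    return newly_loaded
```

Regions that no frame touches are left alone, so nothing is loaded for them.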

> Either that or we need to introduce the notion of a "dynamic symbol resolver" 
> that you can register information about file names or symbol names, and then 
> have the standard breakpoint resolvers check if one of these exists and 
> registers interest for the names and files it is looking for. But trying to 
> suss out what a resolver is going to do from the outside isn't the right way 
> to go.

Happy to meet and discuss anytime. But this PR isn't doing any of those things 
yet. It just enables python JIT loaders, which we need for other purposes as 
well.

> I think I'd come at it from the opposite direction. We don't currently know 
> what the full set of messages that we want to send are, so making one class 
> that receives all the messages we know about at present seems limiting.
> 
> What I was proposing instead is that when we add a way to register a callback 
> to some event in lldb, we extend the registry to indicate not just the class 
> that will be instantiated to watch the event, but which method is the 
> responder.

How do we store an instance of a class and then call a method on it? We have no 
notion of a baton in python callbacks right now. If we allowed python callbacks 
to be registered with any python object as a baton, then this could be made to 
work, but we probably shouldn't try to call a method on some object, as there 
is no way to specify that on the command line. We should also be able to do 
this through the public API, via callbacks with batons.
> 
> That way, for instance, you can register a stop-hook with your class, and 
> then you will have launch and attach callbacks already. But you need a way to 
> say "Use a common instance of this class per-whatever entity owns that 
> callback" and "use this method(s) on my object".

That is what I don't know how to implement. How would we do this on the command 
line? Would we need a global variable to contain the class instance? 

> 
> That way we don't have to hook everything up for you, but rather it will be 
> easy for designers to make a class where they can hook up the particular 
> callbacks they need.

I am fine with this as long as the solution doesn't require using command line 
commands to do it and we have APIs. How about public APIs where we register 
callbacks and make sure that, when doing it through python, we can specify a 
python object as a baton that gets given back when the callback is called? 
Right now batons for python are nonexistent because we use the native baton 
for the python implementation.
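
As a rough illustration of what I mean by a python object as a baton, here is 
a sketch in plain python. `CallbackRegistry` and everything else in it is 
made up for the example; it is not an existing or proposed SB API, only the 
shape of the idea: the registry stores whatever object it is given and hands 
it back as the first argument of the callback.

```python
# Hypothetical sketch: CallbackRegistry is not an LLDB API.
class CallbackRegistry:
    """Maps an event name to (callback, baton) pairs."""

    def __init__(self):
        self._callbacks = {}

    def register(self, event, callback, baton=None):
        # The baton can be any Python object; it is stored untouched.
        self._callbacks.setdefault(event, []).append((callback, baton))

    def fire(self, event, *args):
        # Hand each callback its own baton back, followed by the event args.
        for callback, baton in self._callbacks.get(event, []):
            callback(baton, *args)


class JITWatcher:
    """Example baton: state shared by several callbacks."""

    def __init__(self):
        self.seen = []


def on_launch(watcher, pid):
    watcher.seen.append(("launch", pid))


def on_attach(watcher, pid):
    watcher.seen.append(("attach", pid))


registry = CallbackRegistry()
watcher = JITWatcher()
registry.register("launch", on_launch, baton=watcher)
registry.register("attach", on_attach, baton=watcher)
registry.fire("launch", 1234)
```

Both callbacks share one `JITWatcher` instance without needing a global 
variable or any command line syntax for naming a method.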





https://github.com/llvm/llvm-project/pull/142514
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits