Great, thanks. This means that the lldb-server issues are not in scope for this feature, right?
On Wed, Sep 19, 2018 at 10:09 AM, Jonas Devlieghere <jdevliegh...@apple.com> wrote:

>
> On Sep 19, 2018, at 6:49 PM, Leonard Mosescu <mose...@google.com> wrote:
>
> Sounds like a fantastic idea.
>
> How would this work when the behavior of the debuggee process is
> non-deterministic?
>
>
> All the communication between the debugger and the inferior goes through
> the GDB remote protocol. Because we capture and replay this, we can
> reproduce without running the executable, which is particularly
> convenient when you were originally debugging something on a different
> device, for example.
>
>
> On Wed, Sep 19, 2018 at 6:50 AM, Jonas Devlieghere via lldb-dev <
> lldb-dev@lists.llvm.org> wrote:
>
>> Hi everyone,
>>
>> We all know how hard it can be to reproduce an issue or crash in LLDB.
>> There are a lot of moving parts and subtle differences can easily add
>> up. We want to make this easier by generating reproducers in LLDB,
>> similar to what clang does today.
>>
>> The core idea is as follows: during normal operation we capture whatever
>> information is needed to recreate the current state of the debugger.
>> When something goes wrong, this becomes available to the user. Someone
>> else should then be able to reproduce the same issue with only this
>> data, for example on a different machine.
>>
>> It's important to note that we want to replay the debug session from the
>> reproducer, rather than just recreating the current state. This ensures
>> that we have access to all the events leading up to the problem, which
>> are usually far more important than the error state itself.
>>
>> # High Level Design
>>
>> Concretely we want to extend LLDB in two ways:
>>
>> 1. We need to add infrastructure to _generate_ the data necessary for
>>    reproducing.
>> 2. We need to add infrastructure to _use_ the data in the reproducer to
>>    replay the debugging session.
>>
>> Different parts of LLDB will have different definitions of what data
>> they need to reproduce their path to the issue. For example, capturing
>> the commands executed by the user is very different from tracking the
>> dSYM bundles on disk. Therefore, we propose to have each component deal
>> with its needs in a localized way. This has the advantage that the
>> functionality can be developed and tested independently.
>>
>> ## Providers
>>
>> We'll call a combination of (1) and (2) for a given component a
>> `Provider`. For example, we'd have a provider for user commands and a
>> provider for dSYM files. A provider will know how to keep track of its
>> information, how to serialize it as part of the reproducer as well as
>> how to deserialize it again and use it to recreate the state of the
>> debugger.
>>
>> With one exception, the lifetime of the provider coincides with that of
>> the `SBDebugger`, because that is the scope of what we consider here to
>> be a single debug session. The exception would be the provider for the
>> global module cache, because it is shared between multiple debuggers.
>> Although it would be conceptually straightforward to add a provider for
>> the shared module cache, this significantly increases the complexity of
>> the reproducer framework because of its implication on the lifetime and
>> everything related to that.
>>
>> For now we will ignore this problem, which means we will not replay the
>> construction of the shared module cache but rather build it up during
>> replaying, as if the current debug session was the first and only one
>> using it. The impact of doing so is significant, as no issue caused by
>> the shared module cache will be reproducible, but it does not limit
>> reproducing any issue unrelated to it.
>>
>> ## Reproducer Framework
>>
>> To coordinate between the data from different components, we'll need to
>> introduce a global reproducer infrastructure. We have a component
>> responsible for reproducer generation (the `Generator`) and for using
>> the reproducer (the `Loader`). They are essentially two ways of looking
>> at the same unit of replayable work.
>>
>> The Generator keeps track of its providers and whether or not we need to
>> generate a reproducer. When a problem occurs, LLDB will request the
>> Generator to generate a reproducer. When LLDB finishes successfully, the
>> Generator cleans up anything it might have created during the session.
>> Additionally, the Generator populates an index, which is part of the
>> reproducer, and used by the Loader to discover what information is
>> available.
>>
>> When a reproducer is passed to LLDB, we want to use its data to replay
>> the debug session. This is coordinated by the Loader. Through the index
>> created by the Generator, different components know what data
>> (Providers) are available, and how to use them.
>>
>> It's important to note that in order to create a complete reproducer, we
>> will require data from our dependencies (llvm, clang, swift) as well.
>> This means that either (a) the infrastructure needs to be accessible
>> from our dependencies or (b) that an API is provided that allows us to
>> query this. We plan to address this issue when it arises for the
>> respective Generator.
>>
>> # Components
>>
>> We have identified a list of minimal components needed to make
>> reproducing possible. We've divided those into two groups: explicit and
>> implicit inputs.
>>
>> Explicit inputs are inputs from the user to the debugger.
>>
>> - Command line arguments
>> - Settings
>> - User commands
>> - Scripting Bridge API
>>
>> In addition to the components listed above, LLDB has a bunch of inputs
>> that are not passed explicitly. It's often these that make reproducing
>> an issue complex.
>>
>> - GDB Remote Packets
>> - Files containing debug information (object files, dSYM bundles)
>> - Clang headers
>> - Swift modules
>>
>> Every component would have its own provider and is free to implement it
>> as it sees fit. For example, as we expect to have a large number of GDB
>> remote packets, the provider might choose to write these to disk as they
>> come in, while the settings can easily be kept in memory until it is
>> decided that we need to generate a reproducer.
>>
>> # Concerns, Implications & Risks
>>
>> ## Performance Impact
>>
>> As the reproducer functionality will have to be always-on, we have to
>> consider performance implications. As mentioned earlier, each provider
>> is free to be implemented in the way that works best for its respective
>> component. We'll have to measure to know how big the impact is.
>>
>> ## Privacy
>>
>> The reproducer might contain sensitive user information. We should make
>> it clear to the user what kind of data is contained in the reproducer.
>> Initially we will focus on the LLDB developer community and the people
>> already filing bugs.
>>
>> ## Versions
>>
>> Because the reproducer works by replaying a debug session, the versions
>> of the debugger generating and replaying the session will have to match.
>> Not only is this important for the serialization format, but more
>> importantly a different LLDB might ask different questions in a
>> different order.
>>
>> # Implementation
>>
>> I've put up a patch (<https://reviews.llvm.org/D50254>) which contains a
>> minimal implementation of the reproducer framework as well as the GDB
>> remote provider.
>>
>> It records the GDB packets and writes them to a YAML file (we can switch
>> to a more performant encoding down the road). When invoking the LLDB
>> driver and passing the reproducer directory to `--reproducer`, this file
>> is read and a dummy server replies with the next packet from this file,
>> without talking to the executable.
>>
>> It's still pretty rudimentary and only works if you enter the exact same
>> commands (so the server receives the exact same requests from the
>> client).
>>
>> The next steps are (in broad strokes):
>>
>> 1. Capturing the debugged binary.
>> 2. Record and replay user commands and SB-API calls.
>> 3. Recording the configuration of the debugger.
>> 4. Capturing other files used by LLDB.
>>
>> Please let me know what you think!
>>
>> Thanks,
>> Jonas
>> _______________________________________________
>> lldb-dev mailing list
>> lldb-dev@lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>>
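To make the record-and-replay idea from the quoted proposal concrete, here is a minimal sketch in Python. It is not taken from the patch: the names `PacketRecorder`, `ReplayServer` and `ReplayMismatch` are hypothetical, JSON stands in for the YAML file only to stay within the standard library, and the strict one-request/one-reply alternation is a simplifying assumption (the real protocol also has acks and asynchronous packets). The packet payloads are only illustrative.

```python
import json


class ReplayMismatch(Exception):
    """Raised when the replayed session diverges from the recording."""


class PacketRecorder:
    """Hypothetical stand-in for a GDB remote provider: logs packets in order."""

    def __init__(self):
        self.packets = []

    def record(self, direction, payload):
        # direction is "send" (debugger -> server) or "recv" (server -> debugger).
        self.packets.append({"direction": direction, "payload": payload})

    def serialize(self):
        # Assumption: JSON instead of the YAML file mentioned in the proposal.
        return json.dumps(self.packets)


class ReplayServer:
    """Dummy server that answers from the recording, not from a live process."""

    def __init__(self, serialized):
        self._packets = json.loads(serialized)
        self._pos = 0

    def request(self, payload):
        # A request and its reply occupy two consecutive entries.
        if self._pos + 1 >= len(self._packets):
            raise ReplayMismatch("recording exhausted")
        sent = self._packets[self._pos]
        reply = self._packets[self._pos + 1]
        # Replay only works if the client asks the same questions in order.
        if sent["direction"] != "send" or sent["payload"] != payload:
            raise ReplayMismatch(
                f"expected {sent['payload']!r}, got {payload!r}"
            )
        if reply["direction"] != "recv":
            raise ReplayMismatch("recording does not alternate send/recv")
        self._pos += 2
        return reply["payload"]


# Capture a tiny session, then replay it without any debuggee.
recorder = PacketRecorder()
recorder.record("send", "qSupported")
recorder.record("recv", "PacketSize=1000")
recorder.record("send", "?")
recorder.record("recv", "S05")

server = ReplayServer(recorder.serialize())
first_reply = server.request("qSupported")   # "PacketSize=1000"
second_reply = server.request("?")           # "S05"
```

Raising on the first divergent request mirrors the limitation stated in the proposal: replay only holds as long as the client sends exactly the same requests in the same order, which is also why the versions of the recording and replaying debugger have to match.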