[lldb-dev] Parallelizing loading of shared libraries
After dealing with a bunch of microoptimizations, I'm back to parallelizing loading of shared modules. My naive approach was to just create a new thread per shared library. I have a feeling some users may not like that; I think I read an email from someone who has thousands of shared libraries. That's a lot of threads :-)

The problem is that loading a shared library can cause downstream parallelization through TaskPool. I can't then also have the loading of a shared library itself go through TaskPool, as that could cause a deadlock - if all the worker threads are waiting on work that TaskPool needs to run on a worker thread, then nothing will happen.

Three possible solutions:

1. Remove the notion of a single global TaskPool, and instead have a static pool at each callsite that wants one. That way multiple paths into the same code would share the same pool, but different places in the code would have their own pool.

2. Change the wait code for TaskRunner to note whether it is already on a TaskPool thread, and if so, spawn another one. However, I don't think that fully solves the issue of having too many threads loading shared libraries, as there is no guarantee the new worker would work on the "deepest" work. I suppose each task could be annotated with depth, and the work could be sorted in TaskPool, though...

3. Leave a separate thread per shared library.

Thoughts?

___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
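For concreteness, the deadlock and one flavor of mitigation 2 can be sketched as below: a thread-local flag marks pool workers, so that submitting work from inside the pool runs it inline instead of enqueue-and-wait. The `Pool` class and all names here are hypothetical illustrations, not LLDB's actual TaskPool API.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical sketch, not LLDB's TaskPool. If every worker enqueued a
// subtask and then blocked waiting for it, no worker would be free to run
// any subtask: deadlock. The thread_local flag lets Run() detect that it
// is already on a pool thread and execute the subtask inline instead.
class Pool {
public:
  explicit Pool(unsigned n) {
    for (unsigned i = 0; i < n; ++i)
      workers_.emplace_back([this] {
        is_worker = true;  // mark pool threads
        for (;;) {
          std::function<void()> task;
          {
            std::unique_lock<std::mutex> lock(mu_);
            cv_.wait(lock, [this] { return stop_ || !queue_.empty(); });
            if (stop_ && queue_.empty())
              return;
            task = std::move(queue_.front());
            queue_.pop();
          }
          task();
        }
      });
  }
  ~Pool() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stop_ = true;
    }
    cv_.notify_all();
    for (auto &w : workers_)
      w.join();
  }
  void Run(std::function<void()> task) {
    if (is_worker) {  // already on a pool thread: run inline, don't block
      task();
      return;
    }
    {
      std::lock_guard<std::mutex> lock(mu_);
      queue_.push(std::move(task));
    }
    cv_.notify_one();
  }
  static thread_local bool is_worker;

private:
  std::vector<std::thread> workers_;
  std::queue<std::function<void()>> queue_;
  std::mutex mu_;
  std::condition_variable cv_;
  bool stop_ = false;
};
thread_local bool Pool::is_worker = false;

// Even with a single worker, a nested Run() completes instead of deadlocking.
int NestedWork() {
  int total = 0;
  std::mutex mu;
  std::condition_variable done_cv;
  bool done = false;
  Pool pool(1);
  pool.Run([&] {
    pool.Run([&] { total += 2; });  // runs inline on the worker
    std::lock_guard<std::mutex> lock(mu);
    total += 1;
    done = true;
    done_cv.notify_one();
  });
  std::unique_lock<std::mutex> lock(mu);
  done_cv.wait(lock, [&] { return done; });
  return total;
}
```

As noted above, running inline avoids the deadlock but does nothing to cap the total number of threads chewing on shared libraries; that is the part depth annotation would have to address.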
Re: [lldb-dev] Parallelizing loading of shared libraries
Under what conditions would a worker thread spawn additional work to be run in parallel and then wait for it, as opposed to just doing it serially? Is it feasible to just require tasks to be non-blocking?

On Wed, Apr 26, 2017 at 4:12 PM Scott Smith via lldb-dev <lldb-dev@lists.llvm.org> wrote:
> [full quote of the previous message trimmed]
Re: [lldb-dev] Parallelizing loading of shared libraries
A worker thread would call DynamicLoader::LoadModuleAtAddress. This in turn eventually calls SymbolFileDWARF::Index, which uses TaskRunners to:

1. extract DIEs for each DWARF compile unit in a separate thread
2. parse/unmangle/etc. all the symbols

The code distance from DynamicLoader to SymbolFileDWARF is enough that disallowing LoadModuleAtAddress to block seems to be a nonstarter.

On Wed, Apr 26, 2017 at 4:23 PM, Zachary Turner wrote:
> [full quote of the previous messages trimmed]
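The depth-annotation idea floated in option 2 amounts to a small queue change: store (depth, task) pairs in a priority queue so workers always drain the deepest pending work (e.g. per-CU DWARF indexing) before starting more top-level module loads. A minimal single-threaded sketch, with hypothetical names rather than LLDB's API:

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical sketch: each task is annotated with the nesting depth at
// which it was spawned, and the pool pops the deepest task first.
class DepthQueue {
public:
  void Push(unsigned depth, std::function<int()> task) {
    tasks_.push({depth, std::move(task)});
  }
  // Pops and runs the deepest pending task; returns its result.
  int RunDeepest() {
    Entry entry = tasks_.top();
    tasks_.pop();
    return entry.second();
  }
  bool Empty() const { return tasks_.empty(); }

private:
  using Entry = std::pair<unsigned, std::function<int()>>;
  struct ByDepth {
    bool operator()(const Entry &a, const Entry &b) const {
      return a.first < b.first;  // max-heap on depth: deepest first
    }
  };
  std::priority_queue<Entry, std::vector<Entry>, ByDepth> tasks_;
};

// Push work at depth 0 (load module), depth 2 (parse one CU), and depth 1
// (index symtab); the queue hands back the depth-2 task first.
int DeepestFirst() {
  DepthQueue q;
  q.Push(0, [] { return 0; });
  q.Push(2, [] { return 2; });
  q.Push(1, [] { return 1; });
  return q.RunDeepest();
}
```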
Re: [lldb-dev] Parallelizing loading of shared libraries
We started out with the philosophy that lldb wouldn't touch any more information in a shared library than we actually needed. So when a library gets loaded we might need to read in and resolve its section list, but we won't read in any symbols if we don't need to look at them. The idea was that if you did "load a binary, and run it" until the binary stops for some reason, we haven't done any unnecessary work. Similarly, if all the breakpoints the user sets are scoped to a shared library, then there's no need for us to read any symbols for any other shared libraries. I think that is a good goal; it allows the debugger to be used in special-purpose analysis tools w/o forcing it to pay costs that a more general-purpose debug session might require.

I think it would be hard to convert all the usages of modules from "do something with a shared library" mode to "tell me you are interested in a shared library and give me a callback" so that the module reading could be parallelized on demand. But at the very least we need to allow a mode where symbol reading is done lazily.

The other concern is that lldb keeps the modules it reads in a global cache, shared by all debuggers & targets. It is very possible that you could have two targets, or two debuggers each with one target, that are reading in shared libraries simultaneously and adding them to the global cache. In some of the uses that lldb has under Xcode this is actually very common. So the task pool will have to be built up as things are added to the global shared module cache, not at the level of individual targets noticing the read-in of a shared library.

Jim

> On Apr 26, 2017, at 4:12 PM, Scott Smith via lldb-dev wrote:
> [full quote of the previous message trimmed]
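The two constraints Jim describes - one process-wide module cache shared by all targets, plus lazy symbol reading that happens at most once per module no matter how many targets race to request it - can be sketched roughly as follows. All names here (including the example library path) are illustrative, not LLDB's real API:

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical sketch of a global shared-module cache with lazy,
// once-only symbol indexing. Not LLDB's actual classes.
struct Module {
  explicit Module(std::string p) : path(std::move(p)) {}
  std::string path;
  int index_runs = 0;  // how many times indexing actually executed
  std::once_flag indexed;

  void EnsureSymbolsIndexed() {
    // call_once makes concurrent callers from different targets safe:
    // the expensive parse runs exactly once per cached module.
    std::call_once(indexed, [this] { ++index_runs; /* parse symbols */ });
  }
};

class ModuleCache {
public:
  // Returns the shared Module for a path, creating it on first use.
  std::shared_ptr<Module> GetOrCreate(const std::string &path) {
    std::lock_guard<std::mutex> lock(mu_);
    auto &slot = modules_[path];
    if (!slot)
      slot = std::make_shared<Module>(path);
    return slot;
  }

private:
  std::mutex mu_;
  std::map<std::string, std::shared_ptr<Module>> modules_;
};

// Two "targets" asking for the same library share one Module, and the
// symbol index is built only once even if both request it.
int IndexRunsForSharedModule() {
  ModuleCache cache;  // stand-in for the process-wide cache
  auto a = cache.GetOrCreate("/usr/lib/libfoo.so");  // hypothetical path
  auto b = cache.GetOrCreate("/usr/lib/libfoo.so");
  a->EnsureSymbolsIndexed();
  b->EnsureSymbolsIndexed();
  return (a == b) ? a->index_runs : -1;
}
```

This is also why hanging the parallelism off the cache, as Jim suggests, composes better than per-target threads: the cache is the one place that sees every concurrent read-in.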
Re: [lldb-dev] [llvm-dev] LLDB security and the use of an IPC library
(+LLDB-Dev) most of the LLDB developers hang out there more than on LLVM-Dev.

> On Apr 26, 2017, at 12:26 PM, Demi Marie Obenour via llvm-dev wrote:
>
> LLDB currently uses a client-server architecture. That appears fine, but runs into an annoying security problem: other users on the same machine can connect to the TCP socket and take over LLDB and thus the user's system. This means that LLDB is useless in multiuser environments on Linux, such as academic computer labs.
>
> The immediate problem can be solved by using either HMAC authentication of all messages or by using Unix domain sockets. However, it might be simpler to use a 3rd-party library for the purpose: https://github.com/DemiMarie/SlipRock (Disclaimer: I wrote SlipRock).
>
> Questions:
>
> - Would you be interested in using SlipRock?

Probably not. Generally we shy away from using third-party libraries.

> - What features would SlipRock need in order to be useful to you? In particular, do you need an asynchronous API, or is synchronous fine?

The biggest thing that I would see as a barrier to us using SlipRock is that it would have to provide a large advantage in order to justify using it. We generally don't use middleware libraries that are not readily available on the platforms that we support, usually either via a package manager or shipping on the OS.

> - If not, would you be willing to accept patches to fix the existing bug?

I was actually looking at this code earlier this week. On OS X we do use a Unix socketpair to construct domain sockets between debugserver and lldb. Presently lldb-server (the debugserver implementation used everywhere other than OS X) doesn't support accepting a socket pair, but we can and should fix that. I've been working recently on making more of LLDB's code properly configured based on system capabilities rather than hard-coded assumptions. This will make it easier for us to do these kinds of things right in the future.

If you're interested in working on getting lldb-server working with socketpair, we would certainly take the patches, and I'd be happy to provide review or guidance as needed.

Thanks,
-Chris

> Sincerely,
>
> Demi Obenour
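The socketpair approach Chris describes can be illustrated in generic POSIX terms: create a connected pair of anonymous Unix-domain sockets before spawning the server, so there is never a listening endpoint another local user could connect to. This is only an in-process illustration of the primitive, not lldb-server's actual code:

```cpp
#include <string>
#include <sys/socket.h>
#include <unistd.h>

// Generic POSIX illustration, not lldb-server's code. A connected AF_UNIX
// socket pair has no filesystem name and no listening socket, so no other
// local user can connect to it -- unlike a TCP listener bound on localhost.
std::string RoundTrip(const std::string &msg) {
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
    return "";
  // In lldb's case, one end would be inherited by the spawned debug
  // server; here we simply echo through the pair in-process.
  (void)write(fds[0], msg.data(), msg.size());
  char buf[128] = {0};
  ssize_t n = read(fds[1], buf, sizeof(buf) - 1);
  close(fds[0]);
  close(fds[1]);
  return std::string(buf, n > 0 ? static_cast<size_t>(n) : 0);
}
```

The pair is created by a single process and handed to its child, so access control reduces to ordinary file-descriptor inheritance rather than network authentication.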