For my app, I think it's largely parsing debug symbol tables for shared libraries. My main performance improvement was to increase the parallelism of parsing that information.
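To give a rough idea of what "increase the parallelism" means here, below is a minimal sketch in Python. It is not the actual lldb change (lldb is C++): `parse_symbol_table`, `parse_all`, and the library names are hypothetical stand-ins, and it only assumes that each shared library's symbol table can be parsed independently of the others.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_symbol_table(path):
    # Hypothetical stand-in for the real per-library parsing work.
    # Here it just returns the path and a dummy "symbol count".
    return path, len(path)


def parse_all(paths, max_workers=8):
    # Each shared library is assumed to be independent of the others,
    # so the libraries are handed to a pool of workers instead of being
    # parsed one after another.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(parse_symbol_table, paths))


if __name__ == "__main__":
    libs = ["libfoo.so", "libbar.so", "libbaz.so"]  # hypothetical names
    print(parse_all(libs))
```

In Python, threads would not speed up CPU-bound parsing because of the interpreter lock; the sketch only shows the structure (independent units of work dispatched to a pool), which is the part that carries over.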
Funny, gdb/gold has a similar accelerator table (created when you link with -gdb-index). I assume lldb doesn't know how to parse it. I'll work on bisecting the change.

On Wed, Apr 12, 2017 at 12:26 PM, Jason Molenda <ja...@molenda.com> wrote:

> I don't know exactly when the 3.9 / 4.0 branches were cut, and what was
> done between those two points, but in general we don't expect/want to see
> performance regressions like that. I'm more familiar with the perf
> characteristics on macos, Linux is different in some important regards, so
> I can only speak in general terms here.
>
> In your example, you're measuring three things, assuming you have debug
> information for MY_PROGRAM. The first is "Do the initial read of the main
> binary and its debug information". The second is "Find all symbol names
> 'main'". The third is "Scan a newly loaded solib's symbols" (assuming you
> don't have debug information from solibs from /usr/lib etc). Technically
> there's some additional stuff here -- launching the process, detecting
> solibs as they're loaded, looking up the symbol context when we hit the
> breakpoint, backtracing a frame or two, etc, but that stuff is rarely where
> you'll see perf issues on a local debug session.
>
> Which of these is likely to be important will depend on your MY_PROGRAM.
> If you have a 'int main(){}', it's not going to be dwarf parsing. If your
> binary only pulls in three solib's by the time it is running, it's not
> going to be new module scanning. A popular place to spend startup time is
> in C++ name demangling if you have a lot of solibs with C++ symbols.
>
> On Darwin systems, we have a nonstandard accelerator table in our DWARF
> emitted by clang that lldb reads. The "apple_types", "apple_names" etc
> tables. So when we need to find a symbol named "main", for Modules that
> have a SymbolFile, we can look in the accelerator table. If that
> SymbolFile has a 'main', the accelerator table gives us a reference into
> the DWARF for the definition, and we can consume the DWARF lazily. We
> should never need to do a full scan over the DWARF, that's considered a
> failure.
>
> (in fact, I'm working on a branch of the llvm.org sources from
> mid-October and I suspect Darwin lldb is often consuming a LOT more dwarf
> than it should be when I'm debugging, I need to figure out what is causing
> that, it's a big problem.)
>
> In general, I've been wanting to add a new "perf counters" infrastructure
> & testsuite to lldb, but haven't had time. One thing I work on a lot is
> debugging over a bluetooth connection; it turns out that BT is very slow,
> and any extra packets we send between lldb and debugserver are very
> costly. The communication is so fast over a local host, or over a usb
> cable, that it's easy for regressions to sneak in without anyone noticing.
> So the original idea was hey, we can have something that counts packets for
> distinct operations. Like, this "next" command should take no more than 40
> packets, that kind of thing. And it could be expanded -- "b main should
> fully parse the DWARF for only 1 symbol", or "p *this should only look up 5
> types", etc.
>
> > On Apr 12, 2017, at 11:26 AM, Scott Smith via lldb-dev <lldb-dev@lists.llvm.org> wrote:
> >
> > I worked on some performance improvements for lldb 3.9, and was about to
> > forward port them so I can submit them for inclusion, but I realized there
> > has been a major performance drop from 3.9 to 4.0. I am using the official
> > builds on an Ubuntu 16.04 machine with 16 cores / 32 hyperthreads.
> >
> > Running: time lldb-4.0 -b -o 'b main' -o 'run' MY_PROGRAM > /dev/null
> >
> > With 3.9, I get:
> > real 0m31.782s
> > user 0m50.024s
> > sys 0m4.348s
> >
> > With 4.0, I get:
> > real 0m51.652s
> > user 1m19.780s
> > sys 0m10.388s
> >
> > (with my changes + 3.9, I got real down to 4.8 seconds! But I'm not
> > convinced you'll like all the changes.)
> >
> > Is this expected? I get roughly the same results when compiling
> > llvm+lldb from source.
> >
> > I guess I can spend some time trying to bisect what happened. 5.0 looks
> > to be another 8% slower.
> >
> > _______________________________________________
> > lldb-dev mailing list
> > lldb-dev@lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
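For reference, the size of the 3.9 to 4.0 regression implied by the timings quoted above can be worked out with a few lines of Python. This is only a sketch: the helper names (`to_seconds`, `slowdown`) are made up for illustration, and the numbers are copied from the quoted `time` output.

```python
def to_seconds(stamp):
    # Convert the "XmY.YYYs" format printed by the shell's `time`
    # (e.g. "1m19.780s") into plain seconds.
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Timings copied from the quoted message.
lldb_39 = {"real": "0m31.782s", "user": "0m50.024s", "sys": "0m4.348s"}
lldb_40 = {"real": "0m51.652s", "user": "1m19.780s", "sys": "0m10.388s"}


def slowdown(before, after):
    # Ratio after/before for each timing category; > 1.0 means slower.
    return {key: to_seconds(after[key]) / to_seconds(before[key]) for key in before}


if __name__ == "__main__":
    for name, ratio in slowdown(lldb_39, lldb_40).items():
        print(f"{name}: {ratio:.2f}x")
```

That works out to roughly 1.6x for both real and user time, and about 2.4x for sys time.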