Ok, I stripped out the zlib crc algorithm and just left the parallelism + calls to zlib's crc32_combine, but only if we are actually linking with zlib. I left those calls here (rather than folding them into JamCRC) because I'm taking advantage of TaskRunner to parallelize the work.
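For readers following along, here is a minimal sketch of the "checksum regions separately, then combine" idea. It is not the attached patch. The names crc32_bytewise, crc32_combine_naive and crc32_chunked are hypothetical, and the combine step is a deliberately simple stand-in that costs O(len2); zlib's crc32_combine reaches the same result much faster. The chunks are processed in a plain loop here, but each per-chunk CRC is independent, so in principle each could be handed to a separate worker.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Lookup table for the reflected CRC-32 polynomial 0xEDB88320.
struct ByteTable {
  uint32_t v[256];
  ByteTable() {
    for (uint32_t i = 0; i < 256; ++i) {
      uint32_t c = i;
      for (int k = 0; k < 8; ++k)
        c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
      v[i] = c;
    }
  }
};

static const ByteTable &byte_table() {
  static const ByteTable t; // built once, on first use
  return t;
}

// Standard CRC-32 (initial value and final XOR are both 0xFFFFFFFF).
uint32_t crc32_bytewise(const uint8_t *p, size_t n) {
  uint32_t c = 0xFFFFFFFFu;
  for (size_t i = 0; i < n; ++i)
    c = byte_table().v[(c ^ p[i]) & 0xFF] ^ (c >> 8);
  return c ^ 0xFFFFFFFFu;
}

// Hypothetical, slow combine: given crc1 = CRC(A), crc2 = CRC(B) and
// len2 = length of B, return CRC(A followed by B). It pushes crc1 through
// len2 zero bytes and XORs in crc2.
uint32_t crc32_combine_naive(uint32_t crc1, uint32_t crc2, size_t len2) {
  for (size_t i = 0; i < len2; ++i)
    crc1 = byte_table().v[crc1 & 0xFF] ^ (crc1 >> 8);
  return crc1 ^ crc2;
}

// CRC of the whole buffer, computed as independent per-chunk CRCs that
// are stitched together afterwards.
uint32_t crc32_chunked(const uint8_t *p, size_t n, size_t chunk_size) {
  assert(chunk_size > 0);
  std::vector<uint32_t> crcs;
  std::vector<size_t> lens;
  for (size_t off = 0; off < n; off += chunk_size) {
    size_t len = std::min(chunk_size, n - off);
    crcs.push_back(crc32_bytewise(p + off, len)); // independent work item
    lens.push_back(len);
  }
  uint32_t result = 0; // CRC of the empty string
  for (size_t i = 0; i < crcs.size(); ++i)
    result = crc32_combine_naive(result, crcs[i], lens[i]);
  return result;
}
```

For any chunk size, crc32_chunked should return the same value as crc32_bytewise over the whole buffer.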
I moved the system include block after the llvm includes, both because I had to (to use the config #defines), and because it fit the published coding convention.

By itself, it reduces my test time from 55 to 47 seconds. (The original time is slower than before because I pulled the latest code, guess there's another slowdown to fix).

On Wed, Apr 12, 2017 at 12:15 PM, Scott Smith <scott.sm...@purestorage.com> wrote:

> The algorithm included in ObjectFileELF.cpp performs a byte at a time
> computation, which causes long pipeline stalls in modern processors.
> Unfortunately, the polynomial used is not the same one used by the SSE 4.2
> instruction set, but there are two ways to make it faster:
>
> 1. Work on multiple bytes at a time, using multiple lookup tables. (see
> http://create.stephan-brumme.com/crc32/#slicing-by-8-overview)
> 2. Compute crcs over separate regions in parallel, then combine the
> results. (see
> http://stackoverflow.com/questions/23122312/crc-calculation-of-a-mostly-static-data-stream)
>
> As it happens, zlib provides functions for both:
> 1. The zlib crc32 function uses the same polynomial as ObjectFileELF.cpp,
> and uses slicing-by-4 along with loop unrolling.
> 2. The zlib library provides crc32_combine.
>
> I decided to just call out to the zlib library, since I see my version of
> lldb already links with zlib; however, the llvm CMakeLists.txt declares it
> optional.
>
> I'm including my patch that assumes zlib is always linked in. Let me know
> if you prefer:
> 1. I make the change conditional on having zlib (i.e. fall back to the old
> code if zlib is not present)
> 2. I copy all the code from zlib and put it in ObjectFileELF.cpp.
> However, I'm going to guess that requires updating some documentation to
> include zlib's copyright notice.
>
> This brings startup time on my machine / my binary from 50 seconds down to
> 32.
> (time ~/llvm/build/bin/lldb -b -o 'b main' -o 'run' MY_PROGRAM)
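As an illustration of the "multiple bytes at a time, using multiple lookup tables" approach mentioned in the quoted message, here is a rough slicing-by-4 sketch. It is not zlib's code and not the attached patch. The names Slice4Tables, slice4_tables and crc32_slice4 are hypothetical, and the input word is assembled byte by byte so the sketch does not depend on host endianness (a tuned implementation would likely load whole words instead).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Four 256-entry tables for the reflected CRC-32 polynomial 0xEDB88320.
// t[0] is the usual byte-at-a-time table; t[k] is t[0] advanced by k
// further zero bytes.
struct Slice4Tables {
  uint32_t t[4][256];
  Slice4Tables() {
    for (uint32_t i = 0; i < 256; ++i) {
      uint32_t c = i;
      for (int k = 0; k < 8; ++k)
        c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
      t[0][i] = c;
    }
    for (uint32_t i = 0; i < 256; ++i)
      for (int k = 1; k < 4; ++k)
        t[k][i] = (t[k - 1][i] >> 8) ^ t[0][t[k - 1][i] & 0xFF];
  }
};

static const Slice4Tables &slice4_tables() {
  static const Slice4Tables tables; // built once, on first use
  return tables;
}

// Standard CRC-32, consuming four input bytes per loop iteration.
uint32_t crc32_slice4(const uint8_t *p, size_t n) {
  const Slice4Tables &T = slice4_tables();
  uint32_t c = 0xFFFFFFFFu;
  while (n >= 4) {
    // Fold the next four bytes into the register (little-endian order).
    c ^= uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
         (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
    // Four independent table lookups replace four dependent byte steps.
    c = T.t[3][c & 0xFF] ^ T.t[2][(c >> 8) & 0xFF] ^
        T.t[1][(c >> 16) & 0xFF] ^ T.t[0][c >> 24];
    p += 4;
    n -= 4;
  }
  // Handle the remaining 0 to 3 bytes one at a time.
  while (n--)
    c = T.t[0][(c ^ *p++) & 0xFF] ^ (c >> 8);
  return c ^ 0xFFFFFFFFu;
}
```

The four lookups inside the main loop do not depend on each other, which is what lets the processor overlap them instead of stalling on every byte.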
zlib_crc.patch
Description: Binary data