OK, I stripped out the zlib CRC algorithm and left only the parallelism plus
the calls to zlib's crc32_combine, and those only if we are actually linking
with zlib.  I left those calls here (rather than folding them into JamCRC)
because I'm taking advantage of TaskRunner to parallelize the work.
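
A rough sketch of the chunk-and-combine idea, not the code in the patch:
std::async stands in for TaskRunner, and Crc32CombineSlow is a simplified,
hypothetical stand-in for zlib's crc32_combine so the example builds without
zlib.  It gives the same result for CRC-32 but runs in time linear in the
length of the second region.  All function names below are made up for
illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <vector>

// Bitwise update of the reflected CRC-32 register (polynomial 0xEDB88320),
// with no initial or final inversion.
static uint32_t RawUpdate(uint32_t reg, const uint8_t *data, size_t len) {
  for (size_t i = 0; i < len; ++i) {
    reg ^= data[i];
    for (int bit = 0; bit < 8; ++bit)
      reg = (reg >> 1) ^ (0xEDB88320u & (0u - (reg & 1u)));
  }
  return reg;
}

// Standard CRC-32: invert the register on entry and on exit.
static uint32_t Crc32(const uint8_t *data, size_t len) {
  return ~RawUpdate(~0u, data, len);
}

// Hypothetical stand-in for crc32_combine(crc1, crc2, len2): push len2 zero
// bytes through crc1, then XOR in crc2.
static uint32_t Crc32CombineSlow(uint32_t crc1, uint32_t crc2, size_t len2) {
  for (size_t i = 0; i < len2; ++i)
    for (int bit = 0; bit < 8; ++bit)
      crc1 = (crc1 >> 1) ^ (0xEDB88320u & (0u - (crc1 & 1u)));
  return crc1 ^ crc2;
}

// Split the buffer into up to num_chunks regions, checksum each region on
// its own task, then fold the partial results together in order.
static uint32_t ParallelCrc32(const uint8_t *data, size_t len,
                              size_t num_chunks) {
  assert(num_chunks > 0);
  const size_t chunk = (len + num_chunks - 1) / num_chunks;
  std::vector<std::future<uint32_t>> parts;
  std::vector<size_t> sizes;
  for (size_t off = 0; off < len; off += chunk) {
    const size_t n = std::min(chunk, len - off);
    sizes.push_back(n);
    parts.push_back(std::async(std::launch::async, Crc32, data + off, n));
  }
  uint32_t crc = 0; // CRC-32 of the empty prefix
  for (size_t i = 0; i < parts.size(); ++i)
    crc = Crc32CombineSlow(crc, parts[i].get(), sizes[i]);
  return crc;
}
```

Calling ParallelCrc32(buf, size, 4) should give the same value as
Crc32(buf, size).  Build with thread support enabled (for example -pthread
on toolchains that need it).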

I moved the system include block after the llvm includes, both because I
had to (to use the config #defines), and because it fit the published
coding convention.
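
To illustrate why the include order matters, here is a minimal sketch of a
conditional zlib include with a fallback.  EXAMPLE_HAVE_ZLIB is a
hypothetical macro standing in for whatever the config header defines, and
ComputeCrc32 is a made-up name.  The macro has to be visible before the
#if, which is why the config include must come first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// In a real build the project's config header would be included above this
// point and would define the (hypothetical) EXAMPLE_HAVE_ZLIB macro.
#if defined(EXAMPLE_HAVE_ZLIB)
#include <zlib.h>
#endif

static uint32_t ComputeCrc32(const uint8_t *data, size_t len) {
  assert(data != nullptr || len == 0);
#if defined(EXAMPLE_HAVE_ZLIB)
  // zlib path.  Note that crc32() takes a uInt length, so a buffer larger
  // than that would need to be fed in pieces.
  return static_cast<uint32_t>(
      crc32(crc32(0L, Z_NULL, 0), data, static_cast<uInt>(len)));
#else
  // Fallback: bit-at-a-time CRC-32 with the same polynomial (0xEDB88320).
  uint32_t crc = ~0u;
  for (size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit)
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
  }
  return ~crc;
#endif
}
```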

By itself, this change reduces my test time from 55 to 47 seconds.  (The
baseline is slower than the 50 seconds I reported before because I pulled
the latest code; I guess there's another slowdown to fix.)

On Wed, Apr 12, 2017 at 12:15 PM, Scott Smith <scott.sm...@purestorage.com>
wrote:

> The algorithm included in ObjectFileELF.cpp performs a byte at a time
> computation, which causes long pipeline stalls in modern processors.
> Unfortunately, the polynomial used is not the same one used by the SSE 4.2
> instruction set, but there are two ways to make it faster:
>
> 1. Work on multiple bytes at a time, using multiple lookup tables. (see
> http://create.stephan-brumme.com/crc32/#slicing-by-8-overview)
> 2. Compute crcs over separate regions in parallel, then combine the
> results.  (see
> http://stackoverflow.com/questions/23122312/crc-calculation-of-a-mostly-static-data-stream)
>
> As it happens, zlib provides functions for both:
> 1. The zlib crc32 function uses the same polynomial as ObjectFileELF.cpp,
> and uses slicing-by-4 along with loop unrolling.
> 2. The zlib library provides crc32_combine.
>
> I decided to just call out to the zlib library, since I see my version of
> lldb already links with zlib; however, the llvm CMakeLists.txt declares it
> optional.
>
> I'm including my patch that assumes zlib is always linked in.  Let me know
> if you prefer:
> 1. I make the change conditional on having zlib (i.e. fall back to the old
> code if zlib is not present)
> 2. I copy all the code from zlib and put it in ObjectFileELF.cpp.
> However, I'm going to guess that requires updating some documentation to
> include zlib's copyright notice.
>
> This brings startup time on my machine / my binary from 50 seconds down to
> 32.
> (time ~/llvm/build/bin/lldb -b -o 'b main' -o 'run' MY_PROGRAM)
>
>
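
For reference, the "multiple bytes at a time" approach in the quoted message
can be sketched roughly as follows.  This is a generic slicing-by-4
illustration with made-up names (MakeTables, Crc32Bytewise, Crc32Slice4),
not zlib's actual implementation and not the code in the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using CrcTables = std::array<std::array<uint32_t, 256>, 4>;

// Table 0 is the usual byte-at-a-time table; table k extends each entry of
// table k-1 by one more zero byte.
static CrcTables MakeTables() {
  CrcTables t{};
  for (uint32_t i = 0; i < 256; ++i) {
    uint32_t c = i;
    for (int bit = 0; bit < 8; ++bit)
      c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
    t[0][i] = c;
  }
  for (uint32_t i = 0; i < 256; ++i)
    for (int k = 1; k < 4; ++k)
      t[k][i] = (t[k - 1][i] >> 8) ^ t[0][t[k - 1][i] & 0xFF];
  return t;
}

// Baseline: one table lookup per input byte, each depending on the last.
static uint32_t Crc32Bytewise(const uint8_t *data, size_t len) {
  static const CrcTables T = MakeTables();
  uint32_t crc = ~0u;
  while (len--)
    crc = (crc >> 8) ^ T[0][(crc ^ *data++) & 0xFF];
  return ~crc;
}

// Slicing-by-4: consume four bytes per iteration with four independent
// lookups, which shortens the dependency chain.
static uint32_t Crc32Slice4(const uint8_t *data, size_t len) {
  assert(data != nullptr || len == 0);
  static const CrcTables T = MakeTables();
  uint32_t crc = ~0u;
  while (len >= 4) {
    // Assemble the word byte by byte so the result does not depend on
    // host endianness.
    crc ^= uint32_t(data[0]) | uint32_t(data[1]) << 8 |
           uint32_t(data[2]) << 16 | uint32_t(data[3]) << 24;
    crc = T[3][crc & 0xFF] ^ T[2][(crc >> 8) & 0xFF] ^
          T[1][(crc >> 16) & 0xFF] ^ T[0][crc >> 24];
    data += 4;
    len -= 4;
  }
  while (len--) // remaining 0-3 bytes
    crc = (crc >> 8) ^ T[0][(crc ^ *data++) & 0xFF];
  return ~crc;
}
```

Both functions should return the same checksum for any input; only the
number of bytes processed per loop iteration differs.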

Attachment: zlib_crc.patch
Description: Binary data
