http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47000

Summary: Major performance regression in parallel SSE2 impl of SHA256 hash algorithm
Product: gcc
Version: 4.5.1
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: jgar...@pobox.com

Created attachment 22805
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22805
4-way SHA256 implementation, whose performance decreases markedly 4.4.x -> 4.5.x

OS: Fedora 14

My "cpuminer" open source project is -very- sensitive to performance of
generated code, and experiences a severe performance regression going from
gcc 4.4.x to 4.5.x.

Our program core is essentially

    for (n = 0; n < 0xffffff; n++)
        sha256( sha256( data ) )    /* one iteration of inner loop */

Building with gcc 4.4.5 -or- Fedora 13 gcc (4.4.x derivative), we achieve
    1850.85 kilo-iterations per second

Building with gcc 4.5.1 -or- Fedora 14 gcc (4.5.x derivative), we achieve
    1389.82 kilo-iterations per second

This is a significant performance decrease (roughly 25%), and the only
variable is the compiler. I have presented x86_64 data below, but similar
slowdowns are seen on i686-mingw in Fedora 13 (fast gcc 4.4.x) or Fedora 14
(slow gcc 4.5.x).

This interesting variant of the standard SHA256 algorithm is implemented
using Intel/AMD SSE2-specific operations, effectively running four (4)
SHA256 iterations in parallel, generating four (4) SHA256 hashes on four
distinct datasets. See attachment sha256_4way.i.

--------------------------------------------------------------------------
fast, working gcc -v:

Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../src/gcc-4.4.5/configure --prefix=/garz/gcc44 --enable-languages=c
Thread model: posix
gcc version 4.4.5 (GCC)

--------------------------------------------------------------------------
slow, broken gcc -v:

Using built-in specs.
COLLECT_GCC=/garz/gcc45/bin/gcc
COLLECT_LTO_WRAPPER=/garz/gcc45/libexec/gcc/x86_64-unknown-linux-gnu/4.5.1/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../src/gcc-4.5.1/configure --prefix=/garz/gcc45 --enable-languages=c
Thread model: posix
gcc version 4.5.1 (GCC)
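
For readers without the attachment to hand, the "four hashes in parallel" idea mentioned above can be sketched roughly as follows. This is not the code from sha256_4way.i: the function names (rotr4, big_sigma0_4way, big_sigma0_x4) are hypothetical and only one SHA256 round primitive is shown. Each 32-bit lane of an SSE2 register holds the same state word from a different dataset, so one instruction advances all four hashes at once. SSE2 has no rotate instruction, so each rotate is built from two shifts and an OR, which is presumably the kind of sequence whose code generation matters here.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Rotate each of the four 32-bit lanes right by n bits (0 < n < 32).
 * Built from two shifts and an OR, since SSE2 lacks a rotate. */
static inline __m128i rotr4(__m128i x, int n)
{
    __m128i right = _mm_srl_epi32(x, _mm_cvtsi32_si128(n));
    __m128i left  = _mm_sll_epi32(x, _mm_cvtsi32_si128(32 - n));
    return _mm_or_si128(right, left);
}

/* SHA256 "big sigma 0" (ROTR 2 ^ ROTR 13 ^ ROTR 22) on four lanes at once. */
static inline __m128i big_sigma0_4way(__m128i x)
{
    return _mm_xor_si128(_mm_xor_si128(rotr4(x, 2), rotr4(x, 13)),
                         rotr4(x, 22));
}

/* Scalar reference for a single 32-bit word, used for comparison. */
static inline uint32_t rotr1(uint32_t x, int n)
{
    return (x >> n) | (x << (32 - n));
}

static inline uint32_t big_sigma0(uint32_t x)
{
    return rotr1(x, 2) ^ rotr1(x, 13) ^ rotr1(x, 22);
}

/* Hypothetical convenience wrapper: apply the 4-way primitive to four
 * words, one per independent dataset. Unaligned loads/stores are used
 * so the caller's arrays need no special alignment. */
static void big_sigma0_x4(const uint32_t in[4], uint32_t out[4])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, big_sigma0_4way(v));
}
```

Calling big_sigma0_x4 on four input words should give the same four results as calling the scalar big_sigma0 on each word in turn; the sketch assumes an x86 target with SSE2 available (the default on x86_64).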