https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122041
--- Comment #6 from Petr Sumbera <sumbera at volny dot cz> ---
Interesting! With memcpy avoided in the source code, GCC is now faster than
Studio. But even Studio now produces slightly faster code than before.
gdiff -u crc.c.orig crc.c
--- crc.c
+++ crc.c
@@ -41,7 +41,7 @@
crc32_update_no_xor_slice_by_8 (uint32_t crc, const char *buf)
{
uint64_t local_buf;
- memcpy (&local_buf, buf, 8);
+ local_buf = *(const uint64_t *)buf;
local_buf = le64toh (local_buf) ^ crc;
crc = crc32_sliceby8_table[0][(local_buf >> 56) & 0xFF]
^ crc32_sliceby8_table[1][(local_buf >> 48) & 0xFF]
uls-0 14:25 /builds/psumbera/userland-gzip-sparc-gcc/components/gzip/TMP/test:
gmake test
gcc -o test.o -c test.c
gcc -O3 -funroll-loops -mcpu=niagara4 -mtune=niagara4 -o crc-gcc.o -c crc.c
gcc -o test-gcc test.o crc-gcc.o
/opt/developerstudio12.6/bin/cc -m64 -xO4 -xtarget=generic -xarch=sparcvis
-xchip=generic -xregs=no%appl -xmemalign=16s -o crc-studio.o -c crc.c
gcc -o test-studio test.o crc-studio.o
time ./test-gcc
real 10.8
user 10.7
sys 0.0
time ./test-studio
real 11.4
user 11.3
sys 0.0
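
For reference, the direct cast in the diff above is only valid when buf is
8-byte aligned (SPARC traps on misaligned 64-bit loads), and it also
sidesteps the C aliasing rules. A minimal sketch of a possible middle ground
is below: keep memcpy, but tell GCC about the alignment through
__builtin_assume_aligned. The helper names load_u64_memcpy and
load_u64_aligned are hypothetical and not part of gzip's crc.c, and I have
not measured whether this variant reaches the same timings as the cast.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable load: valid for any alignment and free of aliasing problems.
   This is the form the original crc.c uses. */
static uint64_t load_u64_memcpy(const char *buf)
{
    uint64_t v;
    memcpy(&v, buf, sizeof v);
    return v;
}

/* Hypothetical helper: the same memcpy, but with a promise to the compiler
   that buf is 8-byte aligned. The caller must guarantee that alignment;
   the assert only catches violations in builds without NDEBUG. */
static uint64_t load_u64_aligned(const char *buf)
{
    uint64_t v;
    assert(((uintptr_t)buf & 7) == 0);
    /* __builtin_assume_aligned is a GCC/Clang builtin; it returns its
       first argument and lets the optimizer assume the given alignment. */
    memcpy(&v, __builtin_assume_aligned(buf, 8), sizeof v);
    return v;
}
```

Both helpers return the same value for an aligned pointer; the second one
just gives the optimizer enough information to use a single 64-bit load
where the target requires strict alignment, if it chooses to.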