Code that calls memcmp heavily runs up to 6 times slower when compiled with -O3 than with -g or -O0. The reason is that the inline memcmp expansion is *much* slower than the glibc memcmp.
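One way to check this explanation would be to force the -O3 build to call the library memcmp instead of the inline expansion. The following is only a sketch: memcmp_noinline and libc_memcmp are hypothetical names, not part of the test case in this report. It assumes that routing the call through a volatile function pointer is enough to keep the compiler from expanding memcmp inline.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the original test case.
   The pointer is volatile, so the compiler has to load it and make an
   indirect call each time; it cannot replace the call with its own
   inline memcmp expansion. */
static int (*volatile libc_memcmp)(const void *, const void *, size_t) = memcmp;

/* Same contract as memcmp, but always goes through the library routine. */
int memcmp_noinline(const void *a, const void *b, size_t n)
{
    return libc_memcmp(a, b, n);
}
```

Replacing the memcmp call in the timing loop with memcmp_noinline and rebuilding with -O3 should then show whether the slowdown goes away. Building with "gcc -O3 -fno-builtin-memcmp" is another way to make GCC stop treating memcmp as a built-in, with no source change.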
Here's a simple test case:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/times.h>

void* list[1024 * 1024];

int main(void)
{
    int count = sizeof(list) / sizeof(char*);
    int i;

    for (i=0; i < count; i++)
        list[i] = calloc(1024, 1);

    int dupes = 0;
    int start = times(NULL);
    for (i=0; i<count-1; i++)
        if (!memcmp(list[i], list[i+1], 1024))
            dupes++;
    int ticks = times(NULL) - start;
    printf("Time: %d ticks (%d memcmp/tick)\n", ticks, dupes/ticks);

    return 0;
}

# gcc -O3 -o test test.c
# ./test
Time: 188 ticks (5577 memcmp/tick)
# gcc -O0 -o test test.c
# ./test
Time: 30 ticks (34952 memcmp/tick)

System is Debian testing with libc package 2.10.1-7.

--
Summary: Inline memcmp is *much* slower than glibc's
Product: gcc
Version: 4.4.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bjorn at haxx dot se
GCC build triplet: i486-linux-gnu
GCC host triplet: i486-linux-gnu
GCC target triplet: i486-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052