https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96573
Bug ID: 96573
Summary: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10
Product: gcc
Version: 10.2.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: remi.andruccioli at gmail dot com
Target Milestone: ---

Created attachment 49046
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49046&action=edit
This file contains the source code of the function described in the report.

Hello,

I'd like to describe here what seems to be a regression/missed optimization
since GCC 10.

The host and target architecture is x86-64. I'm using Fedora 32 with Linux 5.7,
but I could reproduce it on many other Linux platforms. I'm using GCC 10.2, but
I could reproduce it with any GCC 10 minor version.

Here is a function, written in pure ANSI C99:

    #include <stdlib.h>
    #include <stdint.h>

    void *
    ReverseBytesOfPointer(void * const pointer)
    {
      const size_t maxIndex = sizeof(pointer) - 1;
      const uint8_t * const oldPointerPointer = (uint8_t*)&pointer;
      void *newPointer;
      uint8_t * const newPointerPointer = (uint8_t *)&newPointer;
      uint8_t i;

      for (i = 0; i <= maxIndex; ++i) {
        newPointerPointer[maxIndex - i] = oldPointerPointer[i];
      }

      return newPointer;
    }

What this function does is simply to reverse all the bytes of a pointer.
It is written in pure C99 and is extremely portable, as it works from 16-bit to
64-bit machines. (I wrote it and use it for embedded development and I'm happy
with it).

What makes it magical is that when compiled with -O3 (and -std=c99 -Wall
-Werror -Wextra), GCC 9 (yes, 9) is clever enough to deduce the intent of this
function and compiles it all as:

    ReverseBytesOfPointer:
            mov     rax, rdi
            bswap   rax
            ret

However, since GCC 10, the magic seems to have disappeared.
This is the ASM code that is generated now, with the exact same command line
invocation:

    ReverseBytesOfPointer:
            movq    %rdi, %rax
            movb    %dil, -1(%rsp)
            movzbl  %ah, %edx
            shrq    $56, %rax
            movb    %dl, -2(%rsp)
            movq    %rdi, %rdx
            shrq    $16, %rdx
            movb    %al, -8(%rsp)
            movb    %dl, -3(%rsp)
            movq    %rdi, %rdx
            shrq    $24, %rdx
            movb    %dl, -4(%rsp)
            movq    %rdi, %rdx
            shrq    $32, %rdx
            movb    %dl, -5(%rsp)
            movq    %rdi, %rdx
            shrq    $40, %rdx
            movb    %dl, -6(%rsp)
            movq    %rdi, %rdx
            shrq    $48, %rdx
            movb    %dl, -7(%rsp)
            movq    -8(%rsp), %rax
            ret

For your convenience, I include here a link to a snippet on compiler-explorer
to show the comparison between versions 9.3 and 10.2. This snippet also
includes a few unit tests:

https://godbolt.org/z/YG1KPf

Regards,
Remi Andruccioli
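
P.S. For anyone who wants to confirm that both code sequences are functionally equivalent without opening the compiler-explorer link, here is a minimal sketch of a self-check. It is not the set of unit tests from the snippet above. The helper names `ReverseBytesReference` and `CountReverseMismatches` and the sample values are made up for illustration, and the check assumes that `uintptr_t` has the same size as `void *` and that casting between the two preserves the bit pattern (true on x86-64 Linux, not guaranteed by the C standard).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The function from the report, unchanged apart from layout. */
void *ReverseBytesOfPointer(void * const pointer)
{
  const size_t maxIndex = sizeof(pointer) - 1;
  const uint8_t * const oldPointerPointer = (uint8_t*)&pointer;
  void *newPointer;
  uint8_t * const newPointerPointer = (uint8_t *)&newPointer;
  uint8_t i;

  for (i = 0; i <= maxIndex; ++i) {
    newPointerPointer[maxIndex - i] = oldPointerPointer[i];
  }

  return newPointer;
}

/* Hypothetical reference: reverses the bytes with shifts and masks only,
   so it does not depend on any compiler builtin. */
static uintptr_t ReverseBytesReference(uintptr_t value)
{
  uintptr_t result = 0;
  size_t i;

  for (i = 0; i < sizeof(value); ++i) {
    /* Move the current lowest byte of value into the result. */
    result = (uintptr_t)(result << 8) | (value & 0xFFu);
    value >>= 8;
  }

  return result;
}

/* Hypothetical helper: returns how many sample values disagree
   between the two implementations (0 means they all agree). */
int CountReverseMismatches(void)
{
  static const uintptr_t samples[] = {
    0u, 1u, 0xFFu, 0x1234u, (uintptr_t)-1
  };
  int mismatches = 0;
  size_t i;

  /* Assumption stated above: pointer and uintptr_t have the same width. */
  assert(sizeof(uintptr_t) == sizeof(void *));

  for (i = 0; i < sizeof(samples) / sizeof(samples[0]); ++i) {
    const uintptr_t got =
      (uintptr_t)ReverseBytesOfPointer((void *)samples[i]);

    if (got != ReverseBytesReference(samples[i])) {
      ++mismatches;
    }
  }

  return mismatches;
}
```

Calling `CountReverseMismatches()` from a small `main` and building the file once with GCC 9 and once with GCC 10 at -O3 should give the same result in both cases, since the report concerns only the quality of the generated code and not its correctness.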