https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118094
Bug ID: 118094
Summary: Missed Optimization of memcpy-Like Loop
Product: gcc
Version: 14.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: jonathan.gruber.jg at gmail dot com
Target Milestone: ---
GCC insufficiently optimizes memcpy-like loops. I tested this with optimization
levels -Oz, -Os, -O2, and -O3 on x86_64, aarch64, and riscv64. I will attach the
architecture-specific preprocessed versions of this minimal test case:
#include <stddef.h> /* For size_t. */
void my_void_memcpy(void *restrict dst, const void *restrict src, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        *((char *)dst + i) = *((char *)src + i);
    }
}
x86_64 assembly, -O3:
my_void_memcpy:
	.cfi_startproc
	testq %rdx, %rdx
	je .L1
	jmp memcpy@PLT
	.p2align 4,,10
	.p2align 3
.L1:
	ret
	.cfi_endproc
aarch64 assembly, -O3:
my_void_memcpy:
	.cfi_startproc
	cbz x2, .L1
	b memcpy
	.p2align 2,,3
.L1:
	ret
	.cfi_endproc
riscv64 assembly, -O3:
my_void_memcpy:
	.cfi_startproc
	beq a2,zero,.L1
	tail memcpy@plt
.L1:
	ret
	.cfi_endproc
On each of these architectures, the generated code first checks whether n is 0.
If n is 0, it executes a return instruction. If n is nonzero, it tail-calls
memcpy. But memcpy already handles n == 0 by copying nothing, so the function
my_void_memcpy could be further optimized into just the unconditional branch to
memcpy.
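For illustration, here is a minimal sketch of what the desired output corresponds to at the source level, assuming dst and src are valid pointers. The names loop_copy, direct_copy, and same_result are hypothetical helpers introduced only for this sketch; they are not part of the test case or of GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The loop from the test case, renamed to avoid clashing with it. */
static void loop_copy(void *restrict dst, const void *restrict src, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        *((char *)dst + i) = *((const char *)src + i);
    }
}

/* Hypothetical source-level equivalent of the desired code:
   an unconditional call to memcpy with no n == 0 guard. */
static void direct_copy(void *restrict dst, const void *restrict src, size_t n) {
    memcpy(dst, src, n);
}

/* Returns 1 if both variants leave identical destination buffers.
   Assumption for this sketch only: n is at most 16. */
static int same_result(size_t n) {
    const char src[16] = "0123456789abcde";
    char a[16], b[16];
    assert(n <= sizeof src);
    memset(a, 'x', sizeof a);
    memset(b, 'x', sizeof b);
    loop_copy(a, src, n);
    direct_copy(b, src, n);
    return memcmp(a, b, sizeof a) == 0;
}
```

With valid pointers, same_result(0) compares two untouched buffers, which is the case the extra zero check guards against.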
Host system type: Arch Linux, x86_64
gcc information:
Version: 14.2.1 20240910 (GCC)
Configured with: /build/gcc/src/gcc/configure
--enable-languages=ada,c,c++,d,fortran,go,lto,m2,objc,obj-c++,rust
--enable-bootstrap --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib
--mandir=/usr/share/man --infodir=/usr/share/info
--with-bugurl=https://gitlab.archlinux.org/archlinux/packaging/packages/gcc/-/issues
--with-build-config=bootstrap-lto --with-linker-hash-style=gnu
--with-system-zlib --enable-__cxa_atexit --enable-cet=auto
--enable-checking=release --enable-clocale=gnu --enable-default-pie
--enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object
--enable-libstdcxx-backtrace --enable-link-serialization=1
--enable-linker-build-id --enable-lto --enable-multilib --enable-plugin
--enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch
--disable-werror
aarch64-linux-gnu-gcc information:
Version: 14.2.0
Configured with: /build/aarch64-linux-gnu-gcc/src/gcc-14.2.0/configure
--prefix=/usr --program-prefix=aarch64-linux-gnu-
--with-local-prefix=/usr/aarch64-linux-gnu
--with-sysroot=/usr/aarch64-linux-gnu
--with-build-sysroot=/usr/aarch64-linux-gnu
--with-native-system-header-dir=/include --libdir=/usr/lib
--libexecdir=/usr/lib --target=aarch64-linux-gnu --host=x86_64-pc-linux-gnu
--build=x86_64-pc-linux-gnu --disable-nls --enable-default-pie
--enable-languages=c,c++,fortran --enable-shared --enable-threads=posix
--with-system-zlib --with-isl --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch
--disable-libssp --enable-gnu-unique-object --enable-linker-build-id
--enable-lto --enable-plugin --enable-install-libiberty
--with-linker-hash-style=gnu --enable-gnu-indirect-function --disable-multilib
--disable-werror --enable-checking=release
riscv64-linux-gnu-gcc information:
Version: 14.2.0
Configured with: /build/riscv64-linux-gnu-gcc/src/gcc-14.2.0/configure
--prefix=/usr --program-prefix=riscv64-linux-gnu-
--with-local-prefix=/usr/riscv64-linux-gnu
--with-sysroot=/usr/riscv64-linux-gnu
--with-build-sysroot=/usr/riscv64-linux-gnu --libdir=/usr/lib
--libexecdir=/usr/lib --target=riscv64-linux-gnu --host=x86_64-pc-linux-gnu
--build=x86_64-pc-linux-gnu --with-system-zlib --with-isl
--with-linker-hash-style=gnu --disable-nls --disable-libunwind-exceptions
--disable-libstdcxx-pch --disable-libssp --disable-multilib --disable-werror
--enable-languages=c,c++ --enable-shared --enable-threads=posix
--enable-__cxa_atexit --enable-clocale=gnu --enable-gnu-unique-object
--enable-linker-build-id --enable-lto --enable-plugin
--enable-install-libiberty --enable-gnu-indirect-function --enable-default-pie
--enable-checking=release