https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80905
            Bug ID: 80905
           Summary: ARM: Useless initialization of struct passed by value
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gergo.barany at inria dot fr
  Target Milestone: ---

Created attachment 41432
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41432&action=edit
Input C file for triggering the issue

Input program:

$ cat tst.c
struct S0 {
  int f0;
  int f1;
  int f2;
  int f3;
};

int f1(struct S0 p) {
  return p.f0;
}

int f2(struct S0 p) {
  return p.f0 + p.f3;
}

When entering the function, GCC copies the entire struct from registers to the
stack, even fields that are never used. Fields that *are* used are then
reloaded from the stack even if they are still available in the very same
registers:

$ gcc tst.c -Wall -W -O3 -S -o -
        .arch armv7-a
        .eabi_attribute 28, 1
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 1
        .eabi_attribute 30, 2
        .eabi_attribute 34, 1
        .eabi_attribute 18, 4
        .file   "tst.c"
        .text
        .align  2
        .global f1
        .syntax unified
        .arm
        .fpu vfpv3-d16
        .type   f1, %function
f1:
        @ args = 0, pretend = 0, frame = 16
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        sub     sp, sp, #16
        add     ip, sp, #16
        stmdb   ip, {r0, r1, r2, r3}
        ldr     r0, [sp]
        add     sp, sp, #16
        @ sp needed
        bx      lr
        .size   f1, .-f1
        .align  2
        .global f2
        .syntax unified
        .arm
        .fpu vfpv3-d16
        .type   f2, %function
f2:
        @ args = 0, pretend = 0, frame = 16
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        sub     sp, sp, #16
        add     ip, sp, #16
        stmdb   ip, {r0, r1, r2, r3}
        ldr     r0, [sp]
        ldr     r3, [sp, #12]
        add     r0, r0, r3
        add     sp, sp, #16
        @ sp needed
        bx      lr
        .size   f2, .-f2
        .ident  "GCC: (GNU) 8.0.0 20170527 (experimental)"

Target: armv7a-eabihf
Configured with: --target=armv7a-eabihf --with-arch=armv7-a
--with-fpu=vfpv3-d16 --with-float-abi=hard --with-float=hard
gcc version 8.0.0 20170527 (experimental) (GCC)

This seems to be specific to ARM as I cannot reproduce this behavior on x86-64
or PowerPC.

For comparison, LLVM generates the following code for ARM:

f1:
        .fnstart
@ BB#0:
        bx      lr

f2:
        .fnstart
@ BB#0:
        add     r0, r0, r3
        bx      lr
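
For anyone who wants to experiment with the test case, here is a minimal sketch
that puts the two by-value functions from the report next to by-pointer
variants. The names f1_ptr, f2_ptr and variants_agree are hypothetical and not
part of the attachment. The sketch only shows that both forms compute the same
values; whether the by-pointer form avoids the stack copy on a given GCC
version is an assumption that has to be checked by inspecting the -S output.

```c
#include <assert.h>

/* Same layout as in the report: four ints, 16 bytes in total. */
struct S0 {
  int f0;
  int f1;
  int f2;
  int f3;
};

/* By-value versions, copied from the report. */
int f1(struct S0 p) {
  return p.f0;
}

int f2(struct S0 p) {
  return p.f0 + p.f3;
}

/* Hypothetical by-pointer variants (not in the report): the callee
   receives only an address and loads just the fields it reads. */
int f1_ptr(const struct S0 *p) {
  return p->f0;
}

int f2_ptr(const struct S0 *p) {
  return p->f0 + p->f3;
}

/* Hypothetical helper: returns 1 if both forms agree on a sample value. */
int variants_agree(void) {
  struct S0 s = { 1, 2, 3, 4 };
  assert(f1(s) == f1_ptr(&s));
  assert(f2(s) == f2_ptr(&s));
  return 1;
}
```

Compiling this file with the same command line as above (-O3 -S -o -) should
make it possible to compare the code generated for the two forms side by side.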