https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108386
Bug ID: 108386
Summary: Missed optimization with -fno-omit-frame-pointer on x86
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: jakub at gcc dot gnu.org
Target Milestone: ---

void bar (char *);

void
foo (void)
{
  char buf[384];
  bar (&buf[0]);
  bar (&buf[16]);
  bar (&buf[127]);
  bar (&buf[128]);
  bar (&buf[256]);
  bar (&buf[380]);
  bar (&buf[384]);
}

compiles with -O2 -fno-omit-frame-pointer to:

        pushq   %rbp
        movq    %rsp, %rbp
        subq    $384, %rsp
        leaq    -384(%rbp), %rdi
        call    bar
        leaq    -368(%rbp), %rdi
        call    bar
        leaq    -257(%rbp), %rdi
        call    bar
        leaq    -256(%rbp), %rdi
        call    bar
        leaq    -128(%rbp), %rdi
        call    bar
        leaq    -4(%rbp), %rdi
        call    bar
        movq    %rbp, %rdi
        call    bar
        leave
        ret

but this is unnecessarily large. As the frame pointer is here only because the
user asked for it, the compiler knows there is always a constant difference
between the stack pointer and the frame pointer, and perhaps in machine reorg
it could switch the base register in those cases where that would be smaller
and not slower.

For these particular leaq/movq instructions, the sizes are:

        movq    %r{sp,bp}, %rdi         3 bytes
        leaq    SIMM8(%rbp), %rdi       4 bytes
        leaq    SIMM8(%rsp), %rdi       5 bytes
        leaq    SIMM32(%rbp), %rdi      7 bytes
        leaq    SIMM32(%rsp), %rdi      8 bytes

So at least from a code size POV, movq %rsp, %rdi is smaller than any %rbp
based leaq, and similarly leaq SIMM8(%rsp), %rdi is 2 bytes smaller than
leaq SIMM32(%rbp), %rdi. So, the mov would always be a win, and for frame
sizes of more than 128 bytes, so would %rsp based addressing for offsets up
to 127.

Though, I think the aliasing code hardcodes frame pointer knowledge, and
likewise I think the unwinder would be quite upset if we did such changes
early.
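
The size comparison above can be sketched as a small cost model. This is only an illustration of the arithmetic in the report, not GCC code: the helper names (fits_simm8, size_via_rbp, size_via_rsp, prefer_rsp, best_size) are hypothetical, the byte counts are the ones quoted above for a %rdi destination, and it assumes %rsp stays at a constant frame_size bytes below %rbp for the whole function body.

```c
#include <assert.h>
#include <stdbool.h>

/* Byte sizes quoted in the report for "mov/lea ..., %rdi". */
enum {
    MOV_REG     = 3,  /* movq %r{sp,bp}, %rdi   */
    LEA_RBP_D8  = 4,  /* leaq SIMM8(%rbp), %rdi  */
    LEA_RSP_D8  = 5,  /* leaq SIMM8(%rsp), %rdi  */
    LEA_RBP_D32 = 7,  /* leaq SIMM32(%rbp), %rdi */
    LEA_RSP_D32 = 8   /* leaq SIMM32(%rsp), %rdi */
};

/* True if disp fits in a sign-extended 8-bit displacement. */
static bool fits_simm8(long disp)
{
    return disp >= -128 && disp <= 127;
}

/* Size of materializing %rbp + disp in %rdi. */
static int size_via_rbp(long disp)
{
    if (disp == 0)
        return MOV_REG;
    return fits_simm8(disp) ? LEA_RBP_D8 : LEA_RBP_D32;
}

/* Size of materializing %rsp + disp in %rdi. */
static int size_via_rsp(long disp)
{
    if (disp == 0)
        return MOV_REG;
    return fits_simm8(disp) ? LEA_RSP_D8 : LEA_RSP_D32;
}

/* Hypothetical decision helper: rbp_disp is the offset from %rbp,
   frame_size is the assumed constant %rbp - %rsp distance.  Returns
   true only when the %rsp based form is strictly smaller. */
static bool prefer_rsp(long rbp_disp, long frame_size)
{
    assert(frame_size >= 0);
    long rsp_disp = rbp_disp + frame_size;
    return size_via_rsp(rsp_disp) < size_via_rbp(rbp_disp);
}

/* Smallest size achievable for one address under this model. */
static int best_size(long rbp_disp, long frame_size)
{
    long rsp_disp = rbp_disp + frame_size;
    int via_rbp = size_via_rbp(rbp_disp);
    int via_rsp = size_via_rsp(rsp_disp);
    return via_rsp < via_rbp ? via_rsp : via_rbp;
}
```

For the testcase (frame_size 384), this model picks %rsp for &buf[0], &buf[16] and &buf[127] and keeps %rbp for the other four addresses, taking the seven address computations from 39 bytes to 31. It says nothing about the aliasing or unwinder concerns raised above.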