https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63533
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |10.0

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> The attached reprocase contains an innocent function which can not ever
> benefit from inlining.

That is not true any more. In GCC 10+, we get:

g:
        testl   %edi, %edi
        jne     .L8
        subq    $1016, %rsp
        xorl    %eax, %eax
        movq    %rsp, %rdi
        call    f
        addq    $1016, %rsp
        ret
        .p2align 4,,10
        .p2align 3
.L8:
        jmp     g.part.0

g.part.0:
        subq    $8, %rsp
        xorl    %eax, %eax
        call    f
        xorl    %eax, %eax
        call    f
        xorl    %eax, %eax
        call    f
        xorl    %eax, %eax
        call    f
        xorl    %eax, %eax
        addq    $8, %rsp
        jmp     f

So we conserve stack space now due to two things: conditional shrink
wrapping and being able to tail call to g.part.0. The performance miss is
just one unconditional jump (there is another bug handling conditional tail
calls already).

So all fixed in GCC 10+.
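
For readers without the attachment, the shape of the listing can be sketched in C. This is a hypothetical reconstruction, not the actual reprocase: the buffer size is a guess based on `subq $1016, %rsp`, the counters exist only so the sketch can be exercised, and `f` is given a prototype here. The `xorl %eax, %eax` before each call suggests the original `f` was variadic or unprototyped. The point is that only the `x == 0` path needs the large frame, so the other path can be split out (the `g.part.0` in the listing) and reached without allocating it.

```c
#include <stddef.h>

/* Counters so the sketch can be exercised; not part of the bug report. */
static int buffer_calls;
static int plain_calls;

/* Hypothetical stand-in for the external f in the listing. */
static void f(char *buf)
{
    if (buf != NULL) {
        buf[0] = 0;       /* touch the buffer so it is really used */
        buffer_calls++;
    } else {
        plain_calls++;
    }
}

/* Hypothetical shape of g: only the x == 0 path needs the big frame. */
void g(int x)
{
    if (x == 0) {
        char buf[1000];   /* size is a guess from `subq $1016, %rsp` */
        f(buf);
    } else {
        /* Five calls, matching g.part.0; the last one is in tail position. */
        f(NULL);
        f(NULL);
        f(NULL);
        f(NULL);
        f(NULL);
    }
}
```

Compiling something like this with optimization enabled and inspecting the assembly is one way to check whether a given GCC version allocates the large frame only on the path that uses it. The exact output depends on the version and flags.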