While looking at PR42722 I noticed that gcc generates awful code for a
tail-call involving a trivial pass-through of a large struct parameter.

> cat bug1.c
struct s1 { int x[16]; };
extern void g1(struct s1);
void f1(struct s1 s1) { g1(s1); }

struct s2 { int x[17]; };
extern void g2(struct s2);
void f2(struct s2 s2) { g2(s2); }
> gcc -O2 -fomit-frame-pointer -S bug1.c
> cat bug1.s
        .file   "bug1.c"
        .text
        .p2align 4,,15
.globl f1
        .type   f1, @function
f1:
        subl    $12, %esp
        addl    $12, %esp
        jmp     g1
        .size   f1, .-f1
        .p2align 4,,15
.globl f2
        .type   f2, @function
f2:
        subl    $12, %esp
        movl    $17, %ecx
        movl    %edi, 8(%esp)
        leal    16(%esp), %edi
        movl    %esi, 4(%esp)
        movl    %edi, %esi
        rep movsl
        movl    4(%esp), %esi
        movl    8(%esp), %edi
        addl    $12, %esp
        jmp     g2
        .size   f2, .-f2
        .ident  "GCC: (GNU) 4.5.0 20100128 (experimental)"
        .section        .note.GNU-stack,"",@progbits

There are two problems with this code:
1. For the larger struct (f2) gcc generates a block copy (rep movsl) with
identical source and destination addresses, which amounts to a very slow NOP.
2. For the smaller struct (f1) gcc manages to eliminate the block copy, but it
leaves a pointless subl/addl adjustment of %esp behind in the function.
gcc-4.3 generates no such stack manipulation for f1:

.globl f1
        .type   f1, @function
f1:
        jmp     g1
        .size   f1, .-f1
        .ident  "GCC: (GNU) 4.3.5 20100103 (prerelease)"

so this is a code size and performance regression in 4.4 and 4.5 relative
to 4.3.
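
To illustrate problem 1: in f2 above, %esi is loaded from %edi right before
the rep movsl, so each of the 17 words is read and written back to the same
address. The following is only a rough C sketch of that effect, not what gcc
does internally; self_copy and self_copy_is_nop are hypothetical helper names
that are not part of the test case.

```c
#include <string.h>

struct s2 { int x[17]; };

/* Hypothetical helper (not in the test case): mimics "rep movsl" with
   %esi == %edi by copying every word of *p onto itself.  The volatile
   qualifiers only keep the compiler from deleting the loop. */
static void self_copy(struct s2 *p)
{
    volatile int *dst = p->x;
    const volatile int *src = p->x;
    size_t i;

    for (i = 0; i < sizeof p->x / sizeof p->x[0]; i++)
        dst[i] = src[i];
}

/* Returns 1 if the struct is byte-for-byte unchanged by the self-copy. */
int self_copy_is_nop(void)
{
    struct s2 before, after;
    int i;

    for (i = 0; i < 17; i++)
        before.x[i] = i * 3 + 1;
    after = before;
    self_copy(&after);
    return memcmp(&before, &after, sizeof before) == 0;
}
```

The copy costs 17 loads and 17 stores but leaves the argument unchanged, so
it looks safe to drop whenever source and destination are known to coincide.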


-- 
           Summary: inefficient code for trivial tail-call with large struct
                    parameter
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: mikpe at it dot uu dot se
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42909
