http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49881

           Summary: [AVR] Inefficient stack manipulation around calls
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: r...@gcc.gnu.org


Created attachment 24848
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24848
Hack to set ACCUMULATE_OUTGOING_ARGS

While looking at PR49864 I noticed some awful code.

First, the argument setup code doesn't use push:

        rcall .
        rcall .
        rcall .
        in r30,__SP_L__
        in r31,__SP_H__
        adiw r30,1
        in r26,__SP_L__
        in r27,__SP_H__
        adiw r26,1+1
        st X,r15
        st -X,r14
        sbiw r26,1
        ld r24,Y
        ldd r25,Y+1
        std Z+3,r25
        std Z+2,r24
        lds r24,a1
        lds r25,a1+1
        std Z+5,r25
        std Z+4,r24
        rcall printf

versus a hand-written sequence:

        lds r24,a1
        lds r25,a1+1
        push r25
        push r24
        ld r24,Y
        ldd r25,Y+1
        push r25
        push r24
        push r15
        push r14
        rcall printf
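
For orientation, a call of roughly the following shape would need the
argument setup shown in both sequences.  This is only a sketch: the actual
test case is not included in this report, and the function name `show`,
the parameter `p`, and the format string are assumptions made for
illustration.  Only the global `a1` appears in the listings.

```c
#include <stdio.h>

/* Global operand; corresponds to the `lds r24,a1` / `lds r25,a1+1` loads. */
int a1;

/* Hypothetical caller, not taken from the report.  printf is variadic,
   so on AVR all three 2-byte arguments (format pointer, *p, a1) are
   passed on the stack: 6 bytes of outgoing arguments in total. */
int show(const int *p)
{
    return printf("%d %d\n", *p, a1);
}
```

Those 6 bytes are consistent with the three `rcall .` instructions (on a
device with a 2-byte program counter each one pushes a 2-byte return
address, used here only to reserve stack space) and with the constant 6
added back to SP after the call.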

If that can be fixed, then the 9 insns to pop the stack afterward,

        in r18,__SP_L__
        in r19,__SP_H__
        subi r18,lo8(-(6))
        sbci r19,hi8(-(6))
        in __tmp_reg__,__SREG__
        cli 
        out __SP_H__,r19
        out __SREG__,__tmp_reg__
        out __SP_L__,r18

might be ok.  If that's tricky, consider switching the port to use
ACCUMULATE_OUTGOING_ARGS.  A quick hack (attached) showed a nice
improvement to this test case:

        in r30,__SP_L__
        in r31,__SP_H__
        std Z+2,r13
        std Z+1,r12
        mov r30,r16
        mov r31,r17
        ld r24,Z
        ldd r25,Z+1
        in r30,__SP_L__
        in r31,__SP_H__
        std Z+4,r25
        std Z+3,r24
        lds r24,a1
        lds r25,a1+1
        std Z+6,r25
        std Z+5,r24
        rcall printf

With total section sizes (text shrinks by 506 bytes, about 22%):

   text       data        bss        dec        hex    filename
   2311         32          0       2343        927    z-before.o
   1805         32          0       1837        72d    z-after.o

Even if you do manage to fix the push problem, it might be worthwhile
to add -maccumulate-outgoing-args, like for the i386 port.  That would
give a user the option of changing the logic to suit their source.
