https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115817

Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P4
           Keywords|                            |missed-optimization

--- Comment #1 from Georg-Johann Lay <gjl at gcc dot gnu.org> ---
(In reply to Dmytro Bagrii from comment #0)
> First function compiles to the following code:
> 
> 0000000c <__vector_set_zero>:
>    c:   1f 92           push    r1
>    e:   1f b6           in      r1, 0x3f        ; 63
>   10:   1f 92           push    r1
>   12:   11 24           eor     r1, r1
>   14:   10 92 00 00     sts     0x0000, r1      ; 0x800000 <__SREG__+0x7fffc1>
>   18:   1f 90           pop     r1
>   1a:   1f be           out     0x3f, r1        ; 63
>   1c:   1f 90           pop     r1
>   1e:   18 95           reti

The explanation:  When the compiler sets a memory location to zero, it
uses code like

store <address>, __zero_reg__

which of course requires that the zero-register has been set up to hold 0.
In an ISR, R1 may hold any value at the time of the interrupt, so it has to
be saved and cleared first.  There is no way to set R1 to zero that does not
clobber SREG, hence SREG also has to be saved.  These are the two PUSHs /
POPs: One for SREG, one for __zero_reg__.
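
For reference, a minimal C sketch of a handler that should lead to this kind
of code.  The name `flag` is taken from the expected listing and the function
name from the disassembly; the test case from comment #0 is not quoted here,
so the exact declaration and attributes are an assumption.  The `#ifdef
__AVR__` guard is only there so that the file also compiles on a host.

```c
#include <stdint.h>

/* Variable cleared by the handler; starts non-zero so the store is visible. */
volatile uint8_t flag = 1;

/* Assumed shape of the reporter's handler: with avr-gcc, the "signal"
   attribute makes this an ISR with the prologue/epilogue shown above. */
#ifdef __AVR__
__attribute__((signal, used))
#endif
void __vector_set_zero(void)
{
    /* Storing constant 0: avr-gcc uses  sts flag, __zero_reg__  here. */
    flag = 0;
}
```

Compiling this with avr-gcc at -Os and disassembling the object should show
the two PUSH / POP pairs around the single STS.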

> Expected code:
> __vector_set_zero:
>     push    r24
>     ldi     r24, 0x00
>     sts     flag, r24
>     pop     r24
>     reti

There is currently no way to find out whether LDI 0x0 would be better than
using __zero_reg__.  Of course, just reloading 0x0 into a register would mean
that such loads might accumulate in other places.
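
For this one handler, the size difference can be tallied by hand.  The sketch
below just sums the encoded sizes of both sequences: the sizes for the current
code are read off the quoted disassembly, and those for the expected code
assume the same encodings (2 bytes per instruction, 4 for STS).  The struct
and function names are made up for illustration; this is not how the compiler
computes costs.

```c
#include <stddef.h>

/* One instruction of a listing: mnemonic and encoded size in bytes. */
struct insn { const char *mnemonic; unsigned bytes; };

/* Sequence currently emitted (sizes from the quoted disassembly). */
static const struct insn current_seq[] = {
    {"push r1", 2}, {"in r1, 0x3f", 2}, {"push r1", 2}, {"eor r1, r1", 2},
    {"sts flag, r1", 4}, {"pop r1", 2}, {"out 0x3f, r1", 2}, {"pop r1", 2},
    {"reti", 2},
};

/* Sequence proposed in comment #0 (sizes assumed, see above). */
static const struct insn expected_seq[] = {
    {"push r24", 2}, {"ldi r24, 0x00", 2}, {"sts flag, r24", 4},
    {"pop r24", 2}, {"reti", 2},
};

/* Sum the encoded sizes of a sequence of n instructions. */
static unsigned total_bytes(const struct insn *seq, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += seq[i].bytes;
    return sum;
}

unsigned current_bytes(void)
{
    return total_bytes(current_seq, sizeof current_seq / sizeof current_seq[0]);
}

unsigned expected_bytes(void)
{
    return total_bytes(expected_seq, sizeof expected_seq / sizeof expected_seq[0]);
}
```

Such a local comparison says nothing about functions with several zero
stores, where a single cleared __zero_reg__ is shared by all of them.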

So I have absolutely no idea how this could be approached in the current
context.
