https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115817
Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P4
           Keywords|                            |missed-optimization

--- Comment #1 from Georg-Johann Lay <gjl at gcc dot gnu.org> ---
(In reply to Dmytro Bagrii from comment #0)
> First function compiles to the following code:
>
> 0000000c <__vector_set_zero>:
>    c:   1f 92           push    r1
>    e:   1f b6           in      r1, 0x3f        ; 63
>   10:   1f 92           push    r1
>   12:   11 24           eor     r1, r1
>   14:   10 92 00 00     sts     0x0000, r1      ; 0x800000 <__SREG__+0x7fffc1>
>   18:   1f 90           pop     r1
>   1a:   1f be           out     0x3f, r1        ; 63
>   1c:   1f 90           pop     r1
>   1e:   18 95           reti

The explanation: When the compiler sets a memory location to zero, it uses
code like

    store <address>, __zero_reg__

which of course requires that the zero register has been set up to hold 0.
There is no way to set R1 to zero that does not clobber SREG, hence SREG
also has to be saved. That accounts for the two PUSH / POP pairs: one for
SREG, one for __zero_reg__.

> Expected code:
> __vector_set_zero:
>     push r24
>     ldi r24, 0x00
>     sts flag, r24
>     pop r24
>     reti

There is currently no way to find out whether LDI 0x0 would be better than
using __zero_reg__. Of course, just reloading 0x0 into a register would mean
that such loads might accumulate in other places. So I have absolutely no
idea how this could be approached in the current context.
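For readers without the original test case at hand, here is a minimal C sketch of the kind of source that leads to the code discussed above. The test case from comment #0 is not quoted in this message, so the use of the `signal` attribute, the helper macro `ISR_ATTR`, and the second function `clear_flag` are assumptions made for illustration. The background: R1 (`__zero_reg__`) is a low register, so LDI cannot target it, and clearing it with EOR changes the flags in SREG.

```c
#include <stdint.h>

/* ISR_ATTR is a hypothetical helper, not part of avr-libc. On AVR it marks
   a function as an interrupt handler; elsewhere it expands to nothing so
   the file still compiles on a host for a quick sanity check. */
#ifdef __AVR__
#define ISR_ATTR __attribute__((signal, used))
#else
#define ISR_ATTR
#endif

volatile uint8_t flag;

void __vector_set_zero(void) ISR_ATTR;
void clear_flag(void);

/* Interrupt handler: R1 may hold anything when the interrupt fires, so
   the prologue has to save R1 and SREG and then clear R1 before the
   store can use __zero_reg__. */
void __vector_set_zero(void)
{
    flag = 0;
}

/* Ordinary function (assumed for comparison): the ABI expects R1 to be
   zero already, so the same store typically needs no setup at all.
   This is the case where an extra LDI 0x0 would only add cost. */
void clear_flag(void)
{
    flag = 0;
}
```

Compiling this with something like `avr-gcc -Os -mmcu=atmega328p -c` (the MCU is an arbitrary choice) and disassembling with `avr-objdump -d` should let the two functions be compared side by side.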