http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32629
Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-06-06
     Ever Confirmed|0                           |1

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-06-06 10:05:57 UTC ---
Confirmed with -Os on trunk (4.8).  With -O2 we unroll completely to

f:
.LFB0:
        .cfi_startproc
        movq    $0, (%rdi)
        movq    $0, 8(%rdi)
        movq    $0, 16(%rdi)
        movq    $0, 24(%rdi)
        movq    $0, 32(%rdi)
        movq    $0, 40(%rdi)
        movq    $0, 48(%rdi)
        movq    $0, 56(%rdi)
        movq    $0, 64(%rdi)
        movq    $0, 72(%rdi)
        ret

which lacks the size optimization to use a zeroed %rax.  Likewise for -Os
which now looks like

   0:   48 8d 57 30             lea    0x30(%rdi),%rdx
   4:   48 c7 07 00 00 00 00    movq   $0x0,(%rdi)
   b:   48 c7 47 08 00 00 00    movq   $0x0,0x8(%rdi)
  12:   00
  13:   48 c7 47 10 00 00 00    movq   $0x0,0x10(%rdi)
  1a:   00
  1b:   48 c7 47 18 00 00 00    movq   $0x0,0x18(%rdi)
  22:   00
  23:   b9 08 00 00 00          mov    $0x8,%ecx
  28:   48 c7 47 20 00 00 00    movq   $0x0,0x20(%rdi)
  2f:   00
  30:   48 c7 47 28 00 00 00    movq   $0x0,0x28(%rdi)
  37:   00
  38:   31 c0                   xor    %eax,%eax
  3a:   48 89 d7                mov    %rdx,%rdi
  3d:   f3 ab                   rep stos %eax,%es:(%rdi)
  3f:   c3                      retq

I suppose with -Os we use rep stosl because that's one byte smaller ...(?)

I suppose doing the $0x0 optimization should be done post-reload.