http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60889
Bug ID: 60889
Summary: -Os generate much bigger code
Product: gcc
Version: 4.10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: hjl.tools at gmail dot com
CC: mjambor at suse dot cz

On Linux/x86-64:

[hjl@gnu-6 gcc]$ cat /tmp/space.i
typedef float __v4sf __attribute__ ((__vector_size__ (16)));
typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));

struct S { __m128 a, b; };

struct T
{
  int a;
  struct S s;
};

void
foo (struct T *p, __m128 v)
{
  struct S s;

  s = p->s;
  s.b = (__m128) __builtin_ia32_addps ((__v4sf)s.b, (__v4sf)v);
  p->s = s;
}
[hjl@gnu-6 gcc]$ ./xgcc -B./ -S -O2 /tmp/space.i
[hjl@gnu-6 gcc]$ cat space.s
    .file "space.i"
    .section .text.unlikely,"ax",@progbits
.LCOLDB0:
    .text
.LHOTB0:
    .p2align 4,,15
    .globl foo
    .type foo, @function
foo:
.LFB0:
    .cfi_startproc
    addps 32(%rdi), %xmm0
    movaps %xmm0, 32(%rdi)
    ret
    .cfi_endproc
.LFE0:
    .size foo, .-foo
    .section .text.unlikely
.LCOLDE0:
    .text
.LHOTE0:
    .ident "GCC: (GNU) 4.9.0 20140409 (experimental)"
    .section .note.GNU-stack,"",@progbits
[hjl@gnu-6 gcc]$ ./xgcc -B./ -S -Os /tmp/space.i
[hjl@gnu-6 gcc]$ cat space.s
    .file "space.i"
    .section .text.unlikely,"ax",@progbits
.LCOLDB0:
    .text
.LHOTB0:
    .globl foo
    .type foo, @function
foo:
.LFB0:
    .cfi_startproc
    movq %rdi, %rax
    movl $8, %ecx
    leaq -40(%rsp), %rdi
    leaq 16(%rax), %rsi
    rep movsl
    addps 32(%rax), %xmm0
    leaq 16(%rax), %rdi
    leaq -40(%rsp), %rsi
    movb $8, %cl
    movaps %xmm0, -24(%rsp)
    rep movsl
    ret
    .cfi_endproc
.LFE0:
    .size foo, .-foo
    .section .text.unlikely
.LCOLDE0:
    .text
.LHOTE0:
    .ident "GCC: (GNU) 4.9.0 20140409 (experimental)"
    .section .note.GNU-stack,"",@progbits
[hjl@gnu-6 gcc]$

analyze_all_variable_accesses in tree-sra.c has

  max_total_scalarization_size = UNITS_PER_WORD * BITS_PER_UNIT
    * MOVE_RATIO (optimize_function_for_speed_p (cfun));

-Os sets MOVE_RATIO to 3.  Should we have a different parameter to
control SRA?
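
For illustration, the arithmetic behind that limit can be sketched as below. This is only a sketch under stated assumptions, not code from tree-sra.c: it assumes UNITS_PER_WORD and BITS_PER_UNIT are both 8 on x86-64, takes the MOVE_RATIO of 3 quoted above for -Os, and assumes an aggregate is totally scalarized only when its size in bits is at most the limit. The constant names and the helper fits_limit are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed x86-64 values, for illustration only.  */
enum
{
  ASSUMED_UNITS_PER_WORD = 8,	/* bytes per word (assumption) */
  ASSUMED_BITS_PER_UNIT = 8,	/* bits per byte (assumption) */
  MOVE_RATIO_FOR_SIZE = 3	/* value quoted in the report for -Os */
};

/* Same layout as the types in space.i.  */
typedef float m128 __attribute__ ((__vector_size__ (16)));
struct S { m128 a, b; };

/* Limit in bits, mirroring the expression quoted from tree-sra.c.  */
static size_t
max_total_scalarization_bits (size_t move_ratio)
{
  return (size_t) ASSUMED_UNITS_PER_WORD * ASSUMED_BITS_PER_UNIT * move_ratio;
}

/* Hypothetical helper: true if an aggregate of SIZE_BYTES bytes is
   within the limit.  The "<=" comparison is an assumption.  */
static bool
fits_limit (size_t size_bytes, size_t move_ratio)
{
  return (size_bytes * ASSUMED_BITS_PER_UNIT
	  <= max_total_scalarization_bits (move_ratio));
}
```

Under these assumptions the limit at -Os is 8 * 8 * 3 = 192 bits, while struct S occupies 2 * 16 bytes = 256 bits, so fits_limit (sizeof (struct S), MOVE_RATIO_FOR_SIZE) is false. That would be consistent with the -Os output above, where the copies of p->s remain as rep movsl block moves instead of being scalarized away.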