https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99128
Bug ID: 99128
Summary: Stack used for concatenating values when returning struct by register
Product: gcc
Version: 10.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: magiblot at hotmail dot com
Target Milestone: ---

Consider the following C++ example (https://godbolt.org/z/deqf4T):

> struct A { int x, y; };
>
> A makeA(int a, int b)
> {
>     return {a, b};
> }
>
> struct B { int x, y, z; };
>
> B makeB(int a, int b)
> {
>     return {a, b};
> }

In 'makeA' the whole struct fits into a 64-bit register and 'a' and 'b' are concatenated with arithmetic operations, as expected:

> makeA(int, int):
>         sal     rsi, 32
>         mov     eax, edi
>         or      rax, rsi
>         ret

But in 'makeB' the struct has a third field which gets returned in a different register. In this case, 'a' and 'b' are written into the stack and then moved to the return register:

> makeB(int, int):
>         mov     DWORD PTR [rsp-20], edi
>         xor     edx, edx
>         mov     DWORD PTR [rsp-16], esi
>         mov     rax, QWORD PTR [rsp-20]
>         ret

Here is another example:

> struct C {
>     union {
>         char a;
>         char b[3];
>     };
>     char c;
> };
>
> C makeC(char a)
> {
>     return {a, 1};
> }

The assembly looks like this:

> makeC(char):
>         xor     eax, eax
>         mov     BYTE PTR [rsp-4], dil
>         mov     WORD PTR [rsp-3], ax
>         mov     BYTE PTR [rsp-1], 1
>         mov     eax, DWORD PTR [rsp-4]
>         ret

Which is pretty bad compared to what Clang produces:

> makeC(char):
>         movzx   eax, dil
>         or      eax, 16777216
>         ret

The 'makeB' example results in suboptimal x86_64 assembly in all versions of GCC available at Godbolt which can compile the code. The 'makeC' example, though, is fine in GCC 8.3 and earlier but produces suboptimal assembly in GCC 9.1 and later.

EXPECTED RESULT

GCC should be able to concatenate primitive types with arithmetic-logic operations instead of using the stack when returning structs by register.

Thank you.
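
For illustration only, here is a minimal sketch of what "concatenating with arithmetic-logic operations" means at the source level for 'makeA'. The helpers 'pack_pair', 'bits_of' and 'self_check' are hypothetical names introduced for this sketch and are not part of the report or of GCC. It assumes a 32-bit 'int' and a little-endian target such as x86_64, and it only compares the shift/or value against the in-memory representation of the struct; it does not claim anything about how GCC should implement the optimization internally.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct A { int x, y; };

// Same function as in the report.
A makeA(int a, int b)
{
    return {a, b};
}

// Hypothetical helper (not from the report): builds the 64-bit register image
// of A{a, b} with shift/or, assuming 32-bit int and little-endian layout.
// 'a' lands in the low 32 bits and 'b' in the high 32 bits, which mirrors the
// 'sal rsi, 32 / mov eax, edi / or rax, rsi' sequence shown for makeA.
constexpr std::uint64_t pack_pair(int a, int b)
{
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(a))
         | (static_cast<std::uint64_t>(static_cast<std::uint32_t>(b)) << 32);
}

// Hypothetical helper: reads the object representation of A as one 64-bit value.
std::uint64_t bits_of(const A& v)
{
    static_assert(sizeof(A) == sizeof(std::uint64_t), "sketch assumes 32-bit int");
    std::uint64_t out;
    std::memcpy(&out, &v, sizeof out);
    return out;
}

// Checks that the shift/or form matches the in-memory form of the struct
// (holds only under the little-endian assumption stated above).
void self_check()
{
    assert(bits_of(makeA(1, 2)) == pack_pair(1, 2));
    assert(bits_of(makeA(-1, 7)) == pack_pair(-1, 7));
}
```

Calling 'self_check()' from a test program on x86_64 exercises both paths. The first two fields of 'B' would need the same 64-bit image in the first return register, with the third field going to the second register.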