https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99128

            Bug ID: 99128
           Summary: Stack used for concatenating values when returning
                    struct by register
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: magiblot at hotmail dot com
  Target Milestone: ---

Consider the following C++ example (https://godbolt.org/z/deqf4T):

> struct A { int x, y; };
> 
> A makeA(int a, int b)
> {
>     return {a, b};
> }
> 
> struct B { int x, y, z; };
> 
> B makeB(int a, int b)
> {
>     return {a, b};
> }

In 'makeA' the whole struct fits into a 64-bit register and 'a' and 'b' are
concatenated with arithmetic operations, as expected:

> makeA(int, int):
>         sal     rsi, 32
>         mov     eax, edi
>         or      rax, rsi
>         ret
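
For reference, here is a rough C++ sketch of the arithmetic that listing performs. It is not part of the Godbolt example: 'packA' and 'checkA' are hypothetical helper names, and a little-endian target such as x86_64 is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct A { int x, y; };

// Hypothetical helper: builds the 64-bit value that makeA leaves in rax.
std::uint64_t packA(int a, int b)
{
    // Zero-extend 'a' into the low half (mov eax, edi), then place 'b'
    // in the high half (sal rsi, 32; or rax, rsi).
    return std::uint64_t(std::uint32_t(a))
         | (std::uint64_t(std::uint32_t(b)) << 32);
}

// Checks that the packed value has the same bytes as the struct
// (assumes little-endian, so 'x' occupies the low four bytes).
void checkA(int a, int b)
{
    A s{a, b};
    std::uint64_t bits = packA(a, b);
    static_assert(sizeof(A) == sizeof(std::uint64_t), "A must be 8 bytes");
    assert(std::memcmp(&s, &bits, sizeof s) == 0);
}
```

Calling 'checkA(1, 2)' passes on such a target, since 'packA(1, 2)' is 0x0000000200000001.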

But in 'makeB' the struct has a third field, which is returned in a second
register (edx, zeroed here). In this case, 'a' and 'b' are written to the
stack and then loaded back into the return register:

> makeB(int, int):
>         mov     DWORD PTR [rsp-20], edi
>         xor     edx, edx
>         mov     DWORD PTR [rsp-16], esi
>         mov     rax, QWORD PTR [rsp-20]
>         ret
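
The same values could be formed without touching memory. The following is only a sketch of what the two return registers have to contain, not a claim about what GCC should emit. 'RetRegsB', 'packB' and 'checkB' are hypothetical names, and a little-endian target where the 12-byte struct is returned in rax and edx is assumed, as in the listing above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct B { int x, y, z; };

// Hypothetical model of the two return registers used for B.
struct RetRegsB { std::uint64_t rax; std::uint32_t edx; };

RetRegsB packB(int a, int b)
{
    RetRegsB r;
    // 'x' and 'y' share rax, exactly as in makeA.
    r.rax = std::uint64_t(std::uint32_t(a))
          | (std::uint64_t(std::uint32_t(b)) << 32);
    // 'z' is zero because the initializer list omits it (xor edx, edx).
    r.edx = 0;
    return r;
}

// Checks both registers against the bytes of the struct (little-endian).
void checkB(int a, int b)
{
    B s{a, b, 0};  // same value as {a, b}
    RetRegsB r = packB(a, b);
    static_assert(sizeof(B) == 12, "B must be 12 bytes");
    const char* p = reinterpret_cast<const char*>(&s);
    assert(std::memcmp(p, &r.rax, 8) == 0);
    assert(std::memcmp(p + 8, &r.edx, 4) == 0);
}
```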

Here is another example:

> struct C {
>     union {
>         char a;
>         char b[3];
>     };
>     char c;
> };
> 
> C makeC(char a)
> {
>     return {a, 1};
> }

The assembly looks like this:

> makeC(char):
>         xor     eax, eax
>         mov     BYTE PTR [rsp-4], dil
>         mov     WORD PTR [rsp-3], ax
>         mov     BYTE PTR [rsp-1], 1
>         mov     eax, DWORD PTR [rsp-4]
>         ret

This is considerably worse than what Clang produces:

> makeC(char):
>         movzx   eax, dil
>         or      eax, 16777216
>         ret
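
In C++ terms, Clang's version amounts to the sketch below. 'packC' and 'checkC' are hypothetical names, a little-endian target is assumed, and the two unused union bytes are taken to be zero, as they are in both listings above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct C {
    union {
        char a;
        char b[3];
    };
    char c;
};

static_assert(sizeof(C) == 4 && offsetof(C, c) == 3,
              "layout assumed by packC");

// Hypothetical helper: builds the 32-bit value returned in eax.
std::uint32_t packC(char a)
{
    // movzx eax, dil ; or eax, 16777216 (that is, 1 << 24)
    return std::uint32_t(static_cast<unsigned char>(a))
         | (std::uint32_t(1) << 24);
}

// Checks the byte layout: 'a' at offset 0, zeros at 1 and 2, 'c' == 1 at 3.
void checkC(char a)
{
    const unsigned char expected[4] = {
        static_cast<unsigned char>(a), 0, 0, 1
    };
    std::uint32_t bits = packC(a);
    assert(std::memcmp(expected, &bits, sizeof bits) == 0);
}
```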

The 'makeB' example results in suboptimal x86_64 assembly in every version of
GCC available on Godbolt that can compile the code.

The 'makeC' example, though, is fine in GCC 8.3 and earlier but produces
suboptimal assembly in GCC 9.1 and later.

EXPECTED RESULT

GCC should be able to concatenate primitive types with arithmetic-logic
operations instead of using the stack when returning structs by register.

Thank you.
