https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571
Bug ID: 103571
Summary: ABI: V2HF, V4HF and V8HFmode argument passing issues
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---

Following testcase:

--cut here--
typedef _Float16 v2hf __attribute__((vector_size(4)));
typedef _Float16 v4hf __attribute__((vector_size(8)));
typedef _Float16 v8hf __attribute__((vector_size(16)));

v2hf foo (v2hf a, v2hf b) { return b; }
v4hf bar (v4hf a, v4hf b) { return b; }
v8hf baz (v8hf a, v8hf b) { return b; }
--cut here--

compiles with -O2 -msse2 -m64 to:

foo:
        movl    16(%rsp), %edx  # 6     [c=9 l=4]  *movsi_internal/0
        movq    %rdi, %rax      # 2     [c=4 l=3]  *movdi_internal/3
        movl    %edx, (%rdi)    # 7     [c=4 l=2]  *movsi_internal/1
        ret                     # 18    [c=0 l=1]  simple_return_internal

and with -O2 -msse2 -m32 to:

foo:
        movl    4(%esp), %eax   # 2     [c=9 l=4]  *movsi_internal/0
        movl    12(%esp), %edx  # 6     [c=9 l=4]  *movsi_internal/0
        movl    %edx, (%eax)    # 7     [c=4 l=2]  *movsi_internal/1
        ret     $4              # 17    [c=0 l=3]  simple_return_pop_internal

bar:
        movq    %mm1, %mm0      # 14    [c=4 l=3]  *movv4hf_internal/6
        ret                     # 18    [c=0 l=1]  simple_return_internal

baz:
        pushl   %esi            # 53    [c=4 l=1]  *pushsi2/0
        pushl   %ebx            # 54    [c=4 l=1]  *pushsi2/0
        subl    $52, %esp       # 55    [c=4 l=3]
        movaps  %xmm1, 16(%esp) # 5     [c=4 l=5]  movv8hf_internal/3
        movl    20(%esp), %ecx  # 34    [c=9 l=4]  *movsi_internal/0
        movl    24(%esp), %edx  # 35    [c=9 l=4]  *movsi_internal/0
        movl    28(%esp), %eax  # 36    [c=9 l=4]  *movsi_internal/0
        movd    %xmm1, (%esp)   # 46    [c=4 l=5]  *movsi_internal/11
        movl    %ecx, 4(%esp)   # 47    [c=4 l=4]  *movsi_internal/1
        movl    %edx, 8(%esp)   # 48    [c=4 l=4]  *movsi_internal/1
        movl    %eax, 12(%esp)  # 49    [c=4 l=4]  *movsi_internal/1
        movdqa  (%esp), %xmm0   # 50    [c=17 l=5]  *movti_internal/4
        addl    $52, %esp       # 58    [c=4 l=3]
        popl    %ebx            # 59    [c=9 l=1]  *popsi1
        popl    %esi            # 60    [c=9 l=1]  *popsi1
        ret                     # 61    [c=0 l=1]  simple_return_internal

Does ABI specify how to handle V2HF arguments and returns?
foo looks a bit suspicious to me: the corresponding V2HI arguments are simply returned in the %eax register. Also, baz is highly suboptimal for 32-bit targets.
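
For comparison, the V2HI counterpart of foo could be written roughly as below. This is only a sketch: the names v2hi and foo_hi are hypothetical and do not appear in the testcase above. It swaps the _Float16 element type for short, so the vector is still 4 bytes wide but gets an integer vector mode.

```c
/* Hypothetical integer counterpart of foo: two 16-bit integers in a
   4-byte vector (V2HImode) instead of two _Float16 (V2HFmode).  */
typedef short v2hi __attribute__((vector_size(4)));

/* Same shape as foo: take two vectors and return the second one.  */
v2hi foo_hi (v2hi a, v2hi b) { return b; }
```

Compiling this with the same options (-O2 -msse2 with -m64 or -m32, plus -S) and placing the output next to that of foo should show whether the two 4-byte vector types are passed and returned the same way.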